contain a built-in Spider that provides
integrated searching of remote Web
site content, along with locally-available
data. The dtSearch Spider can index
and search dynamically-generated content,
such as ASP/ASP.NET, MS CMS, and SharePoint. The
Spider can index XML, HTML, ASP and
ASP.NET Web pages, as well as online
postings of text-based documents such
as PDF, word processor files and spreadsheets.
dtSearch Desktop and Network will display Web
pages and documents with highlighted
hits as well as links and images intact
within HTML and PDF files.
How the dtSearch Spider Works
To index a Web site, select "Add Web..." in the Update
Index dialog. In the dialog box, type or paste the name (URL)
of the Web site, for example http://www.federalreserve.gov/,
then select the crawl depth. A crawl depth
of 1 will reach only pages linked directly
to the home page, a crawl depth of 4 will
reach four levels deep into the site,
and so on.
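As a rough illustration of what a crawl depth means, the sketch below walks a small in-memory link graph breadth-first and stops following links once the chosen depth is reached. It is not dtSearch code: the function name `crawl_to_depth` and the sample `SITE` graph are hypothetical, and a real spider would fetch pages over HTTP rather than read a dictionary.

```python
from collections import deque


def crawl_to_depth(links, start, max_depth):
    """Return {page: depth} for every page within max_depth links of start.

    `links` maps each page to the pages it links to (a stand-in for
    fetching the page and extracting its hyperlinks).
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        # Pages at the depth limit are kept but their links are not followed.
        if depths[page] == max_depth:
            continue
        for target in links.get(page, []):
            if target not in depths:  # skip pages already reached
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure, used only for illustration.
SITE = {
    "/": ["/about", "/news"],
    "/about": ["/about/team"],
    "/news": ["/news/2020", "/"],
    "/about/team": ["/about/team/alice"],
}

if __name__ == "__main__":
    # Depth 1 reaches only the pages linked directly from the home page.
    print(sorted(crawl_to_depth(SITE, "/", 1)))
```

With this sample graph, a depth of 1 yields the home page plus `/about` and `/news`, while a depth of 4 reaches every page, since the deepest one is three links away.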
Options allow the Spider to crawl across
multiple servers from a single starting
URL, limit the maximum size of files to
download, the number of files to index and
number of minutes to spend indexing on a
single Web site. The Spider supports the
robots "noindex" and "nofollow"
meta tags. The Spider can perform
"vertical" searching of pages
linked from a URL, as well as "horizontal"
crawling of sites linked to a URL.
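The robots meta tags mentioned above are ordinary HTML `<meta name="robots" ...>` elements. The following is a minimal sketch of how a crawler might read them, using only Python's standard `html.parser` module. The names `RobotsMetaParser` and `robots_policy` are hypothetical, this is not how the dtSearch Spider is implemented, and it checks only the two directives named in the text.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_policy(html):
    """Return (may_index, may_follow) for a page's HTML source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    may_index = "noindex" not in parser.directives
    may_follow = "nofollow" not in parser.directives
    return may_index, may_follow


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(robots_policy(page))
```

A crawler using this check would skip indexing a page when `may_index` is false and would not queue that page's links when `may_follow` is false.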
a Spider demo operating through dtSearch
The www.dtsearch.com spidered site is hosted
on a completely different hosting system
and physical location from the site that
is running the Search Site demo.
pages or text can be cached in version 7
here for details.
In addition to searching publicly available
Web sites, the Spider also supports indexing
and searching of secure HTTPS sites
and password-accessible sites. The Spider
also supports forms-based authentication.
For information on searching ASP, please see
this FAQ article:
to use dtSearch Web with dynamically-generated