Hello!
Billy wrote:
> I've just got udmsearch going and started to index
> some sites, however things did not go as planned and I
> had to drop table dict in mysql to stop udmsearch in
> midstream (you guys know a better way? - I'm keen to
> learn)
Press Ctrl-C to stop the indexer.
>
> What is crc in url table for?
>
It is used to eliminate duplicate documents, for example mirrors
of the same site.
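
To illustrate the idea, here is a minimal Python sketch of checksum-based
duplicate detection. It is not the UdmSearch code: the function name, the
example URLs and the use of zlib's CRC32 are my assumptions for illustration.

```python
import zlib

def find_duplicates(documents):
    """Return (duplicate_url, original_url) pairs for bodies with equal CRC32."""
    seen = {}        # checksum -> first URL seen with that checksum
    duplicates = []
    for url, body in documents:
        checksum = zlib.crc32(body)
        if checksum in seen:
            duplicates.append((url, seen[checksum]))
        else:
            seen[checksum] = url
    return duplicates

# Hypothetical example data: a page and its mirror with identical content.
docs = [
    ("http://www.example.com/index.html", b"<html>Hello</html>"),
    ("http://mirror.example.org/index.html", b"<html>Hello</html>"),
    ("http://www.example.com/other.html", b"<html>Other</html>"),
]
print(find_duplicates(docs))
```

Two different documents can in principle share a checksum, so a real
implementation may want to compare more than the CRC alone.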
> Can I use "include servers.conf" or similar to include
> a list of servers into the main conf file to make
> maintenance easier?
Yes, take a look at indexer.conf-dist. It explains all available
commands.
> How do I get rid of the crappy text column in url
> because it really sucks... who wants to see:
> Wednesday, January 05 About Advertising Forums Contact
> Customize Downloads Links News Letter News Archive
> #
> ...as a summary of a document? Is there any way to
> improve that?
If you are indexing your own site, you can serve a different top of the
document when the User-Agent is UdmSearch. If it is not your site, I have
no idea how to do it better.
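
For your own site, the idea could look roughly like this Python sketch. The
function name and both text strings are hypothetical, and I am assuming the
indexer's User-Agent header contains the string "UdmSearch"; check what your
server logs actually show.

```python
def page_top(user_agent):
    """Choose the text placed at the top of the page for this client."""
    # Assumption: the indexer identifies itself with "UdmSearch" in User-Agent.
    if "UdmSearch" in user_agent:
        # Indexer: lead with a real summary so it ends up in the stored text.
        return "Daily news and downloads for testdomain.com."
    # Ordinary browsers get the usual navigation bar first.
    return "About Advertising Forums Contact Customize Downloads"

print(page_top("UdmSearch"))
print(page_top("Mozilla/4.0"))
```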
> How do I improve the search results by changing
> relevance of title and body etc... how effective
> actually is this technique?
Take a look at indexer.conf-dist.
> Is it possible to stop the indexer from indexing
> stupid and irrelevant words like "the, it, to, and,
> we" etc etc. Is this what "stopwords" is for and if so how
> do I use it?
Yes. It is explained in the INSTALL file at the top level of the
UdmSearch distribution.
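
Just to show what stopword filtering does in general, here is a small Python
sketch. It is not how UdmSearch implements it; the word list and the function
name are assumptions for illustration.

```python
# Hypothetical stopword list, taken from the words in the question.
STOPWORDS = {"the", "it", "to", "and", "we"}

def index_words(text):
    """Split text into lowercase words and drop the stopwords."""
    words = text.lower().split()
    return [w for w in words if w not in STOPWORDS]

print(index_words("We went to the market and it rained"))
# ['went', 'market', 'rained']
```

Dropping such words keeps them out of the word table, which also makes it
smaller.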
> Is it really necessary to have a word list of about
> 250,000 in table dict from just ONE site!?
What do you mean?
> Why do I get no description and no or only a few
> keywords in the respective columns? Are these for meta
> tags?
Yes, for META tags.
> How do I index only certain parts of a site and not
> others?
Use the Allow/Disallow commands in indexer.conf. They are explained in
indexer.conf-dist.
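
The general idea of such rules can be sketched in Python as below. This is
not UdmSearch syntax or code: the rule list, the "first match wins" order and
the default are all assumptions made for the example, so read
indexer.conf-dist for the real behaviour.

```python
import re

# Hypothetical rules, checked in order: (action, regular expression).
RULES = [
    ("disallow", r"/private/"),
    ("allow", r"\.html$"),
]

def is_allowed(url, rules=RULES, default=False):
    """Return True if the first matching rule allows the URL."""
    for action, pattern in rules:
        if re.search(pattern, url):
            return action == "allow"
    return default  # no rule matched

print(is_allowed("http://www.testdomain.com/docs/a.html"))     # True
print(is_allowed("http://www.testdomain.com/private/a.html"))  # False
```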
> Why if I type www.testdomain.com/ as a server will it
> only index index.html or the first page?
Maybe there are no links to this server from index.html.
> Does followoutside no/yes mean follow outside / on the
> server or follow outside the server completely to a
> whole separate domain?
Outside the server completely, to a separate domain.
> Maxhops does not seem to work. I had set it to 256 and
> it went on for longer than that.
MaxHops is the distance in "mouse-clicks" from the first page to the
current one. 256 is a VERY big number. Note that it does not mean "the
total number of pages", which may be considerably bigger.
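
A small Python sketch may make the difference clearer. It is only an
illustration with a made-up link graph, not the indexer's code; the function
name and the pages are hypothetical.

```python
from collections import deque

def crawl(links, start, max_hops):
    """Return {page: hops} for every page within max_hops clicks of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if hops[page] == max_hops:
            continue  # hop limit reached: do not follow links from this page
        for target in links.get(page, []):
            if target not in hops:
                hops[target] = hops[page] + 1
                queue.append(target)
    return hops

# Hypothetical site: page -> pages it links to.
links = {
    "/": ["/a", "/b", "/c"],
    "/a": ["/a1", "/a2"],
    "/b": ["/b1"],
}
print(len(crawl(links, "/", max_hops=1)))  # 4
print(len(crawl(links, "/", max_hops=2)))  # 7
```

With a limit of only 2 hops the crawl already reaches 7 pages, so the hop
limit says nothing about how many pages get indexed.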
--
Alexander Barkov
IZHCOM, Izhevsk
email: [EMAIL PROTECTED] | http://www.izhcom.ru
Phone: +7 (3412) 51-55-45 | Fax: +7 (3412) 78-70-10
ICQ: 7748759