Hello!


Billy wrote:
> I've just got udmsearch going and started to index
> some sites, however things did not go as planned and I
> had to drop table dict in mysql to stop udmsearch in
> midstream (you guys know a better way? - I'm keen to
> learn) 

Press Ctrl-C in the terminal where the indexer is running.


> 
> What is crc in url table for?
> 

It is used to eliminate duplicate documents, for example mirrors
of the same site.
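
To illustrate the idea only (this is not UdmSearch's actual code), here is
a minimal Python sketch that uses a CRC32 checksum of the document body to
spot the same document under different URLs. The function name and the
sample URLs are made up for the example.

```python
import zlib


def find_duplicates(documents):
    """Map each duplicate URL to the first URL seen with the same checksum.

    `documents` maps URL -> body bytes.
    """
    first_seen = {}   # checksum -> first URL with that checksum
    duplicates = {}   # duplicate URL -> original URL
    for url, body in documents.items():
        crc = zlib.crc32(body)
        if crc in first_seen:
            duplicates[url] = first_seen[crc]
        else:
            first_seen[crc] = url
    return duplicates


# Hypothetical pages: the mirror serves the same body as the main site.
pages = {
    "http://www.example.com/index.html": b"Welcome to page A",
    "http://mirror.example.org/index.html": b"Welcome to page A",
    "http://www.example.com/other.html": b"Welcome to page B",
}
print(find_duplicates(pages))
```

Equal checksums make it very likely, but not certain, that two documents
are the same, so a real indexer might compare more than the checksum.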




> Can I use "include servers.conf" or similar to include
> a list of servers into the main conf file to make
> maintenance easier?

Yes, take a look in indexer.conf-dist. There are explanations of
all available commands.



> How do I get rid of the crappy text column in url
> because it really sucks... who wants to see:
> Wednesday, January 05 About Advertising Forums Contact
> Customize Downloads Links News Letter News Archive
> #
> ...as a summary of a document? Is there any way to
> improve that?

If you are indexing your own site, you can serve a different top of the
document when the User-Agent is UdmSearch. If it is not your site, I have
no idea how to do it better.
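
For your own site, the trick could look roughly like the Python sketch
below. It is only an illustration: the function name, the navigation text
and the "UdmSearch/2.2" header value are assumptions, so check what your
indexer really sends as User-Agent before relying on it.

```python
NAVIGATION = "About Advertising Forums Contact Customize Downloads"


def render_page(user_agent, content):
    """Return the page text, leaving out the navigation bar for the indexer."""
    # Assumption: the indexer's User-Agent header contains "UdmSearch".
    if "UdmSearch" in (user_agent or ""):
        return content
    return NAVIGATION + "\n" + content


# The indexer gets only the real text, a browser gets the navigation too.
print(render_page("UdmSearch/2.2", "Real article text."))
print(render_page("Mozilla/4.7", "Real article text."))
```

This way the text stored for the summary starts with the real content
instead of the menu.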



> How do I improve the search results by changing
> relevance of title and body etc... how effective
> actually is this technique?

Take a look in indexer.conf-dist



> Is it possible to stop the indexer from indexing
> stupid and irrelevant words like "the, it, to, and,
> we" etc. Is this what "stopwords" is for and if so how
> do I use it?

Yes. It is explained in the INSTALL file at the top level of the
UdmSearch distribution.
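
Just to show what a stopword list does in general (this is not how
UdmSearch stores or loads its stopwords, see INSTALL for that), here is a
small Python sketch. The word list and the function name are made up.

```python
STOPWORDS = {"the", "it", "to", "and", "we"}


def index_words(text, stopwords=STOPWORDS):
    """Split text into lowercase words and drop the stopwords."""
    words = text.lower().split()
    return [w for w in words if w not in stopwords]


print(index_words("We went to the shop and it was closed"))
# → ['went', 'shop', 'was', 'closed']
```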


> Is it really necessary to have a word list of about
> 250,000 in table dict from just ONE site!?
What do you mean?



> Why do I get no description and no or only a few
> keywords in the respective columns? Are these for meta
> tags?
Yes, they are filled from META tags.
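
If a page has no such tags, there is nothing to put into those columns.
As an illustration of what is meant, the Python sketch below pulls the
description and keywords out of a made-up page using the standard
html.parser module. It is not the indexer's own parser, and the class and
function names are hypothetical.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the content of the description and keywords META tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.meta[name] = attributes.get("content") or ""


def extract_meta(html):
    parser = MetaExtractor()
    parser.feed(html)
    return parser.meta


page = """<html><head>
<title>Test</title>
<META NAME="Description" CONTENT="A short summary of the page">
<META NAME="Keywords" CONTENT="search, indexer">
</head><body>Text</body></html>"""
print(extract_meta(page))
```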



> How do I index only certain parts of a site and not
> others?

Use the Allow/Disallow indexer.conf commands. Those are explained in
indexer.conf-dist.



> Why if I type www.testdomain.com/ as a server will it
> only index index.html or the first page?
Maybe index.html contains no links to other pages on this server.



> Does followoutside no/yes mean follow outside / on the
> server or follow outside the server completely to a
> whole separate domain?
Completely, that is, outside the server to a separate domain.



> Maxhops does not seem to work. I had set it to 256 and
> it went on for longer than that.

MaxHops is the distance in "mouse clicks" from the first page to the
current one. 256 is a VERY big number. Note that it does not mean "the
total number of pages", which may be considerably bigger.
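
The difference between hops and the number of pages can be seen in the
following Python sketch. It is a plain breadth-first walk over a made-up
site, written only to illustrate the point. It is not the indexer's code,
and the names `crawl`, `links` and `max_hops` are mine.

```python
from collections import deque


def crawl(links, start, max_hops):
    """Breadth-first walk of `links` (page -> list of linked pages).

    Returns a dict page -> number of hops ("mouse clicks") from `start`,
    containing only pages reachable within `max_hops` clicks.
    """
    hops = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if hops[page] == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in links.get(page, []):
            if target not in hops:
                hops[target] = hops[page] + 1
                queue.append(target)
    return hops


# A tiny made-up site: the front page links to 3 pages,
# and each of those links to 3 more.
site = {"/": ["/a", "/b", "/c"]}
for section in ("/a", "/b", "/c"):
    site[section] = [f"{section}/{n}" for n in range(3)]

found = crawl(site, "/", max_hops=2)
print(len(found), max(found.values()))
# → 13 2
```

With a limit of only 2 hops the walk already visits 13 pages, so with a
limit of 256 the number of pages is practically unlimited.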



-- 
Alexander Barkov
IZHCOM, Izhevsk
email:    [EMAIL PROTECTED]      | http://www.izhcom.ru
Phone:    +7 (3412) 51-55-45 | Fax: +7 (3412) 78-70-10
ICQ:      7748759