OK, so I've been reading the source code, and I'm having real trouble with
what some parts of the url table are for...

Why bother to compute a crc32 for the urls? Do I need it? [I'm currently
using crc-multi db mode on MySQL]

It seems to be the primary key, but then what's the point of rec_id? Since
rec_id autoincrements, it is guaranteed to be unique, so surely it's a
better key than a crc, which by your own analysis collides for about 250 of
any given 1,600,000 urls?
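
For what it's worth, a back-of-envelope birthday estimate lands in the same
ballpark as that figure. The sketch below assumes crc32 values behave like
uniformly random 32-bit numbers (my assumption, not something I found in the
source), and it counts expected colliding pairs, which may not be exactly how
the 250 was measured. The function name is just mine for illustration.

```python
def expected_colliding_pairs(n_urls: int, hash_bits: int = 32) -> float:
    """Birthday-style estimate of how many URL pairs share a hash value.

    Assumes the hash is uniformly distributed over 2**hash_bits values.
    """
    pairs = n_urls * (n_urls - 1) / 2   # number of distinct URL pairs
    return pairs / 2 ** hash_bits       # each pair collides with prob 1/2**bits


estimate = expected_colliding_pairs(1_600_000)
print(f"expected colliding pairs: {estimate:.0f}")
```

That comes out near 300, so a few hundred collisions in 1,600,000 urls seems
to be what one should expect from any 32-bit hash.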

Why is there a keywords field? I thought that the search worked by:

0)  computing the crc's of the keywords we're looking for
1)  looking up those crc's in the dict tables
2)  using url_id as a foreign key, looking up the matching row by rec_id
in the url table
3)  also looking up all the other information from the url table, such as
description, title, text

Surely this doesn't need a keyword field, since we're searching other tables
based on keywords anyway?
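
To make sure I've understood the flow, here is a minimal sketch of how I
picture it. Everything in it is an assumption on my part: it uses an
in-memory SQLite database instead of MySQL, a single hypothetical dict table
with a word_id column standing in for the dict tables, and zlib.crc32 as the
hash. Only rec_id, url_id, title, description and txt are names taken from
the real schema.

```python
import sqlite3
import zlib


def crc(word: str) -> int:
    # Assumption: keywords are lowercased before hashing.
    return zlib.crc32(word.lower().encode("utf-8"))


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE url (
        rec_id INTEGER PRIMARY KEY AUTOINCREMENT,
        url TEXT, title TEXT, description TEXT, txt TEXT
    );
    -- Hypothetical single dict table: one row per (word crc, document).
    CREATE TABLE dict (word_id INTEGER, url_id INTEGER);
""")


def index_page(url, title, description, txt):
    """Store the page, then one dict row per distinct word in title and txt."""
    cur = conn.execute(
        "INSERT INTO url (url, title, description, txt) VALUES (?, ?, ?, ?)",
        (url, title, description, txt),
    )
    rec_id = cur.lastrowid
    words = set(f"{title} {txt}".lower().split())
    conn.executemany(
        "INSERT INTO dict (word_id, url_id) VALUES (?, ?)",
        [(crc(w), rec_id) for w in words],
    )
    return rec_id


def search(keyword):
    """Steps 0-3: hash the keyword, match dict rows, join to url on rec_id."""
    return conn.execute(
        "SELECT u.url, u.title, u.description, u.txt "
        "FROM dict d JOIN url u ON u.rec_id = d.url_id "
        "WHERE d.word_id = ? ORDER BY u.rec_id",
        (crc(keyword),),
    ).fetchall()


index_page("http://example.com/a", "Apple pie", "", "how to bake an apple pie")
index_page("http://example.com/b", "Car repair", "Fixing cars", "change the oil")
results = search("apple")
print(results)
```

Nothing in this picture ever reads a keywords column on the url table, which
is why I'm asking what it is for.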


What's the difference between txt and description? I assume that description
holds the contents of the description meta-tag, if there is one, and txt is
an extract of the page text.
Why are there both? Surely a single unified field would do, containing the
description if it's there, or an extract if it's not?
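
In other words, something like the following fallback, done once at index
time. This is only a sketch of what I mean: summary_for and the 200-character
limit are hypothetical and do not come from the source.

```python
from typing import Optional


def summary_for(description: Optional[str], txt: Optional[str],
                limit: int = 200) -> str:
    """Return the meta description if present, else an extract of the text."""
    if description and description.strip():
        return description.strip()
    # No usable description: fall back to the first `limit` characters of txt.
    return (txt or "")[:limit]


print(summary_for("Fixing cars", "change the oil"))
print(summary_for("", "how to bake an apple pie", limit=11))
```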


Thank you very much,
Gary