According to ronald:
> when htdig exports results from an index as textformat it generates two
> files. The files look like this :
>
> file1:
> 0 u:http://www.htdig.org/ t:ht://Dig -- Internet search engine software a:0
> m:936027636 s:373 h: h: l:940510479 L:2 I:373
> d:http://www.htdig.org/www.htdig.orght://Dig Search Software (yes, the developers
> use it)ht://DigParent Directory A:
First field: doc ID
u: URL of doc
t: doc title
a: doc state (refer to source)
m: date/time last modified, sec since 1970-01-01 00:00:00 UTC
s: doc size in bytes
h: doc head (excerpt of first max_head_length bytes of doc)
h: (2nd) meta description contents
   (this 2nd h is a bug - it really should be a unique value
   like D or something)
l: date/time document was indexed (sec since 1970)
L: no. of links doc has to other docs
I: "docImageSize" - has nothing to do with images, but seems to
   contain document size, and may be cumulative in some
   circumstances - can anyone else make any sense of this?
d: link descriptions - text of links to this doc, ^A separated
A: anchor names (bookmarks) in doc, ^A separated
All fields are tab (^I) separated. Sub-fields of d & A use the ^A separator.
The doc head field has all runs of white space (space, tab, newline, etc.)
collapsed to single spaces.
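
For anyone wanting to read these records from a script, here is a rough
Python sketch that follows the field list above. It is not taken from the
htdig source. The function names, the dictionary key names, and the choice
to treat a, m, s, l, L and I as integers are my own assumptions; the tab
and ^A separators are as described above.

```python
from datetime import datetime, timezone

# Field letters as described in the list above. The key names on the
# right-hand side are made up for this sketch.
FIELD_NAMES = {
    "u": "url", "t": "title", "a": "state", "m": "modified",
    "s": "size", "l": "indexed", "L": "link_count", "I": "image_size",
    "d": "link_descriptions", "A": "anchors",
}
INT_FIELDS = {"a", "m", "s", "l", "L", "I"}   # assumed to hold integers
LIST_FIELDS = {"d", "A"}                      # sub-fields separated by ^A


def to_int(value):
    """Return value as an int if it looks like one, else the raw string."""
    try:
        return int(value)
    except ValueError:
        return value


def parse_doc_record(line):
    """Parse one tab-separated document record into a dict."""
    fields = line.rstrip("\n").split("\t")
    record = {"id": int(fields[0])}
    seen_head = False
    for field in fields[1:]:
        key, sep, value = field.partition(":")
        if not sep:
            continue  # not a "letter:value" field, skip it
        if key == "h":
            # The first h: is the doc head, the second the meta description.
            record["meta_description" if seen_head else "head"] = value
            seen_head = True
        elif key in LIST_FIELDS:
            record[FIELD_NAMES[key]] = value.split("\x01") if value else []
        elif key in INT_FIELDS:
            record[FIELD_NAMES[key]] = to_int(value)
        elif key in FIELD_NAMES:
            record[FIELD_NAMES[key]] = value
    return record


def to_utc(seconds):
    """Convert an m: or l: value (seconds since 1970) to a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Splitting each field on the first colon only is what keeps URLs and titles
such as "ht://Dig" intact, since they contain colons of their own.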
> file2:
This is db.wordlist...
> 01oct99 i:115 l:0 w:100998 c:2
> 01oct99 i:116 l:0 w:100998 c:2
> 01oct99 i:45 l:6 w:100381 c:2
> 01oct99 i:46 l:0 w:100998 c:2
> 02aug1999 i:48l:361 w:639 a:2
> 02jun1999 i:50l:262 w:1382 c:2 a:2
> 02mar1999 i:53l:378 w:622 a:2
> 02may1999 i:51l:280 w:1349 c:2 a:2
First field: indexed word (lower case)
i: doc ID (to match up with records from above)
l: location of word in doc (0-1000, i.e. in units of a tenth of a percent)
w: weight of word in searches
c: no. of occurrences of word in document, if > 1
a: index into "A:" list above, to indicate which anchor name,
   if any, preceded this word
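
The word list records can be read the same way. Again this is only a
sketch based on the field list above, not code from htdig. The function
name and key names are hypothetical. It assumes that all the lettered
fields hold integers, and it splits on any white space rather than only on
tabs, so that it also copes with samples pasted into mail. A missing c:
is taken to mean one occurrence, as the "if > 1" note above implies.

```python
# Field letters as described in the list above. The key names on the
# right-hand side are made up for this sketch.
WORD_FIELDS = {
    "i": "doc_id", "l": "location", "w": "weight",
    "c": "count", "a": "anchor",
}


def parse_word_record(line):
    """Parse one word list record into a dict."""
    parts = line.split()
    # c: is only written when the count is > 1, and a: only when an
    # anchor preceded the word, so start from those defaults.
    record = {"word": parts[0], "count": 1, "anchor": None}
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if sep and key in WORD_FIELDS:
            record[WORD_FIELDS[key]] = int(value)
    return record


def location_percent(record):
    """Convert the 0-1000 location value to a percentage of the doc."""
    return record["location"] / 10.0
```

Records can then be matched to their documents through the doc_id value,
which corresponds to the first field of the document records.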
--
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930