Hi Guys,
Can anyone explain how the contentLength field is populated? I've been
indexing a few sites, and some have this field available while others
don't. I've looked through the MoreIndexingFilter.java, ParseData.java,
HttpHeaders.java and Metadata.java source files, as well as the logs of
the various crawls (fetch...index), but I can't figure out why.
I'm using Nutch trunk with index-more and query-more enabled.
Regards,
Hilkiah G. Lavinier MEng (Hons), ACGI