Lucene does not provide this out of the box. You will have to write a program that splits each file into its individual documents and feeds the results to Lucene. If I remember right, these files are in XML, so you can probably use SAX or a pull parser. A number of TREC participants have used Lucene in the past, so you may be able to find someone on the web who has been generous enough to share their implementation.
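
As a rough illustration of the splitting step, here is a minimal sketch in Java. It makes several assumptions that are not confirmed anywhere in this thread: that each document is wrapped in <DOC> ... </DOC> tags sitting on their own lines, and that the identifier lives in a <DOCNO> element. It also uses plain line scanning rather than SAX, since a file holding many <DOC> elements with no single root element may not parse as well-formed XML. TrecSplitter and TrecDoc are hypothetical names made up for this example, not part of Lucene.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: cuts one multi-document file into separate records. */
public class TrecSplitter {

    /** One record: the identifier plus everything between <DOC> and </DOC>. */
    public static final class TrecDoc {
        public final String docno;
        public final String body;

        TrecDoc(String docno, String body) {
            this.docno = docno;
            this.body = body;
        }
    }

    // Assumed layout: <DOCNO> some-id </DOCNO>, possibly padded with whitespace.
    private static final Pattern DOCNO =
            Pattern.compile("<DOCNO>\\s*(.*?)\\s*</DOCNO>", Pattern.DOTALL);

    /** Reads the whole input and returns one TrecDoc per <DOC> ... </DOC> block. */
    public static List<TrecDoc> split(Reader in) {
        List<TrecDoc> docs = new ArrayList<>();
        StringBuilder current = null; // null while we are outside a <DOC> block
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.equals("<DOC>")) {
                    current = new StringBuilder();
                } else if (trimmed.equals("</DOC>")) {
                    if (current != null) {
                        String raw = current.toString();
                        Matcher m = DOCNO.matcher(raw);
                        // Fall back to an empty id if no <DOCNO> was found.
                        String docno = m.find() ? m.group(1) : "";
                        docs.add(new TrecDoc(docno, raw));
                    }
                    current = null;
                } else if (current != null) {
                    current.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docs;
    }
}
```

Each TrecDoc returned by split(new FileReader(path)) would then become one Lucene Document, with the docno stored as an untokenized field and the body as tokenized text, added to the index through IndexWriter.addDocument. If the collection is large, you would probably want to hand each record to the indexer as it is found rather than collect them all in a list.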

trupti mulajkar wrote:
Hi, can anyone suggest how to split files using Lucene?
I am trying to index the TREC collection using lucene-1.4.3.
I want Lucene to read the multiple files within a single TREC file and create an
index accordingly.

cheers,
trupti mulajkar
MSc Advanced Computer Science




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
