Hi Nader,

You mentioned using Lucene for your http://www.bayt.com web site. Do you convert CVs or any other documents to XML before submitting them to Lucene for indexing?

Regards,
Jagdip

-----Original Message-----
From: Nader S. Henein [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 08, 2003 1:55 AM
To: 'Lucene Users List'
Subject: RE: converting text/doc to XML

XML is an organized, standardized format, so let's say your document has the following characteristics:

File name: foobar.doc
First line (title): Foo Bar
File content:
Blah blah blah blah
Blah blah blah blah
Blah blah blah blah
Blah blah blah blah

Then you have to read the file (a simple file read; Java can do this in about ten different ways, pick one), put each of the file's characteristics in a variable, and then write it out as valid XML:

<doc doc_id="1">
  <file_name>foobar.doc</file_name>
  <title>Foo Bar</title>
  <content>
Blah blah blah blah
Blah blah blah blah
Blah blah blah blah
Blah blah blah blah
  </content>
</doc>

There are probably packages that will do this for you, but it's so simple you could pull it off in under a hundred lines. It's also a good exercise to familiarize yourself with XML (if you haven't played around with it before).

-----Original Message-----
From: Jagdip Singh [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 08, 2003 9:41 AM
To: 'Lucene Users List'
Subject: converting text/doc to XML

Hi,

How can I convert text/doc to XML? Please help.

Regards,
Jagdip

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
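
For anyone following the thread, the steps Nader describes (read the file, keep each characteristic in a variable, write the result out as XML) might look roughly like the Java sketch below. It is only an illustration under some assumptions: the input is a plain-text file whose first line is the title, the JDK is version 8 or later, and the class and method names (DocToXml, escape, toXml, convert) are made up for this example and do not come from Lucene or from Nader's code. A binary Word .doc file would need a separate text-extraction step first, which is not shown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper class; the names here are illustrative only.
class DocToXml {

    // Escape the characters that are special in XML text and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Build the <doc> element from values already held in variables.
    static String toXml(int docId, String fileName, String title, String content) {
        return "<doc doc_id=\"" + docId + "\">\n"
             + "  <file_name>" + escape(fileName) + "</file_name>\n"
             + "  <title>" + escape(title) + "</title>\n"
             + "  <content>" + escape(content) + "</content>\n"
             + "</doc>\n";
    }

    // Read a plain-text file: the first line is taken as the title,
    // the remaining lines as the content (an assumption of this sketch).
    static String convert(int docId, Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        String title = lines.isEmpty() ? "" : lines.get(0);
        String content = lines.size() > 1
                ? String.join("\n", lines.subList(1, lines.size()))
                : "";
        return toXml(docId, file.getFileName().toString(), title, content);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocToXml <text-file>");
            return;
        }
        System.out.print(convert(1, Paths.get(args[0])));
    }
}
```

The escaping step is the part that is easy to forget when building XML by string concatenation: a stray & or < in a CV would otherwise make the output malformed. The sketch does not filter control characters that XML 1.0 disallows, so real input may need an extra cleanup pass.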