I am very new in Lucene. Your comments are really helpfull for me. Thank you. Harun.
----- Original Message ----- From: <[EMAIL PROTECTED]> To: "Lucene Users List" <[EMAIL PROTECTED]> Sent: Saturday, January 12, 2002 5:41 PM Subject: Re: I want to search on BOTH --> (1) "XML" data and (2) "Text" data. > Hi, > I don't know how much you know about Lucene, but Otis is exactly right. > You can create two (or more) different Lucene Documents and retrieve the > information you want and put it into fields. > > I am in the midst of creating a Document Repository that I haven't > gotten up yet. > > Here are some links to different documents that might help > > > XML Based Document Implementations > http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00346.html > http://marc.theaimsgroup.com/?l=lucene-dev&m=100723333506246&w=2 > > The best HTML Document, is the one used in the lucene demo. > > I hope this helps. > > > --Peter > > > > On Saturday, January 12, 2002, at 01:54 PM, Otis Gospodnetic wrote: > > > Hello, > > > > You could write an XML parser (see http://xml.apache.org/ for some XML > > tools) and store XML elements as Fields in Lucene Documents. > > To search for 'Hello' and 'Hello Mr. President!' you can store the > > whole article body as a Text (or maybe UnStored) Field. > > You can also look on www.mail-archive.com and search this list's > > archive for some related discussions. Try searching for Philip Ogren > > (I think I got the name right), he sent some code that lets you go from > > XML -> Lucene Document quickly, I think. > > > > Otis > > > > > > --- Harun Altay <[EMAIL PROTECTED]> wrote: > >> Hello Friends, > >> > >> I want to search on BOTH --> (1) "XML" data and (2) "Text" data. > >> > >> > >> (1). "Text Data" --> mostly consist of HTML pages, residing on the > >> server... > >> example : hundreds of HTML, TXT file, etc... > >> > >> > >> (2). "XML Data" --> for example, Articles that was stored in XML > >> format, lets say like this : > >> > >> <article> > >> <article code> .... </article code> > >> <article title> .... </article title> > >> <author> .... </author> > >> <date> ... </date> > >> <etc> ... </etc> > >> > >> <body of th eTEXT> > >> . > >> .......................... the article body, TEXT ...... > >> . > >> . > >> . > >> . > >> </body of th eTEXT> > >> > >> </article> > >> > >> In this type of search, we need to search this "XML-based author > >> file" in two different ways : > >> 2.a. First Way of searching : Searching XML file through its > >> KEYWORDS, like : date = "Jan-01-2002" and author = "George > >> Washington" > >> 2.b. Second Way of Searching : Free search on the article body. > >> For example : All the articles, whose body has the word 'Hello', or > >> the sentence 'Hello Mr. President!' > >> > >> > >> Note-1: > >> > >> XML file may reside either Operating System level, or in a > >> XML-supporting DATABASE, as well. > >> > >> > >> Note-2: > >> > >> If I need to have them, I can write extra java classes to support > >> some more functionality, if possible... > >> > >> > >> Thank you very much, > >> Harun. > >> > >> > >> > >> > >> > >> > > > > > > __________________________________________________ > > Do You Yahoo!? > > Send FREE video emails in Yahoo! Mail! > > http://promo.yahoo.com/videomail/ > > > > -- > > To unsubscribe, e-mail: <mailto:lucene-user- > > [EMAIL PROTECTED]> > > For additional commands, e-mail: <mailto:lucene-user- > > [EMAIL PROTECTED]> > > > > > > > -- > To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> > For additional commands, e-mail: <mailto:[EMAIL PROTECTED]> > > > -- To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>