Hello,

You could use an XML parser (see http://xml.apache.org/ for some XML tools) to read each article and store its XML elements as Fields in a Lucene Document. To search the body for a word such as 'Hello' or a phrase such as 'Hello Mr. President!', you can index the whole article body as a Text Field (or maybe UnStored, which indexes the text without keeping a copy of it in the index).

You can also search this list's archive at www.mail-archive.com for related discussions. Try searching for Philip Ogren (I think I got the name right); I believe he sent some code that lets you go from XML -> Lucene Document quickly.
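To make the first step concrete, here is a minimal sketch (not Philip's code) that uses the JDK's built-in DOM parser to turn the child elements of an article into name/value pairs. The class name XmlToFields, the method extractFields, and the element names in the usage example are my own assumptions. Your example uses names with spaces in them, which are not legal XML element names, so I assume well-formed names such as author, date and body. The Lucene calls appear only in comments and assume the Lucene 1.x Field factory methods, so check them against the version you have.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlToFields {

    /**
     * Parses one XML article and returns a map of
     * (child element name -> trimmed text content), in document order.
     * Only direct children of the root element are considered.
     */
    public static Map<String, String> extractFields(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document dom = builder.parse(new InputSource(new StringReader(xml)));
        Element root = dom.getDocumentElement();

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and comments between elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;

        // With Lucene on the classpath, each entry would become a Field on a
        // Lucene Document instead of a map entry (assumed Lucene 1.x API):
        //   doc.add(Field.Keyword("author", value));  // exact-match lookups
        //   doc.add(Field.Text("body", value));       // tokenized free text
        //   doc.add(Field.UnStored("body", value));   // same, but not stored
    }
}
```

Calling extractFields on a string like "<article><author>George Washington</author><date>Jan-01-2002</date><body>Hello Mr. President!</body></article>" gives you three entries keyed by author, date and body, which you can then hand to Lucene as Fields.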
Otis

--- Harun Altay <[EMAIL PROTECTED]> wrote:
> Hello Friends,
>
> I want to search on BOTH --> (1) "XML" data and (2) "Text" data.
>
> (1). "Text Data" --> mostly consist of HTML pages, residing on the
> server...
> example : hundreds of HTML, TXT file, etc...
>
> (2). "XML Data" --> for example, Articles that was stored in XML
> format, lets say like this :
>
> <article>
> <article code> .... </article code>
> <article title> .... </article title>
> <author> .... </author>
> <date> ... </date>
> <etc> ... </etc>
>
> <body of th eTEXT>
> .
> .......................... the article body, TEXT ......
> .
> </body of th eTEXT>
>
> </article>
>
> In this type of search, we need to search this "XML-based author
> file" in two different ways :
> 2.a. First Way of searching : Searching XML file through its
> KEYWORDS, like : date = "Jan-01-2002" and author = "George
> Washington"
> 2.b. Second Way of Searching : Free search on the article body.
> For example : All the articles, whose body has the word 'Hello', or
> the sentence 'Hello Mr. President!'
>
> Note-1:
> XML file may reside either Operating System level, or in a
> XML-supporting DATABASE, as well.
>
> Note-2:
> If I need to have them, I can write extra java classes to support
> some more functionality, if possible...
>
> Thank you very much,
> Harun.