Hello,

You could write an XML parser (see http://xml.apache.org/ for some XML
tools) and store each XML element as a Field in a Lucene Document.
To search for 'Hello' or 'Hello Mr. President!', you can index the
whole article body as a Text (or maybe UnStored) Field.
You can also search this list's archive at www.mail-archive.com for
some related discussions.  Try searching for Philip Ogren (I think I
got the name right); I believe he sent some code that lets you go from
XML -> Lucene Document quickly.
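
To make the "XML elements as Fields" idea a bit more concrete, here is a
rough sketch of the parsing half only, using the DOM parser that ships
with the JDK. It is not Philip's code and it does not touch Lucene at
all: it just turns one <article> into a map of element name -> text,
and each entry would then become one Field in a Document using whatever
Field factory methods your Lucene version has. The class name
ArticleFields and the element names (author, date, body) are made up
for the example; real element names cannot contain spaces, so something
like <article code> would need a different name.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: flattens one <article> into name -> text pairs.
public class ArticleFields {

    /** Returns the direct child elements of the root as name -> trimmed text, in document order. */
    public static Map<String, String> extract(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();

        Map<String, String> fields = new LinkedHashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace and comments; keep only real elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                fields.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<article>"
          + "  <author>George Washington</author>"
          + "  <date>Jan-01-2002</date>"
          + "  <body>Hello Mr. President! ...</body>"
          + "</article>";

        // Each entry is a candidate Field: author/date for exact-match
        // lookups, body for free-text search.
        for (Map.Entry<String, String> e : extract(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

With the values separated like this, the author and date entries cover
the "search by keywords" case and the body entry covers the free-text
case.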

Otis


--- Harun Altay <[EMAIL PROTECTED]> wrote:
> Hello Friends,
> 
> I want to search on BOTH --> (1) "XML" data and (2) "Text" data.
> 
> 
> (1). "Text Data" --> mostly consists of HTML pages residing on the
> server...
> example: hundreds of HTML and TXT files, etc...
> 
> 
> (2). "XML Data" --> for example, articles that are stored in XML
> format, let's say like this:
> 
> <article>
> <article code>  ....   </article code>
> <article title>   ....  </article title>
> <author>  .... </author>
> <date> ... </date>
> <etc> ... </etc>
> 
> <body of the TEXT>
> .
> .......................... the article body, TEXT ......
> .
> .
> .
> .
> </body of the TEXT>
> 
> </article>
> 
> In this type of search, we need to search this "XML-based author
> file" in two different ways :
>     2.a. First Way of searching : Searching XML file through its
> KEYWORDS, like : date = "Jan-01-2002" and author = "George
> Washington"
>     2.b. Second Way of Searching : Free search on the article body.
> For example : All the articles, whose body has the word 'Hello', or
> the sentence 'Hello Mr. President!' 
> 
> 
> Note-1:
> 
> The XML files may reside either at the Operating System level or in
> an XML-supporting DATABASE.
> 
> 
> Note-2:
> 
> If I need to have them, I can write extra java classes to support
> some more functionality, if possible...
> 
> 
> Thank you very much,
> Harun.

