So you basically only want to index the parts of your document that sit
inside <table> Foo Bar </table> tags.

I'm not sure if there's an easier way, but here's what I do:
1)  Parse the XML files using JDOM (or any XML parser that floats your boat)
into a Map or an ArrayList
2)  Create a Lucene document and loop through the aforementioned structure
(Map or ArrayList), adding field/value pairs to it like so:
        contentDoc.add(new Field(fieldName, fieldValue, true, true, true));

So all you would need to do is put an if statement around the latter
statement, to the effect of

if (fieldName.equalsIgnoreCase("table")) {
        contentDoc.add(new Field(fieldName, fieldValue, true, true, true));
}
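
For what it's worth, here is a rough, self-contained sketch of that filtering
step. It assumes the parser output is a Map of element name to text; the class
name TableFieldFilter and the method keepOnly are made up for illustration, and
the Lucene call is left as a comment so the snippet compiles without Lucene on
the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: keeps only the parsed entries we want to index.
final class TableFieldFilter {

    // Returns the entries whose field name matches wantedName, ignoring case.
    static Map<String, String> keepOnly(Map<String, String> parsedFields, String wantedName) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : parsedFields.entrySet()) {
            // equalsIgnoreCase returns a boolean, so it is used directly as the condition.
            if (e.getKey().equalsIgnoreCase(wantedName)) {
                // With Lucene available, the field would be added to the document here:
                // contentDoc.add(new Field(e.getKey(), e.getValue(), true, true, true));
                kept.put(e.getKey(), e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample data standing in for whatever the XML parser produced.
        Map<String, String> parsed = new LinkedHashMap<>();
        parsed.put("title", "Some page");
        parsed.put("TABLE", "Foo Bar");
        parsed.put("footer", "Copyright");
        System.out.println(keepOnly(parsed, "table")); // prints {TABLE=Foo Bar}
    }
}
```

Note that a Map keyed by element name holds only one value per name, so a page
with several tables would need a List of name/value pairs instead.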


This may be overkill; someone feel free to correct me if I'm wrong.

Nader

-----Original Message-----
From: Karthik N S [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 19, 2004 1:01 PM
To: Lucene Users List
Subject: RE: SELECTIVE Indexing


Hey Lucene Users

My original intention was to index only certain portions
of the HTML [ not the whole document ].
If JTidy does not support this, then what are my options?

Karthik

-----Original Message-----
From: Viparthi, Kiran (AFIS) [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 1:43 PM
To: 'Lucene Users List'
Subject: RE: SELECTIVE Indexing


I doubt if it can be used as a plug in.
Would be good to know if it can be used as a plug in.

Regards,
Kiran.

-----Original Message-----
From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: 17 May 2004 12:30
To: Lucene Users List
Subject: RE: SELECTIVE Indexing


Hi

Can I use Tidy [ as a plug-in ] with Lucene?


with regards
Karthik

-----Original Message-----
From: Viparthi, Kiran (AFIS) [mailto:[EMAIL PROTECTED]
Sent: Monday, May 17, 2004 3:27 PM
To: 'Lucene Users List'
Subject: RE: SELECTIVE Indexing



Try using Tidy.
It creates a DOM Document from the HTML, which lets you apply XPath to it.
Hope this helps.
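
A minimal sketch of the XPath step, using only the JDK. It assumes you already
have an org.w3c.dom.Document; here the JDK's own parser stands in for Tidy and
is fed well-formed XHTML. The class TableTextExtractor and its methods are
hypothetical names, and whether Tidy's DOM supports every call used here (for
example getTextContent) is something to verify before relying on it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls the text of every <table> out of a DOM Document.
final class TableTextExtractor {

    // Selects all table elements and returns their text with whitespace collapsed.
    // The expression is case sensitive, so it assumes lowercase element names.
    static List<String> tableTexts(Document dom) throws Exception {
        NodeList tables = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//table", dom, XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < tables.getLength(); i++) {
            texts.add(tables.item(i).getTextContent().trim().replaceAll("\\s+", " "));
        }
        return texts;
    }

    // Stand-in for Tidy: parses well-formed XHTML with the JDK's XML parser.
    static Document parse(String xhtml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><p>skip me</p>"
                + "<table><tr><td>Foo Bar</td></tr></table></body></html>";
        // Each returned string is what you would hand to Lucene as a field value.
        System.out.println(tableTexts(parse(page))); // prints [Foo Bar]
    }
}
```

With nested tables, "//table" also matches the inner ones, so their text would
show up twice; a narrower expression would be needed in that case.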

Kiran.

-----Original Message-----
From: Karthik N S [mailto:[EMAIL PROTECTED]
Sent: 17 May 2004 11:59
To: Lucene Users List
Subject: SELECTIVE Indexing



Hi all

   Can somebody tell me how to index only a certain portion of an HTML file?

   ex:-
        <table .....>
               ....

         </table>


with regards
Karthik




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

