Indexing/Querying Annotations and Fields for a document

2008-03-17 Thread lucene-seme1 s
Hello, I am a newbie here and still experimenting with Lucene. I have annotations and features generated by GATE for many documents and would like to index the original content of the documents in addition to the generated annotations. The annotations are in the form of [ John loves fishing]. I w

Re: Indexing/Querying Annotations and Fields for a document

2008-03-17 Thread Grant Ingersoll
I think there are a couple of ways you can approach this, although I have never used GATE. If these annotations are marked inline in your content, then you can either preprocess the files to separate them out and index as you normally would, or you can use the relatively new TeeTokenFil

Re: Indexing/Querying Annotations and Fields for a document

2008-03-17 Thread lucene-seme1 s
I already have the document preprocessed and the annotations (i.e. John) are already stored in an array with features attached to some annotations (such as the root and lemma of the word). Can you please elaborate some more on how to "index them as normally would"? Regards, JK On Mon, Mar 17, 2

Re: Indexing/Querying Annotations and Fields for a document

2008-03-17 Thread Grant Ingersoll
You would parse the XML (or whatever) into separate strings, and put each piece into its own Field in a Lucene Document. For instance: Document doc = new Document(); String bodyText = getBody(input); String peopleText = getPeople(input); Field body = new Field("body", bodyText); Field people = new Field("p
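
The "parse into separate strings" step could look roughly like the sketch below. It only covers the splitting and deliberately leaves out the Lucene calls. The inline markup (<Person>John</Person>) is an assumption made for illustration, since the thread does not show GATE's actual output format, and the class and method names are hypothetical. Each entry in the returned map would become one Field on the Document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits inline-annotated text into one string per field.
class AnnotationFieldSplitter {

    // ASSUMPTION: annotations look like <Type>text</Type> and are not nested.
    private static final Pattern TAG = Pattern.compile("<(\\w+)>(.*?)</\\1>");

    // Returns "body" (the text with markup removed) plus one entry per
    // annotation type, holding the annotated strings joined by spaces.
    // An annotation type literally named "body" would overwrite the body entry.
    static Map<String, String> split(String input) {
        Map<String, List<String>> byType = new LinkedHashMap<>();
        StringBuilder body = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            body.append(input, last, m.start()); // plain text before the annotation
            body.append(m.group(2));             // annotated text stays in the body
            String type = m.group(1).toLowerCase(Locale.ROOT);
            byType.computeIfAbsent(type, k -> new ArrayList<>()).add(m.group(2));
            last = m.end();
        }
        body.append(input.substring(last));      // trailing plain text

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("body", body.toString());
        for (Map.Entry<String, List<String>> e : byType.entrySet()) {
            fields.put(e.getKey(), String.join(" ", e.getValue()));
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields = split("<Person>John</Person> loves fishing");
        // Each (name, value) pair would be added to the Document as its own Field.
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Note that once the annotations are in their own field, a query can match "John" as a person, but the link between the annotation and its position in the body text is lost.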

Re: Indexing/Querying Annotations and Fields for a document

2008-03-18 Thread mark harwood
feasible and the state of play with payloads? Cheers, Mark

Re: Indexing/Querying Annotations and Fields for a document

2008-03-18 Thread lucene-seme1 s
ay of doing this which avoids this problem might be to look at > the new payloads API. > Anyone care to wade in with if this is feasible and the state of play with > payloads? > > Cheers > Mark > > > - Original Message > From: Grant Ingersoll <[EMAIL PROT

Re: Indexing/Querying Annotations and Fields for a document

2008-03-18 Thread markharw00d
lucene-seme1 s wrote: "Can you please share the custom Analyzer you have?" Unfortunately it's not mine to share, but see the Lucene Token and Analyzer classes; it's not particularly hard to code.