Need Help for Wild Card Query Highlighting

2011-10-18 Thread Vidya Kanigiluppai Sivasubramanian
Hi, I am new to Lucene. I am using Lucene 2.4.1 in my project to search a text document. I need to perform a wildcard query. I am using the code given in Hrycon - blog. It works fine with complete words. When we do a wildcard query, we can only see the search hits but the text

Re: Need Help for Wild Card Query Highlighting

2011-10-18 Thread Ian Lea
Why 2.4.1? That is ancient and there have been many improvements since then. Google finds hits for "lucene highlight wild card", some of which contain solutions that may or may not be relevant to your problem. -- Ian. On Tue, Oct 18, 2011 at 8:17 AM, Vidya Kanigiluppai

RE: Need Help for Wild Card Query Highlighting

2011-10-18 Thread Vidya Kanigiluppai Sivasubramanian
Hi, Thanks for the reply. I have searched in Google. The options given did not work out. I see that most of the links are questions based on version change. So, just wanted someone to help me on using the wild card query. Do you have any link that solves my problem? Since I see people can

Re: How do you see if a tokenstream has tokens without consuming the tokens ?

2011-10-18 Thread Paul Taylor
On 18/10/2011 06:19, Steven A Rowe wrote: Hi Paul, You could add a rule to the StandardTokenizer JFlex grammar to handle this case, bypassing its other rules. Hmm, I don't really understand JFlex; that is a possibility, but I would prefer to do it in Java
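For readers following the thread's title question, one common way to check for upcoming items without consuming them is a one-element lookahead buffer. The sketch below is plain Java over a generic `Iterator`; the class name `PeekingIterator` is hypothetical, and it is not a wrapper around Lucene's `TokenStream`, only an illustration of the buffering idea under that assumption.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Generic one-element lookahead. A stand-in for the idea only,
// not a Lucene TokenStream wrapper (hypothetical class name).
final class PeekingIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private T buffered;          // element read ahead but not yet handed out
    private boolean hasBuffered; // true while 'buffered' holds a pending element

    PeekingIterator(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return hasBuffered || source.hasNext();
    }

    // Returns the next element without consuming it.
    T peek() {
        if (!hasBuffered) {
            if (!source.hasNext()) {
                throw new NoSuchElementException();
            }
            buffered = source.next();
            hasBuffered = true;
        }
        return buffered;
    }

    @Override
    public T next() {
        if (hasBuffered) {
            T value = buffered;
            buffered = null;
            hasBuffered = false;
            return value;
        }
        return source.next();
    }
}
```

Calling `peek()` repeatedly returns the same element; only `next()` advances. A token stream would need the same kind of buffering of its current state, which this sketch does not attempt.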

Re: How can I read records from Lucene

2011-10-18 Thread Ian Lea
IndexReader has assorted methods for reading terms and frequency vectors. See the javadocs for that class and TermDocs and TermEnum and TermFreqVector. Score is only relevant after a search. See TopDocs and ScoreDoc. If you want more specific advice, ask a more specific question. Or use

RE: How do you see if a tokenstream has tokens without consuming the tokens ?

2011-10-18 Thread Steven A Rowe
Hi Paul, On 10/18/2011 at 4:57 AM, Paul Taylor wrote: On 18/10/2011 06:19, Steven A Rowe wrote: Another option is to create a char filter that substitutes PUNCT-EXCLAMATION for exclamation points, PUNCT-PERIOD for periods, etc., Yes that is how I first did it No, I don't think you did.
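The substitution idea mentioned above can be sketched on plain strings. This is a minimal illustration, assuming a hypothetical `PunctuationMarker` class and a two-entry mapping; it does not use Lucene's char filter classes, and a real char filter would also have to correct character offsets, which this sketch ignores.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of the substitution idea; NOT Lucene's CharFilter API.
final class PunctuationMarker {
    // Hypothetical mapping; the placeholder names follow those mentioned in the thread.
    private static final Map<Character, String> PLACEHOLDERS = new LinkedHashMap<>();
    static {
        PLACEHOLDERS.put('!', "PUNCT-EXCLAMATION");
        PLACEHOLDERS.put('.', "PUNCT-PERIOD");
    }

    // Replaces each mapped punctuation character with its placeholder word.
    static String mark(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String placeholder = PLACEHOLDERS.get(c);
            if (placeholder == null) {
                out.append(c);
            } else {
                // Pad with spaces so the placeholder stands apart from neighbouring words.
                out.append(' ').append(placeholder).append(' ');
            }
        }
        return out.toString();
    }
}
```

For example, `PunctuationMarker.mark("a.b")` yields `a PUNCT-PERIOD b`. Whether a downstream tokenizer keeps such a placeholder as a single token depends on the tokenizer, so that part remains an open assumption here.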

OutOfMemoryError

2011-10-18 Thread Tamara Bobic
Hi all, I am using Lucene to query Medline abstracts and as a result I get around 3 million hits. Each of the hits is processed and information from a certain field is used. After a certain number of hits, somewhere around 1 million (not always the same number), I get an OutOfMemory exception that

Re: Case insensitive Keyword Analyser

2011-10-18 Thread Jamir Shaikh
Thanks a ton Anna.. It's working fine... On Sun, Oct 16, 2011 at 11:51 PM, Anna Hunecke a.hune...@topdesk.com wrote: Hi Jamir, you can easily combine Analyzers however you need it by filtering the output of one Analyzer with another. In your case, I would just write my own Analyzer class
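The "filter the output of one stage with another" idea can be sketched without Lucene as two composed functions: a keyword stage that keeps the whole input as one token, followed by a lowercasing stage. The class name `KeywordLowercasePipeline` and both stage names are hypothetical; this is not the Analyzer class Anna describes, only an illustration of the composition.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;

// Plain-Java sketch of chaining two analysis stages; not Lucene's Analyzer API.
final class KeywordLowercasePipeline {
    // Stage 1: keyword-style tokenization, the whole input becomes a single token.
    static final Function<String, List<String>> KEYWORD = Collections::singletonList;

    // Stage 2: lowercase every token produced by the previous stage.
    static final Function<List<String>, List<String>> LOWERCASE =
            tokens -> tokens.stream()
                    .map(t -> t.toLowerCase(Locale.ROOT))
                    .collect(Collectors.toList());

    // Runs the input through both stages in order.
    static List<String> analyze(String input) {
        return KEYWORD.andThen(LOWERCASE).apply(input);
    }
}
```

With this sketch, `KeywordLowercasePipeline.analyze("Foo Bar")` produces a single token, `foo bar`, so matching becomes case insensitive while the value is still treated as one keyword.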

Hit search-lucene.com a little harder

2011-10-18 Thread Otis Gospodnetic
Hello folks, Do you ever use http://search-lucene.com (SL) or http://search-hadoop.com (SH)? If you do, I'd like to ask you for a small favour: We are at Lucene Eurocon in Barcelona and we are about to show the Search Analytics [1] and Performance Monitoring [2] tools/services we've built and

Re: OutOfMemoryError

2011-10-18 Thread Otis Gospodnetic
Bok Tamara, You didn't say what -Xmx value you are using. Try a little higher value. Note that loading field values (and it looks like this one may be big because it is compressed) from a lot of hits is not recommended. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene

Re: OutOfMemoryError

2011-10-18 Thread Mead Lai
Tamara, You may use StringBuffer instead of String docText = hits.doc(j).getField(DOCUMENT).stringValue(); after that you may use StringBuffer.delete() to release memory. Another way is to use a 64-bit machine. Regards, Mead On Wed, Oct 19, 2011 at 5:14 AM, Otis Gospodnetic