Help with Custom Analyzer

2006-10-16 Thread Ryan O'Hara
I have a few questions regarding writing a custom analyzer. My situation is that I would like to use the StandardAnalyzer but with some data-specific rules. I was wondering if there was a way of telling the StandardAnalyzer to treat a string of text, that would normally be tokenized into m

Re: Help with Custom Analyzer

2006-10-16 Thread Ryan O'Hara
ging the String seems like the simplest approach. If you want to wrap that in StringReader, you can, but you can also just work with Strings. Otis - Original Message From: Ryan O'Hara <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, October 16, 2006 4:28:3

Memory Allocation

2007-01-05 Thread Ryan O'Hara
I just started using SpellChecker and I have encountered a java heap space exception, which may or may not be related to using SpellChecker. I was wondering if there is a way to allocate the amount of memory Lucene uses during a search? Thanks, Ryan --
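
Lucene runs inside the JVM, so the memory available to a search is bounded by the JVM heap, which is set at launch with the standard -Xmx option rather than through a Lucene setting. As a minimal sketch (the class name HeapCheck and its method are hypothetical, not part of Lucene), the following reports the heap limit the running JVM was given, which is one way to confirm that a larger -Xmx value took effect:

```java
public class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
    }
}
```

Launching it with, for example, `java -Xmx512m HeapCheck` should report a figure in the neighborhood of 512; the exact number depends on the JVM.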

SpellChecker::suggestSimilar() Question

2007-01-25 Thread Ryan O'Hara
It seems that the suggestions returned by SpellChecker::suggestSimilar (queryText, num_sug, reader, field, bool) are randomly chosen, then sorted. By altering num_sug (10, 5, 3,2,1), I received the following suggestions for "gnetics": suggestion0: genetics suggestion1: ginetics suggestion2:

SpellChecker and Lucene 2.1

2007-03-14 Thread Ryan O'Hara
Is there a SpellChecker.jar compatible with Lucene 2.1? After updating to Lucene 2.1, I seem to have lost the ability to create a spell index using spellchecker-2.0-rc1-dev.jar. Any help would be greatly appreciated. Thanks, Ryan --

Re: SpellChecker and Lucene 2.1

2007-03-15 Thread Ryan O'Hara
Trace(); } } } When I close indexReaderOriginal after the Dictionary is closed, I get the same exact exception. Thanks in advance, Ryan On Mar 15, 2007, at 1:34 AM, karl wettin wrote: 14 mar 2007 kl. 21.47 skrev Ryan O'Hara: Is there a SpellChecker.jar compa

Re: SpellChecker and Lucene 2.1

2007-03-15 Thread Ryan O'Hara
Thanks a ton, Hoss. I just ran ant in the contrib/spellchecker directory and it produced a jar file in the LUCENE_HOME/build/ directory. Replacing the old jar file with the new jar file fixed my errors, as I suspected. Thanks again. -Ryan On Mar 15, 2007, at 1:38 PM, Chris Hostetter

Not able to search on UN_TOKENIZED fields

2007-04-05 Thread Ryan O'Hara
Hey, I was just wondering if you are supposed to be able to search on UN_TOKENIZED fields? It seems like you can from the docs, but I have been unsuccessful. I want to do exact string matching on a certain field without analyzer interference. Thanks, Ryan --

Re: Not able to search on UN_TOKENIZED fields

2007-04-05 Thread Ryan O'Hara
. Also, you haven't provided us a clue what the actual query is. I'd use Query.toString(). I suspect that the query is getting tokenized if you're using one of the normal Analyzers... Erick On 4/5/07, Ryan O'Hara <[EMAIL PROTECTED]> wrote: Hey, I was just wondering if you ar

Re: StopAnalyzer- Stop List Words

2007-04-10 Thread Ryan O'Hara
You can find the list in StopAnalyzer.java: public static final String[] ENGLISH_STOP_WORDS = { "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the", "their", "then", "there", "these",
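
To illustrate how a stop list like this is applied, here is a minimal plain-Java sketch that does not use Lucene. It includes only the words visible in the excerpt above (the original array continues past the truncation), it splits on whitespace rather than using a real tokenizer, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {
    // Only the entries visible in the truncated excerpt; the real array is longer.
    static final Set<String> VISIBLE_STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
        "in", "into", "is", "it", "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these"));

    // Lowercases the text, splits on whitespace, and drops any stop word.
    static List<String> removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !VISIBLE_STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints [index, stored, disk]
        System.out.println(removeStopWords("The index is stored on the disk"));
    }
}
```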

Fastest Method for Searching (need all results)

2006-07-21 Thread Ryan O'Hara
My index contains approximately 5 million documents. During a search, I need to grab the value of a field for every document in the result set. I am currently using a HitCollector to search. Below is my code: searcher.search(query, new HitCollector(){ public void

Re: Fastest Method for Searching (need all results)

2006-07-21 Thread Ryan O'Hara
Perhaps I am speaking too quickly, but I would try not grabbing the value of the field for every document in the result set. Will someone see that value or use it for a couple million hits? Could be, I suppose... but if not, then axe it. Grab the first few thousand (or MUCH less) and if th

Re: Fastest Method for Searching (need all results)

2006-07-21 Thread Ryan O'Hara
I haven't had the chance to use this new feature yet, but have you tried with selective field loading, so that you can load only that 1 field from your index and not all of them? I have not tried selective field loading, but it sounds like a good idea. What class is it in? Any more inform

Can Field types affect search speed?

2006-07-26 Thread Ryan O'Hara
Currently, I have field "DOC" which is indexed, but not stored and not compressed. This is the field that users query. I also have a field "SYM" which is indexed and stored, but not compressed. For every document returned in a query, I need its symbol. Can field types (indexed vs. not i

Re: Can Field types affect search speed?

2006-07-26 Thread Ryan O'Hara
I'm not using a Hits object. I'm using a HitCollector. I was just curious about whether Field settings could affect search performance. Any ideas? Ryan

Re: Fastest Method for Searching (need all results)

2006-08-02 Thread Ryan O'Hara
eks dev, The best way of looping through all results that I have come across is using a HitCollector and grabbing the field values via FieldCache. This is under two conditions: 1) The FieldCache arrays are initialized only once, since creating these arrays creates serious overhead,

Re: About the use of HitCollector

2006-08-08 Thread Ryan O'Hara
Hey Andy, If you have enough RAM, try using FieldCache: String[] fieldYouWant = FieldCache.DEFAULT.getStrings (searcher.getIndexReader(), "fieldYouWant"); searcher.search(query, new HitCollector(){ public void collect(int doc, float score){ doWhatYouWant(fieldYouWant[do
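
The pattern in this message (load the field values into an array once, then index that array by document number inside the collector callback) can be sketched without Lucene. Everything below is a hypothetical stand-in: DocCollector mimics the shape of the collect(int doc, float score) callback, search() simply walks a fixed list of matching document numbers, and the String[] plays the role of the array returned by FieldCache. It is an illustration of the lookup pattern, not the code from the original post.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldCacheSketch {
    /** Hypothetical stand-in for a collector callback; not a Lucene type. */
    interface DocCollector {
        void collect(int doc, float score);
    }

    /** Hypothetical stand-in for a search: reports each matching doc number. */
    static void search(int[] matchingDocs, DocCollector collector) {
        for (int doc : matchingDocs) {
            collector.collect(doc, 1.0f);
        }
    }

    /** Looks up each hit's field value by doc number in the preloaded array. */
    static List<String> collectValues(String[] fieldByDoc, int[] matchingDocs) {
        List<String> values = new ArrayList<>();
        search(matchingDocs, (doc, score) -> values.add(fieldByDoc[doc]));
        return values;
    }

    public static void main(String[] args) {
        // Built once and reused across searches, as the thread recommends.
        String[] symbols = {"AAA", "BBB", "CCC", "DDD"};
        // Prints [AAA, CCC, DDD]
        System.out.println(collectValues(symbols, new int[] {0, 2, 3}));
    }
}
```

The array lookup inside the callback is a constant-time operation per hit, which is why the approach suits result sets that must be walked in full, at the cost of holding one value per document in memory.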