Re: problem with reading an index

2013-05-09 Thread Igal @ getRailo.org
@Uwe -- I thought that when it comes to Lucene -- you ARE MacGyver ;) On 5/9/2013 9:48 AM, Uwe Schindler wrote: Hi Liz, See http://wiki.apache.org/lucene-java/JavaBugs If you run Lucene under such old Java versions, we cannot give any support. The Hotspot VMs in older versions are so buggy that

Re: Analyzer in QueryParser behaves differently from IndexWriter

2013-01-13 Thread Igal @ getRailo.org
ery parser for this situation? Igal On 1/13/2013 5:42 AM, Erik Hatcher wrote: The analyzer through QueryParser is invoked for each "clause" and thus in your example it's invoked 4 times and thus each invocation only sees one word/term. Erik On Jan 13, 2013, at 2:13, "
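A rough illustration of Erik's point, in plain Java rather than Lucene: the toy analyze() below is a hypothetical stand-in for a shingle-producing Analyzer, and the class and method names are made up for this sketch. It only shows why an analyzer that sees one word at a time can never produce multi-word shingles.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: not Lucene code. analyze() imitates an analyzer that
// emits every word plus two-word shingles built from adjacent words.
class PerClauseAnalysis {

    static List<String> analyze(String text) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            out.add(words[i]);
            if (i + 1 < words.length) {
                out.add(words[i] + " " + words[i + 1]);
            }
        }
        return out;
    }

    // Index time: the analyzer is handed the whole field value at once.
    static List<String> analyzeWholeText(String text) {
        return analyze(text);
    }

    // Query time (as described above): each whitespace-separated clause
    // is analyzed on its own, so no invocation ever sees two words.
    static List<String> analyzePerClause(String query) {
        List<String> out = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            out.addAll(analyze(clause));
        }
        return out;
    }
}
```

With the input "a b c d", analyzeWholeText returns the four words interleaved with the shingles "a b", "b c" and "c d", while analyzePerClause returns only the four single words.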

Analyzer in QueryParser behaves differently from IndexWriter

2013-01-12 Thread Igal @ getRailo.org
hi, I've created an Analyzer that performs a few filtering tasks, including creating Shingles and term Replacements among other things. I use that Analyzer with IndexWriter and it works as expected. but when I use that same Analyzer with QueryParser (org.apache.lucene.queryparser.classic.Qu

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Igal @ getRailo.org
say you index information about a book with the title: "Lucene in Action" with an ID and other information. searching for "Lucene" will find the book and will give you the book's ID. now if you used Store.YES -- then Lucene can also give you the full title, i.e. "Lucene in Action", but if you
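The difference can be sketched with a toy model in plain Java. This is not how Lucene stores data internally; the class, its two maps and the method names are assumptions made up for illustration. One map plays the role of the searchable index, the other plays the role of stored field values.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: "postings" makes a title searchable, "stored" keeps the
// original text so it can be returned with a hit.
class StoredVsIndexed {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Map<Integer, String> stored = new HashMap<>();

    // store == true mimics Store.YES, store == false mimics Store.NO.
    void add(int docId, String title, boolean store) {
        for (String term : title.toLowerCase().split("\\s+")) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
        if (store) {
            stored.put(docId, title);
        }
    }

    // Searching works the same way whether or not the value was stored.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    // Returns null when the title was indexed but not stored.
    String storedTitle(int docId) {
        return stored.get(docId);
    }
}
```

Adding "Lucene in Action" once with store set to true and once with it set to false makes both documents findable by "Lucene", but only the first can hand back its full title.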

Re: getting the token position

2013-01-10 Thread Igal @ getRailo.org
ribute. Also consider the possibility of using ShingleFilter with position increment > 1 and then filtering tokens containing "_" (underscore). This will be easier, I guess. On Jan 11, 2013, at 7:14 AM, Igal @ getRailo.org wrote: hi all, how can I get the Token's Position f

Re: Field.Store.YES vs Field.Store.NO

2013-01-10 Thread Igal @ getRailo.org
I'm no expert but my understanding is that it is Searchable, but you can Not retrieve the information, if for example you want to show excerpts etc. the index size will be smaller, of course. Igal On 1/10/2013 3:16 PM, saisantoshi wrote: I am new to lucene and am trying to understand what

getting the token position

2013-01-10 Thread Igal @ getRailo.org
hi all, how can I get the Token's Position from the TokenStream / Tokenizer / Analyzer? I know that there's a PositionIncrementAttribute and a PositionLengthAttribute, but is there an easy way to get the token position or do I need to implement my own attribute by adding one of t
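One common way to answer this, sketched in plain Java with no Lucene dependency: a token's absolute position is the running sum of the position increments seen so far. The sketch assumes counting starts at -1, so that a first token with increment 1 lands on position 0; the class and method names are hypothetical. In a real TokenStream the increments would be read from the position-increment attribute on each incrementToken() call.

```java
// Sketch only: derives absolute token positions from per-token position
// increments by keeping a running sum.
class TokenPositions {

    static int[] positions(int[] increments) {
        int[] result = new int[increments.length];
        int current = -1; // assumed start, so increment 1 yields position 0
        for (int i = 0; i < increments.length; i++) {
            current += increments[i];
            result[i] = current;
        }
        return result;
    }
}
```

For increments 1, 1, 0, 2 this yields positions 0, 1, 1, 3: an increment of 0 stacks a token on the previous position (as a synonym would), and an increment of 2 leaves a one-position gap (as a removed stop word would).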

Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
( reader, config ) ); return tsc; } I'll need to test it to know for sure though. thanks, Igal On 1/9/2013 6:54 PM, Igal @ getRailo.org wrote: hi Hoss -- thank you for your time. it looks like you're right (and it makes sense if the reader is advanced in two places a

Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
hi Hoss -- thank you for your time. it looks like you're right (and it makes sense if the reader is advanced in two places at the same time that it will cause a problem). I'll try to figure out how to create an Analyzer out of the Tokenizer. that's what I was trying to do there and obviously

Re: NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
thanks for your reply. please see attached. I tried to maintain the structure of the code that I need to use in the library I'm building. I think it should work for you as long as you remove the package declaration at the top. when I run the attached file I get the following output: debug:

NPE when adding a Document to an IndexWriter

2013-01-09 Thread Igal @ getRailo.org
I keep getting an NPE when trying to add a Doc to an IndexWriter. I've minimized my code to very basic code. what am I doing wrong? pseudo-code: Document doc = new Document(); TextField ft; ft = new TextField( "desc1", "word1", Field.Store.YES ); doc.add( ft ); ft = new TextField( "desc2", "

Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Igal @ getRailo.org
as mentioned before -- I'm not an expert on Lucene (far from it) -- but it seems to me like each migration version will take almost equal amount of work so if I were you I'd rethink this plan and consider migration to 4.0 Igal On 1/9/2013 1:08 PM, saisantoshi wrote: Is there any migration g

Re: Upgrade Lucene to latest version (4.0) from 2.4.0

2013-01-09 Thread Igal @ getRailo.org
the API has changed much over time so I suspect that it will take more than replacing the jars. On 1/9/2013 11:04 AM, saisantoshi wrote: We have an existing application which uses Lucene 2.4.0 version. We are thinking of upgrading it to the latest version (4.0). I am not sure the process involved

Re: Cannot instantiate SPI class

2013-01-09 Thread Igal @ getRailo.org
hi everybody, I figured it out. the problem was that I was using a "custom" jar to deploy this along with other libs that I use in my application. so at the end of my build.xml I create a jar file with all the required libs. the problem was that I was adding lucene-core.jar with a filter of

Re: Cannot instantiate SPI class

2013-01-09 Thread Igal @ getRailo.org
ler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Igal @ getRailo.org [mailto:i...@getrailo.org] Sent: Wednesday, January 09, 2013 4:53 AM To: java-user@lucene.apache.org Subject: Cannot instantiate SPI class I'm t

Re: Cannot instantiate SPI class

2013-01-08 Thread Igal @ getRailo.org
nope. that didn't work either... On 1/8/2013 10:02 PM, Igal @ getRailo.org wrote: hi Steve, thanks for your reply. at first I also thought that, so I added lucene-codecs-4.0.0.jar which caused another error, and prompted me to add commons-codec-1.7.jar as well. this error is af

Re: Cannot instantiate SPI class

2013-01-08 Thread Igal @ getRailo.org
er to load these classes. (does it?) as an aside -- Lucene 3.6 was running properly in that same environment before. Igal On 1/8/2013 9:48 PM, Steve Rowe wrote: Hi Igal, Sounds like you don't have lucene-codecs-4.0.0.jar in Railo's classpath. Steve On Jan 8, 2013, at 10:53 PM

Cannot instantiate SPI class

2013-01-08 Thread Igal @ getRailo.org
I'm trying to access Lucene4 from Railo (an open-source application server) when I try to create an IndexWriterConfig I get the error: Cannot instantiate SPI class: org.apache.lucene.codecs.appending.AppendingCodec any ideas? TIA stacktrace below: Cannot instantiate SPI class: org.apache.

Re: German 'ue' -> 'u' conversion

2012-11-19 Thread Igal @ getRailo.org
if your needs are so specific -- you can always build a NormalizeCharMap and use MappingCharFilter Igal On 11/19/2012 2:11 AM, Dyga, Adam wrote: I did, but none of them can do it (at least in default configuration). Regards, AD -Original Message- From: Igal @ getRailo.org

Re: German 'ue' -> 'u' conversion

2012-11-19 Thread Igal @ getRailo.org
look for filters that use the ICU4J library On 11/19/2012 2:08 AM, Lutz Fechner wrote: Hi, we use a modified ISOLatin1AccentFilter bit to replace German accents by ae, oe, ue and so on for that purpose. In the code you will see a switch for the characters. You need to change it from case

Re: Which stemmer?

2012-11-16 Thread Igal @ getRailo.org
but if "dogs" are feet (and I guess I fall into the not-perfect group here)... and "feet" is the plural form of "foot", then shouldn't "dogs" be stemmed to "dog" as a base, singular form? On 11/16/2012 2:32 PM, Tom Burton-West wrote: Hi Mike, Honestly I've never heard of anyone using "dog

Re: Lucene API

2012-11-05 Thread Igal @ getRailo.org
lanatory? Further, I believe Term instances are meant to be immutable hence no direct linkage between the two. I could be wrong though. On Mon, Nov 5, 2012 at 10:33 AM, Igal @ getRailo.org wrote: I don't mean to sound critical, but is there a reason that the API is not simpler? for example, i

Lucene API

2012-11-05 Thread Igal @ getRailo.org
I don't mean to sound critical, but is there a reason that the API is not simpler? for example, if I want to read/modify a CharTermAttribute's value, I need to use toString() to get the value, which is very unintuitive, and either copyBuffer() or setEmpty() and append(). is there a reason no

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org
Nov 3, 2012 at 9:22 PM, Igal Sapir wrote: You're right. I'm not sure what I was thinking. Thanks for all your help, Igal On Nov 3, 2012 5:44 PM, "Robert Muir" wrote: On Sat, Nov 3, 2012 at 8:32 PM, Igal @ getRailo.org wrote: hi Robert, thank you for your replies. I c

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org
";", "; " ); NormalizeCharMap ncm = builder.build(); return ncm; } } On 11/3/2012 5:13 PM, Robert Muir wrote: On Sat, Nov 3, 2012 at 7:47 PM, Igal @ getRailo.org wrote: I considered it, and it's definitely an option. but I re

Re: using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org
ets to index at this time. thanks for your answer, Igal On 11/3/2012 4:42 PM, Robert Muir wrote: On Sat, Nov 3, 2012 at 7:35 PM, Igal @ getRailo.org wrote: hi, I want to make sure that every comma (,) and semi-colon (;) is followed by a space prior to tokenizing. the idea is to then use a

using CharFilter to inject a space

2012-11-03 Thread Igal @ getRailo.org
hi, I want to make sure that every comma (,) and semi-colon (;) is followed by a space prior to tokenizing. the idea is to then use a WhitespaceTokenizer which will keep commas but still split the phrase in a case like: "I bought red apples,green pears,and yellow oranges" I'm thinking
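The idea can be tried out in plain Java before wiring it into an analysis chain. This is a minimal sketch, not Lucene code: String.replace stands in for a character-level mapping of "," to ", " and ";" to "; ", and a split on whitespace stands in for WhitespaceTokenizer. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: imitates "map characters first, then tokenize on whitespace".
class SpaceAfterPunctuation {

    // Put a space after every comma and semicolon.
    static String injectSpaces(String text) {
        return text.replace(",", ", ").replace(";", "; ");
    }

    // Split on runs of whitespace; punctuation stays attached to the
    // word that precedes it.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

Running the example phrase through injectSpaces and then whitespaceTokens gives nine tokens, with "apples," and "pears," keeping their commas. Text that already has a space after the comma ends up with two spaces, which the whitespace split absorbs.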

Re: tokenizer's tokens

2012-11-01 Thread Igal @ getRailo.org
, Igal @ getRailo.org wrote: I'm trying to write a very simple method to show the different tokens that come out of a tokenizer. when I call WhitespaceTokenizer's (or LetterTokenizer's) incrementToken() method though I get an ArrayIndexOutOfBoundsException (see below) any ideas?

tokenizer's tokens

2012-11-01 Thread Igal @ getRailo.org
I'm trying to write a very simple method to show the different tokens that come out of a tokenizer. when I call WhitespaceTokenizer's (or LetterTokenizer's) incrementToken() method though I get an ArrayIndexOutOfBoundsException (see below) any ideas? p.s. if I use StandardTokenizer it works

Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
<http://markmail.org/message/ewza54azui6knqwf> On Nov 1, 2012, at 3:44 PM, Igal @ getRailo.org wrote: hi, I'm trying to migrate to Lucene 4. in Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter and overrode accept() to remove undesired shingles. in Lucene 4 org.apache.

Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
analysis components changed, too. Use your IDE to find it or ask Google... Uwe "Igal @ getRailo.org" schrieb: hi, I'm trying to migrate to Lucene 4. in Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter and overrode accept() to remove undesired shingle

Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
hi, I'm trying to migrate to Lucene 4. in Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter and overrode accept() to remove undesired shingles. in Lucene 4 org.apache.lucene.analysis.FilteringTokenFilter does not exist? I'm trying to achieve two things: 1) remove shingl