Re: Difference between regular Highlighter and Fast Vector Highlighter ?

2011-04-01 Thread shrinath.m
Got it :) Thanks for the link. [closed] On Sat, Apr 2, 2011 at 6:14 AM, Koji Sekiguchi [via Lucene] <ml-node+2765616-1923995541-376...@n3.nabble.com> wrote: > (11/04/01 21:32), shrinath.m wrote: > > I was wondering what's the difference between Lucene's two implementations of highlighter

Re: Difference between regular Highlighter and Fast Vector Highlighter ?

2011-04-01 Thread Koji Sekiguchi
(11/04/01 21:32), shrinath.m wrote: > I was wondering what's the difference between Lucene's two implementations of highlighters... > I saw the javadoc of FVH, but it only says "another implementation of Lucene Highlighter" ... The Description section in the javadoc shows the features of FVH: https://

Re: Using IndexWriterConfig repeatedly in 3.1

2011-04-01 Thread Trejkaz
On Sat, Apr 2, 2011 at 7:07 AM, Christopher Condit wrote: > I see in the JavaDoc for IndexWriterConfig that: > "Note that IndexWriter makes a private clone; if you need to > subsequently change settings use IndexWriter.getConfig()." > > However when I attempt to use the same IndexWriterConfig to c

Re: Using IndexWriterConfig repeatedly in 3.1

2011-04-01 Thread Michael McCandless
The issue is that a MergePolicy instance cannot be re-used across multiple writers. So, you could take your first IWC, change out the MergePolicy, then re-use it? Other things also cannot be reused, eg a ConcurrentMergeScheduler instance. Mike http://blog.mikemccandless.com On Fri, Apr 1, 2011

Using IndexWriterConfig repeatedly in 3.1

2011-04-01 Thread Christopher Condit
I see in the JavaDoc for IndexWriterConfig that: "Note that IndexWriter makes a private clone; if you need to subsequently change settings use IndexWriter.getConfig()." However when I attempt to use the same IndexWriterConfig to create multiple IndexWriters the following exception is thrown: org.

Re: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed

2011-04-01 Thread Devon H. O'Dell
2011/4/1 Yogesh Dabhi : > Hi > > Five users concurrently access the same Lucene directory to search documents. > > At that time I got the exception below: > > org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed > > Is there a way to handle such an error? Use a ReentrantReadWriteLock aro
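
The reply above is cut off in this listing, so the following is only one possible reading of the locking suggestion, not the poster's actual code. It is a minimal sketch using the JDK's `java.util.concurrent.locks.ReentrantReadWriteLock`: searches share the read lock, and swapping or closing the reader takes the write lock, so no search can observe a reader that has already been closed. `FakeReader` and `GuardedReader` are hypothetical stand-ins introduced for illustration; they are not Lucene classes.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class GuardedReader {
    /** Hypothetical stand-in for an IndexReader: any use after close() fails. */
    static final class FakeReader {
        private boolean closed;

        String search(String query) {
            if (closed) {
                throw new IllegalStateException("this reader is closed");
            }
            return "hits for " + query;
        }

        void close() {
            closed = true;
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private FakeReader current = new FakeReader(); // guarded by lock

    /** Many threads may search at once; they only share the read lock. */
    String search(String query) {
        lock.readLock().lock();
        try {
            return current.search(query);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Swapping the reader waits until all in-flight searches have finished. */
    void reopen() {
        lock.writeLock().lock();
        try {
            FakeReader old = current;
            current = new FakeReader();
            old.close(); // safe: no search holds the read lock right now
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedReader guarded = new GuardedReader();
        System.out.println(guarded.search("lucene"));
        guarded.reopen();
        System.out.println(guarded.search("lucene"));
    }
}
```

The trade-off in this sketch is that a reopen blocks until running searches complete, and new searches wait while the swap is in progress.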

org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed

2011-04-01 Thread Yogesh Dabhi
Hi, five users concurrently access the same Lucene directory to search documents. At that time I got the exception below: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed. Is there a way to handle such an error? Thanks & Regards, Yogesh

Re: How to do Multiple-Cluster Query?

2011-04-01 Thread Erick Erickson
You might consider a multiValued field and a positionIncrementGap longer than the longest tuple. At that point, you can search for phrase queries where the slop is less than the positionIncrementGap. I'm a bit rushed, so if you need more details we can talk later Best Erick 2011/4/1 袁武 [GMa
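
The reasoning behind the gap suggestion above can be sketched with plain arithmetic. This is a simplified model, not Lucene code: it assumes each token of a multi-valued field takes the next position, that `gap` extra positions are skipped between values, and that an in-order two-term match needs at most `slop` positions between the terms. The class and method names are hypothetical, and Lucene's real sloppy-phrase matching is more general than this.

```java
import java.util.ArrayList;
import java.util.List;

final class PositionGapSketch {
    /** Positions of `term` when `gap` extra positions separate consecutive values. */
    static List<Integer> positionsOf(String term, String[][] values, int gap) {
        List<Integer> positions = new ArrayList<>();
        int next = 0;
        for (String[] value : values) {
            for (String token : value) {
                if (token.equals(term)) {
                    positions.add(next);
                }
                next++;
            }
            next += gap; // leave a hole before the next value starts
        }
        return positions;
    }

    /** True if `second` follows `first` with at most `slop` positions in between. */
    static boolean nearInOrder(String first, String second,
                               String[][] values, int gap, int slop) {
        for (int p1 : positionsOf(first, values, gap)) {
            for (int p2 : positionsOf(second, values, gap)) {
                if (p2 > p1 && p2 - p1 - 1 <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[][] tuples = { {"red", "apple"}, {"green", "pear"} };
        System.out.println(nearInOrder("red", "apple", tuples, 100, 99));   // same value
        System.out.println(nearInOrder("apple", "green", tuples, 100, 99)); // across values
    }
}
```

In this model, any slop smaller than the gap can only match terms that sit inside the same value, which is the property the suggestion relies on.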

Re: Undo hyphenation when indexing

2011-04-01 Thread Yonik Seeley
Solr has a hyphenated word filter you could copy. http://lucene.apache.org/solr/api/org/apache/solr/analysis/HyphenatedWordsFilterFactory.html On trunk, this has been folded into the analysis module. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco

RE: About the lucene sort

2011-04-01 Thread Carl Austin
Don't prefix queries get rewritten wrapped in ConstantScoreQuery, meaning all will get the same score and you get them in read order? Checking the API, PrefixQuery uses MultiTermQuery.CONSTANT_SCORE_AUTO_REWRITE_DEFAULT, which can be changed with setRewriteMethod. -----Original Message----- From:

Re: Re: A likely bug of TermsPosition.nextPosition

2011-04-01 Thread Michael McCandless
Hmm so it's not index corruption. Curious. Which Lucene version are you using? Looks like it's 2.9, but not 2.9.4? Can you try 2.9.4 and see if you still hit the problem? Can you post a small test case showing the problem, on your index? Mike http://blog.mikemccandless.com 2011/4/1 袁武 [GMai

Re: Best practice for stemming and exact matching

2011-04-01 Thread Christopher Condit
>> Ideally I'd like to have the parser use the >> custom analyzer for everything unless it's going to parse a clause into >> a PhraseQuery or a MultiPhraseQuery, in which case it uses the >> SimpleAnalyzer and looks in the _exact field - but I can't figure out >> the best way to accomplish this. >

Undo hyphenation when indexing

2011-04-01 Thread Wulf Berschin
Hi, for indexing PDF files we have to undo word hyphenation. The basic idea is simply to remove the hyphen when a newline and a lowercase letter follow. Of course this approach isn't 100% foolproof, but checking against a dictionary wouldn't be either... Since we face this problem too when hig
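
The rule described in the message above (drop the hyphen when a line break and a lowercase letter follow) can be sketched with a regular expression. This is an illustrative sketch, not the poster's code: the `Dehyphenator` class is hypothetical, and it additionally assumes that optional spaces or tabs around the line break should be swallowed and that the hyphen must directly follow a letter.

```java
import java.util.regex.Pattern;

final class Dehyphenator {
    // A hyphen right after a letter, then a line break (with optional blanks
    // around it), followed by a lowercase letter that stays in the text.
    private static final Pattern BROKEN_WORD =
            Pattern.compile("(?<=\\p{L})-[ \\t]*\\r?\\n[ \\t]*(?=\\p{Ll})");

    /** Joins words that were hyphenated across a line break. */
    static String undoHyphenation(String text) {
        return BROKEN_WORD.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(undoHyphenation("undo word hyphen-\nation when indexing"));
        System.out.println(undoHyphenation("Lucene-\nSolr")); // uppercase follows: untouched
    }
}
```

As the message notes, the rule is not foolproof: a genuine compound such as "well-known" that happens to break at its hyphen is joined into "wellknown".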

Re: About the lucene sort

2011-04-01 Thread Ian Lea
Probably a bug in your code. If you post again with, as a minimum, the version of lucene that you are using and your search/sort code, you might get a better answer. Best of all would be a complete self-contained standalone program or test case that demonstrates the problem. -- Ian. On Fri, Ap

About the lucene sort

2011-04-01 Thread Cescky
Hi, I know the default sort is by relevance. However, when I search with a prefix (interface*), it does not work: the documents are only sorted in the order in which the files were read. What is the problem? Thx.

Difference between regular Highlighter and Fast Vector Highlighter ?

2011-04-01 Thread shrinath.m
I was wondering what's the difference between Lucene's two implementations of highlighters... I saw the javadoc of FVH, but it only says "another implementation of Lucene Highlighter" ... Can someone throw some more light on this? -- View this message in context: http://lucene.472066.n3.nabbl

Re: indexing data without writing to disk ?

2011-04-01 Thread jm
or maybe MemoryIndex (in contrib) is more suited to what he wants On Fri, Apr 1, 2011 at 1:10 PM, Ian Lea wrote: > RAMDirectory. The clue is in the name ... > > > -- > Ian. > > > On Fri, Apr 1, 2011 at 11:08 AM, Patrick Diviacco > wrote: > > Is there a way to index data into memory without wr

Re: SpanNearQuery with repeated term?

2011-04-01 Thread Ian Lea
> Sorry I am using 2.9.4, which is the same as 3.0.3? Not by my definition of "the same". > The code below demonstrates the problem. Not really. I'd be more convinced if it compiled. And how can we be sure the docs are exactly as you say? That you are actually executing the span queries you h

RE: SpanNearQuery with repeated term?

2011-04-01 Thread Gregory Tarr
Sorry, I have got this working now. It was a silly mistake. -----Original Message----- From: Gregory Tarr [mailto:gregory.t...@detica.com] Sent: 01 April 2011 12:13 To: java-user@lucene.apache.org Subject: RE: SpanNearQuery with repeated term? Sorry I am using 2.9.4, which is the same as 3.0.3?

RE: SpanNearQuery with repeated term?

2011-04-01 Thread Gregory Tarr
Sorry I am using 2.9.4, which is the same as 3.0.3? The code below demonstrates the problem. Thanks Greg -----Original Message----- From: Ian Lea [mailto:ian@gmail.com] Sent: 01 April 2011 12:10 To: java-user@lucene.apache.org Subject: Re: SpanNearQuery with repeated term? I can't reprod

Re: indexing data without writing to disk ?

2011-04-01 Thread Ian Lea
RAMDirectory. The clue is in the name ... -- Ian. On Fri, Apr 1, 2011 at 11:08 AM, Patrick Diviacco wrote: > Is there a way to index data into memory without writing to disk in Lucene ? > > This is my current code storing it on disk > > writer = new IndexWriter(FSDirectory.open(index_dir), ne

Re: SpanNearQuery with repeated term?

2011-04-01 Thread Ian Lea
I can't reproduce this using lucene-core-3.0.3.jar. You don't say what version you are using. Why don't you post the smallest possible complete standalone program or test case that demonstrates the problem? And tell us what version of Lucene you are working with. Always. -- Ian. On Fri, Apr

SpanNearQuery with repeated term?

2011-04-01 Thread Gregory Tarr
I am having some issues with SpanNearQuery:
SpanQuery[] clauses = new SpanTermQuery[2];
clauses[0] = new SpanTermQuery(new Term("text", ""));
clauses[1] = new SpanTermQuery(new Term("text", ""));
SpanNearQuery q = new SpanNearQuery(clauses, 0, true); // returns 1 document with " " in it
SpanQuery[] claus

Re: Re: A likely bug of TermsPosition.nextPosition

2011-04-01 Thread 袁武 [GMail]
Hi, Dear Mike: below is the CheckIndex report. The OS is Fedora Linux. [oracle@server bin]$ java -classpath ./ org.apache.lucene.index.CheckIndex /data/Index/URL/Generic/ -fix NOTE: testing will be more thorough if you run java with '-ea:org.apache.lucene...', so assertions are enabled Openi

indexing data without writing to disk ?

2011-04-01 Thread Patrick Diviacco
Is there a way to index data into memory without writing to disk in Lucene ? This is my current code storing it on disk writer = new IndexWriter(FSDirectory.open(index_dir), new IndexWriterConfig(org.apache.lucene.util.Version.LUCENE_40, new WhitespaceAnalyzer(org.apache.lucene.util.Version.LUCEN

Re: A likely bug of TermsPosition.nextPosition

2011-04-01 Thread Michael McCandless
Hmm this could be from a corrupted index. What version of Lucene? What OS/filesystem? Can you run CheckIndex and post the output? Mike http://blog.mikemccandless.com 2011/3/31 袁武 [GMail] : > Hi, dear experts: > > When IndexReader.termsPositions is used to access specific terms, the call to >

RE: 3.1 upgrade problem

2011-04-01 Thread Uwe Schindler
Hi Wouter, See point 8 in the Backwards Compatibility CHANGES.txt. The reason is explained in several issues (not all listed there), problems are e.g. in the Unicode 4 changes, where a non-final WhitespaceTokenizer would need to do reflection-based backwards hacks (like in 2.9 when we changed to i

3.1 upgrade problem

2011-04-01 Thread Wouter Heijke
I'm doing the upgrade to Lucene 3.1.0. The upgrade failed on WhitespaceTokenizer being final in this version. I don't understand why anyone would make this tokenizer final; I was happily extending it for many Lucene versions! Wouter