Re: LeafCollector

2016-12-02 Thread Adrien Grand
Maybe you could use DiversifiedTopDocsCollector? https://lucene.apache.org/core/6_2_0/misc/org/apache/lucene/search/DiversifiedTopDocsCollector.html On Thu, Dec 1, 2016 at 11:08 PM, Michael McCandless wrote: > Lucene used to have a DuplicateFilter to do this, but we removed it > recently ... see

Re: Unable to retrieve OffsetTermVector for given term using Apache Lucene 6

2016-12-02 Thread Szymon Sutek
I made a mistake in the last part of the code. It should be: while((byteRef = iterator.next()) != null) { String term = byteRef.utf8ToString(); //Here I would like to retrieve all offset positions for the given term variable } 2016-12-02 10:08 GMT+01:00 Szymon Sutek : > Hello, I am trying to index

Unable to retrieve OffsetTermVector for given term using Apache Lucene 6

2016-12-02 Thread Szymon Sutek
Hello, I am trying to index a txt file and then retrieve its terms' offset positions. Unfortunately, I can get only one offset entry per term, not all of them (if the term occurred more than once while indexing). Here are the most important parts of the code: FieldType used while indexing. private Fie

Increase in ByteBufferImpl class heap size in longevity run

2016-12-02 Thread Mukul Ranjan
Hi, We ran a 96-hour longevity load test on our application, which uses Lucene 5.5.2 for text search. We observed a significant change in the heap size of org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl. The size of this class increased from 7 MB to 15 MB from day1 to

Unable to retrieve TermVectorOffsets using Lucene 6

2016-12-02 Thread Szymon Sutek
Hello, I am trying to index a txt file and then retrieve its terms' offset positions (if a term occurred more than once while indexing). I present the most important parts of the code: 1) StandardAnalyzer used. 2) FieldType used while indexing. FieldType fieldType = new FieldType(); fieldType.setTok

Re: commit frequency guideline?

2016-12-02 Thread Michael McCandless
On Wed, Nov 30, 2016 at 9:37 AM, Rob Audenaerde wrote: > Thanks for the quick reply! > >>What do you mean by "Lucene complain about too-many uncommitted docs"? > > --> good question, I was thoughtlessly echoing words from my colleague. I > asked him and he said that it was about taking very long t