Maybe you could use DiversifiedTopDocsCollector?
https://lucene.apache.org/core/6_2_0/misc/org/apache/lucene/search/DiversifiedTopDocsCollector.html
On Thu, Dec 1, 2016 at 23:08, Michael McCandless
wrote:
> Lucene used to have a DuplicateFilter to do this, but we removed it
> recently ... see
I made a mistake in the last part of the code. It should be:
while((byteRef = iterator.next()) != null) {
String term = byteRef.utf8ToString();
// Here I would like to retrieve all offset positions for the given term variable
}
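To make the goal concrete, a minimal Lucene-free sketch of the desired result follows: every term mapped to all of its (start, end) character offsets, not just one. The class name OffsetSketch, the method collectOffsets, and the simple letter-or-digit tokenization are assumptions made for illustration only; they are not Lucene APIs and do not reproduce StandardAnalyzer. In Lucene itself, the analogous step would presumably be to ask the TermsEnum for a PostingsEnum with the PostingsEnum.OFFSETS flag and, for each document, call nextPosition() freq() times, reading startOffset() and endOffset() after each call, provided offsets were stored at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Lucene-free illustration; OffsetSketch and collectOffsets are hypothetical names.
final class OffsetSketch {

    // Maps each lowercased term to every {startOffset, endOffset} pair at which it occurs.
    static Map<String, List<int[]>> collectOffsets(String text) {
        Map<String, List<int[]>> offsets = new TreeMap<>(); // sorted by term
        int i = 0;
        while (i < text.length()) {
            // Skip separator characters between tokens.
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            // Append rather than overwrite, so repeated terms keep all occurrences.
            offsets.computeIfAbsent(term, k -> new ArrayList<>()).add(new int[] {start, i});
        }
        return offsets;
    }

    public static void main(String[] args) {
        Map<String, List<int[]>> offsets = collectOffsets("to be or not to be");
        for (Map.Entry<String, List<int[]>> e : offsets.entrySet()) {
            StringBuilder sb = new StringBuilder(e.getKey()).append(':');
            for (int[] o : e.getValue()) {
                sb.append(" [").append(o[0]).append(',').append(o[1]).append(')');
            }
            System.out.println(sb);
        }
    }
}
```

The point of the sketch is the inner list: each occurrence is appended, so a term that appears twice yields two offset pairs.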
2016-12-02 10:08 GMT+01:00 Szymon Sutek :
> Hello, I am trying to index
Hello, I am trying to index a txt file and then retrieve its terms' offset
positions. Unfortunately I can get only one offset per term, not all of
them (if the term occurred more than once while indexing). Here are the
most important parts of the code:
FieldType used while indexing.
private Fie
Hi,
We ran a 96-hour longevity load test on our application, which uses Lucene
5.5.2 for text search. We observed a significant change in the heap size of
org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.
The size of this class increased from 7 MB to 15 MB from day 1 to
Hello, I am trying to index a txt file and then retrieve its terms' offset
positions (if a term occurred more than once while indexing). Here are the
most important parts of the code:
1) StandardAnalyzer used.
2) FieldType used while indexing.
FieldType fieldType = new FieldType();
fieldType.setTok
On Wed, Nov 30, 2016 at 9:37 AM, Rob Audenaerde
wrote:
> Thanks for the quick reply!
>
>>What do you mean by "Lucene complain about too-many uncommitted docs"?
>
> --> good question, I was thoughtlessly echoing words from my colleague. I
> asked him and he said that it was about taking very long t