Thanks, I missed that! Glad it's already resolved.
Markus
-Original message-
> From:Ishan Chattopadhyaya
> Sent: Thursday 21st January 2016 12:01
> To: java-user@lucene.apache.org
> Subject: Re: Jira issue for possibly transient resource issue, or a Lucene
LUCENE-6970
On Thu, Jan 21, 2016 at 4:07 PM, Markus Jelsma
wrote:
> Hi - we get the above issue as well sometimes. I've noticed Lucene-dev
> mails on this issue [1] but I couldn't find a corresponding Jira issue. Any
> pointer to that one?
>
> Many thanks,
> Markus
Thanks guys!
On Thu, Jan 21, 2016 at 3:40 AM, david.w.smi...@gmail.com <
david.w.smi...@gmail.com> wrote:
> Yup.
> Just to clarify for the O.P., after getting the SpatialStrategy instance,
> call createIndexableFields() which returns a list of Field instances, which
> you can then call
Hello,
I'm trying to improve the speed of an index when searching for long phrases. I
performed some tests with the benchmark module. With a simple analyser and
PhraseQueries, I get a throughput of 118 rec/sec. My test dataset is the
latest dump of Wikipedia. Here are the filters I use at indexing time
In my experience, shingles can hurt query performance because the term
dictionary grows quite a bit. There are far more unique bigrams than there
are words. While the lookup time doesn't grow linearly with the number of
terms, it still grows.
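
To make the term-dictionary growth concrete, here is a minimal sketch in plain Python. It is a toy model, not Lucene code: `tokenize` and `shingles` are hypothetical helpers standing in for a simple analyzer and a two-word shingle filter, and the three documents are made up for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a simple analyzer)."""
    return text.lower().split()


def shingles(tokens, size=2):
    """Return word n-grams of the given size, each joined with a space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


# Made-up sample documents, for illustration only.
docs = [
    "the quick brown fox",
    "the lazy brown dog",
    "the quick dog and the lazy fox",
]

unigram_terms = set()
bigram_terms = set()
for doc in docs:
    tokens = tokenize(doc)
    unigram_terms.update(tokens)           # what a plain word index stores
    bigram_terms.update(shingles(tokens))  # what a shingled field adds

print(len(unigram_terms), "unique words")
print(len(bigram_terms), "unique bigrams")
```

Even on three short sentences the bigram set is larger than the word set, and the gap tends to widen on real text because most word pairs occur only a handful of times.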
I haven't specifically compared performance numbers
Be sure to check and see if your app is compute or I/O bound during this
process - whether too little of your index is cached in system memory and
each query requires I/O, lots of it.
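
One rough way to tell the two cases apart is to compare CPU time with wall-clock time for the same piece of work. The sketch below does that in plain Python; `cpu_fraction`, `waiting` and `computing` are hypothetical names, and the sleep and the arithmetic loop only stand in for a query blocked on disk and a query busy scoring. For a JVM application the same comparison is usually made with OS tools such as top, vmstat or iostat.

```python
import time


def cpu_fraction(work):
    """Run work() and return CPU time divided by wall-clock time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def waiting():
    # Stand-in for a query that spends its time blocked on disk reads.
    time.sleep(0.2)


def computing():
    # Stand-in for a query that spends its time on CPU work.
    sum(i * i for i in range(2_000_000))


print("waiting:   %.2f" % cpu_fraction(waiting))
print("computing: %.2f" % cpu_fraction(computing))
```

A fraction near 1 suggests the work is compute-bound; a fraction near 0 means most of the elapsed time was spent waiting, which is what an index that does not fit in the OS cache tends to look like.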
-- Jack Krupansky
On Thu, Jan 21, 2016 at 1:52 PM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:
Shingles should make a huge difference in phrase query performance if
1) the phrase queries involve high frequency terms and 2) you have a
substantial number of documents in the index (so that
time-to-visit-postings dominates over time-to-lookup-terms).
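
The postings argument can be illustrated with a toy inverted index in plain Python. This is a sketch under simplifying assumptions, not how Lucene stores or traverses postings: `build_index` is a hypothetical helper, the documents are made up, and position checking for the phrase is left out.

```python
from collections import defaultdict


def build_index(docs, size):
    """Map each word n-gram of `size` to the sorted ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()
        for i in range(len(tokens) - size + 1):
            index[" ".join(tokens[i:i + size])].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Made-up documents in which "to" and "be" are frequent terms.
docs = [
    "to be or not to be",
    "be nice to the cat",
    "the cat wants to sleep",
    "it is good to be home",
]

words = build_index(docs, 1)
bigrams = build_index(docs, 2)

# Without shingles, the phrase "to be" has to walk the postings of both
# words (and then verify positions, which this toy omits).
postings_without = len(words["to"]) + len(words["be"])

# With shingles, the phrase is a single term with its own, shorter list.
postings_with = len(bigrams["to be"])

print("postings visited without shingles:", postings_without)
print("postings visited with shingles:   ", postings_with)
```

The more frequent the individual words and the larger the index, the bigger this gap becomes, which is the situation where the extra term lookups pay for themselves.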
118 rec/sec is already very fast for a long
Thank you all for your answers. Initially, I also thought that shingles
should make a huge difference. I will give the CommonGramsFilter a try.
In the meantime, this additional information may help you identify
a problem in my setup.
Basically, I indexed the whole wikipedia dump (> 8