Hi Michael,
The docs range could vary in extremes from a few tens to tens-of-thousands,
and in very heavy usage cases 100k and above… in a single segment.
Filtered HNSW, like you said, uses a single graph, which could be better if
designed as sub-graphs.
On Mon, 2 Jun 2025 at 5:42 PM, Michael wrote:
How many documents do you anticipate in a typical sub-range? If it's in the
hundreds or even low thousands you would be better off without HNSW.
Instead you can use a function score query based on the vector distance.
For larger numbers where HNSW becomes useful, you could try using filtered
HNSW.
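For the larger case, a rough sketch of a filtered kNN/HNSW search against a recent Lucene API follows; the "vec" field, the query vector and the "tenant" filter are invented for illustration, not taken from this thread.
// Filtered HNSW sketch (recent Lucene API); field names and values are placeholders.
// Imports assumed: org.apache.lucene.search.*, org.apache.lucene.index.Term
float[] queryVector = {0.12f, 0.34f, 0.56f, 0.78f};
Query filter = new TermQuery(new Term("tenant", "tenant-42"));  // restricts which docs the graph search may return
Query knn = new KnnFloatVectorQuery("vec", queryVector, 10, filter);
TopDocs topDocs = searcher.search(knn, 10);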
given OrdRange. A sub-graph, to be precise. The
generated segment will contain a lot of these sub-graphs but without any
neighbour links to each other at Level-0. Level-1 and above can have
cross-links, which should be fine.
Searches will be based on OrdRange and should stop once the sub-graph
Have you looked at edismax, pf2 and pf3?
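For reference, a rough SolrJ sketch of that suggestion, with a made-up field name and boosts: defType=edismax plus pf2/pf3 makes Solr build implicit bigram/trigram phrase queries from the user's terms and boost documents that match them.
// Hedged sketch only; "field", the boosts and the collection name are placeholders.
SolrQuery q = new SolrQuery("tree skirt spring dress");
q.set("defType", "edismax");
q.set("qf", "field");
q.set("pf2", "field^5");    // boost matches of adjacent word pairs as phrases
q.set("pf3", "field^10");   // boost matches of adjacent word triples as phrases
QueryResponse rsp = solrClient.query("collection1", q);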
On Fri, Sep 25, 2020, 15:07 Gregg Donovan wrote:
Hello!
I'm wondering what the state-of-the-art for matching exact sub phrases
within Lucene is. As a bonus, I'd love to attach a boost to each of the
subphrases matched (if possible).
For example:
doc 1: "field": "tree skirt spring skirt spring dress"
doc 2:
Hi,
> -Original Message-
> From: Sujit Pal [mailto:sujitatgt...@gmail.com] On Behalf Of SUJIT PAL
> Sent: Monday, April 15, 2013 9:43 PM
> To: java-user@lucene.apache.org
> Subject: Re: Statically store sub-collections for search (faceted search?)
>
> Hi Uwe,
>
> Thanks for the info, I was under the impression that it didn't... I got this
Am 15.04.2013 13:43, schrieb Uwe Schindler:
Hi,
> Passing NULL means all documents are allowed; if this were not the case,
> whole Lucene queries and filters would not work at all. So if you get 0 docs,
> you must have missed something else. If this is not the case, your filter may
> behave
Am 15.04.2013 11:27, schrieb Uwe Schindler:
Hi again,
>>> You are somehow "misusing" acceptDocs and DocIdSet here, so you have
>>> to take care, semantics are different:
>>> - For acceptDocs "null" means "all documents allowed" -> no deleted
>>> documents
>>> - For DocIdSet "null" means "no documents matched"
Hi,
> > AcceptDocs in Lucene are generally all non-deleted documents. For your
> > call to Filter.getDocIdSet you should therefore pass
> > AtomicReader.getLiveDocs() and not Bits.MatchAllBits.
>
> I see. As far as I understand the documentation, getLiveDocs() returns null if
> there are no deleted documents
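A minimal sketch of that acceptDocs handling against the Lucene 4.x API discussed in this thread; "filter" and "reader" are assumed to exist in the surrounding code.
for (AtomicReaderContext context : reader.leaves()) {
  // pass the per-segment live docs; null here simply means "no deletions"
  DocIdSet docIdSet = filter.getDocIdSet(context, context.reader().getLiveDocs());
  if (docIdSet == null) {
    continue;  // per the DocIdSet contract, null means the filter matched nothing
  }
  DocIdSetIterator it = docIdSet.iterator();
  if (it == null) {
    continue;  // an empty set may also report itself via a null iterator
  }
  for (int doc = it.nextDoc(); doc != DocIdSetIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
    // doc is a segment-local id that matched the filter and is not deleted
  }
}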
Am 15.04.2013 10:42, schrieb Uwe Schindler:
There might be 2 problems:
Not every DocIdSet supports bits(). If it returns null, then bits are not
supported. To make sure a bitset is available, use CachingWrapperFilter (which
internally uses a BitSet to cache).
It might also happen that Filter.getDocIdSet() returns null, which means that
no documents matched the filter.
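A short sketch of the CachingWrapperFilter suggestion (same Lucene 4.x era; "myFilter" and "context" are assumed to exist in the caller):
Filter cached = new CachingWrapperFilter(myFilter);
DocIdSet set = cached.getDocIdSet(context, context.reader().getLiveDocs());
Bits bits = (set == null) ? null : set.bits();  // per the note above, the cached set exposes bits()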
Am 15.04.2013 10:04, schrieb Uwe Schindler:
> The limit also applies for filters. If you have a list of terms ORed
> together, the fastest way is not to use a BooleanQuery at all, but instead a
> TermsFilter (which has no limits).
Hi Uwe,
thanks for the pointer, this looks promising! The only mi
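Roughly, the TermsFilter route looks like this (lucene-queries module, 4.x era); the "docId" field and the id list are placeholders for whatever defines the sub-collection.
List<Term> terms = new ArrayList<>();
for (String id : subCollectionIds) {            // subCollectionIds: the user-selected documents (assumed)
  terms.add(new Term("docId", id));
}
Filter subCollection = new TermsFilter(terms);  // no maxClauseCount limit, unlike a huge BooleanQuery
TopDocs hits = searcher.search(userQuery, subCollection, 100);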
Am 12.04.2013 20:08, schrieb SUJIT PAL:
> Hi Carsten,
>
> Why not use your idea of the BooleanQuery but wrap it in a Filter instead?
> Since you are not doing any scoring (only filtering), the max boolean clauses
> limit should not apply to a filter.
Hi Sujit,
thanks for your suggestion! I wasn
Dear list,
I would like to create a sub-set of the documents in an index that is to
be used for further searches. However, the criteria that lead to the
creation of that sub-set are not predefined so I think that faceted
search cannot be applied to this use case.
For instance:
A user searches for
Hi,
as far as I can see, boolean scorers always sum up the scores of their
sub-scorers. That works, but in my application it is required to
multiply the sub-scores.
Is there a simple/efficient way to do this (apart from modifying
lucene's source code)?
It seems to me that standard t
be to eliminate the queryNorm completely (you can
override it in your Similarity class) ... depending on your use case you
might not need it at all.
: So how can I get the correct sub-scores for *all* clauses of a
: BooleanQuery, regardless of whether they matched or not?
well, first off: if a
I have a BooleanQuery with several clauses. After running a search, in
addition to seeing the overall score of each document, I need to see the
sub-score produced by each clause. When all clauses match, this is
relatively easy to get back by ".explain(...)", which gives me something
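A small sketch of that .explain(...) route: ask the searcher to explain one hit and walk the per-clause details (only clauses that actually matched show up, which is exactly the limitation being discussed). Names like booleanQuery and topDocs are assumed from the surrounding code.
Explanation explanation = searcher.explain(booleanQuery, topDocs.scoreDocs[0].doc);
for (Explanation clause : explanation.getDetails()) {
  System.out.println(clause.getValue() + "  <-  " + clause.getDescription());
}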
That is very good performance.
But, If I take, on an average, 6 terms per user query, and looking at
shingles of size 2 I will have a boolean OR of 5 shingle phrase queries.
How better is this compared to a single sub phrase query which would
internally be just like another phrase query with
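For concreteness, the OR-of-shingle-phrases being compared above looks roughly like this in a current Lucene API (field name and words are placeholders); six terms produce five 2-word phrase clauses.
String[] words = {"new", "york", "existing", "homes", "3", "bed"};
BooleanQuery.Builder or = new BooleanQuery.Builder();
for (int i = 0; i + 1 < words.length; i++) {
  // one SHOULD clause per adjacent word pair, i.e. per 2-word shingle
  or.add(new PhraseQuery("body", words[i], words[i + 1]), BooleanClause.Occur.SHOULD);
}
Query subPhraseOr = or.build();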
user query as
is using some set of lucene queries and get most relevant results without
worrying much about the internals of scoring.
In my case, I know that each field will most likely match some sub phrase of
the user query and need to have a query or solr request handler which
handles this case
Hi Preetam,
On 07/14/2008 at 1:40 PM, Preetam Rao wrote:
> Is there a query in Lucene which matches sub phrases ?
>
[snip]
>
> I was redirected to ShingleFilter, which is a token filter
> that spits out n-grams. But it does not seem to be the best solution
> since one does not kn
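For reference, the ShingleFilter route mentioned above looks roughly like this against a current Lucene analysis API (the 2008-era API differed); tokenizer choice and shingle sizes are placeholders.
Analyzer shingleAnalyzer = new Analyzer() {
  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    TokenStream shingles = new ShingleFilter(source, 2, 3);  // emit 2- and 3-word shingles
    return new TokenStreamComponents(source, shingles);
  }
};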
Hi,
Is there a query in Lucene which matches sub phrases ?
For example, if the document text is "new york existing homes *3 bed 2
bath* homes 3 miles from city center 2 rooms" and the user enters
"Brooklyn homes with *3 bed* rooms and swimming pools", I would like to
recognize
Is it possible to boost subqueries with QueryParser?
For example:
((apple AND banana)^10 OR orange)
Thanks
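The classic QueryParser does accept a boost on a parenthesised group; a quick sketch with a recent Lucene constructor (field name and analyzer are placeholders):
QueryParser parser = new QueryParser("body", new StandardAnalyzer());
Query q = parser.parse("((apple AND banana)^10 OR orange)");
System.out.println(q);   // e.g. (+body:apple +body:banana)^10.0 body:orange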
Dear readers,
I had this error message from Luke when opening a recent index:
No sub-file with id ... found
After looking around a bit on the web I found the problem mentioned
several times, but no solution.
Putting the lucene jar that created the index on the classpath before
lukeall.jar fixed it.
Anton Potehin wrote:
Is it possible to make search among results of previous search?
After it I want to not make a new search, I want to make search among
found results...
Simple. Create a new BooleanQuery and put the original query into it,
along with the new query.
Daniel
--
Daniel
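In a current Lucene API the BooleanQuery route Daniel describes looks roughly like this (the thread itself predates BooleanQuery.Builder); with both clauses MUST, only documents that matched the original query can match the refined one.
BooleanQuery refined = new BooleanQuery.Builder()
    .add(originalQuery, BooleanClause.Occur.MUST)   // the query the user already ran
    .add(newQuery, BooleanClause.Occur.MUST)        // the refinement within those results
    .build();
TopDocs refinedHits = searcher.search(refined, 100);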
Anton Potehin wrote:
> After it I want to not make a new search,
> I want to make search among found results...
Perhaps something like this would work:
final BitSet results = toBitSet(hits);  // toBitSet: a helper that marks each hit's doc id in a BitSet
Hits filtered = searcher.search(newQuery, new Filter() {
  public BitSet bits(IndexReader reader) {
    return results;
  }
});
Is it possible to make search among results of previous search?
For example: I made a search:
Searcher searcher = ...
Query query = ...
Hits hits = searcher.search(query);
After it I want to not make a new search, I want to make search among the found
results
Subject: Re: No sub-file with id _18.f0 found
Is there a web page that lists all the files created in an index so I can
track down the problem I'm having?
I'm using the latest source via svn and have rebuilt using ant.
Every time I create an index, no matter how basic, I get the error.
kokid" <[EMAIL PROTECTED]>
To:
Sent: Monday, January 23, 2006 2:35 PM
Subject: No sub-file with id _18.f0 found
hi, when i try to view my index with luke i get the loading error: "No
sub-file with id _18.f0 found".
any ideas what could be causing this?
im using IndexWriter.se
Hi, when I try to view my index with Luke I get the loading error: "No sub-file
with id _18.f0 found".
Any ideas what could be causing this?
I'm using IndexWriter.setUseCompoundFile(true).
In the past it has worked fine without any problems; I'm on Win XP with Java 1.5.
Regards