Hi Chris,
> Because if you can adjust your parser syntax, this literally just
> becomes: ' field:"foo bar"~N ' ... where N is the positionIncrementGap
> on your analyzer ... OR ... ' field:"foo bar" ' ... if you call
> setPhraseSlop on your QueryParser.
Yes - correct. This would be equivale
(caveat: I don't ever really understand what Intervals at the lucene
feature set stage)
: Yup - similar to what Alan suggested. I'd have to rewrite the (general
: text-to-query) query parser to only use intervals though. Still
: thinking about possible approaches to this.
...
: > You co
Thanks Michael. The outcome of this discussion seems to be clear that
everyone is trying to reinvent the wheel somehow. ;) I think it really
should become part of core Lucene functionality. Seems like a corner
case people are not aware of until they hit it (and then it's not
clear what to do about
This might be a little outside the spirit of this discussion (in that
it's not really "off-the-shelf") -- but I implemented a
proof-of-concept for a different use case that I think could be
adapted here:
For a given doc, for each term in your multivalued field, you could
record a bitset representa
bq. Expanding a query over numerous fields grows combinatorically
in the number of fields (if I want my query to match when all terms
match in *some* field), doesn't it?
I don't think it does? It grows linearly with the number of fields? In
my experience the number of fields
searchable "by default
You're thinking of SurroundQuery parser for span queries I think...
https://lucene.apache.org/solr/guide/8_6/other-parsers.html#surround-query-parser
and the Advanced Query Parser will have a similar syntax
On Thu, Sep 10, 2020 at 4:40 PM Michael Sokolov wrote:
> A slightly different but related
A slightly different but related topic is how to manage lots of fields
I agree that sub-fields are a pain and that mashing everything
together in an all-field is a mess, but for best performance with a
large number of fields/sub-fields, it is the only workable option I
can see? Expanding a query o
> Ok so the more general question is whether we need an interval query parser
Oh, to this I'd say: yes, yes, yes.
I didn't have much prior experience writing frontend apps on top of
Solr/Lucene but once I did have
to go that route it quickly turns out that several things that are
readily availabl
Ok so the more general question is whether we need an interval query parser
On Thu, Sep 10, 2020 at 5:28 PM Dawid Weiss wrote:
> I am fine with the boundary token suggestion, actually. What I don't
> see at the moment is how I can marry it with an output of a general
> query parser (which retu
I am fine with the boundary token suggestion, actually. What I don't
see at the moment is how I can marry it with an output of a general
query parser (which returns any Query). I could give an attempt to
process the query node tree from standard query parser (which we're
using at the moment anyway)
Right, I misunderstood Alan's answer. The boundary option is not "impure"
in my opinion. It solves this issue nicely but maybe it needs something
more packaged to add the boundaries and build queries easily.
On Thu, Sep 10, 2020 at 4:16 PM Dawid Weiss wrote:
> Yup - similar to what Alan sugges
Yup - similar to what Alan suggested. I'd have to rewrite the (general
text-to-query) query parser to only use intervals though. Still
thinking about possible approaches to this.
D.
On Thu, Sep 10, 2020 at 3:58 PM jim ferenczi wrote:
>
> You could set a very high position increment gap for multi
You could set a very high position increment gap for multi-valued fields
(Analyzer#getPositionIncrementGap) and perform something
like Intervals.maxWidth(Intervals.unordered(...), pos_gap-1) ?
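
The arithmetic behind the gap idea can be sketched with a toy model in plain Java (no Lucene involved). It assumes that the first token of each later value lands at "last position + gap + 1" and that an interval's width is "end - start + 1"; both are assumptions about how Lucene counts, and the class and method names are invented for illustration. In Lucene itself, Intervals.maxWidth appears to take the width as its first argument, which is worth checking against the javadoc.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of token positions in a multi-valued field; hypothetical, not Lucene code. */
final class PositionGapSketch {

    /** Maps each token to its positions; each new value adds `gap` extra positions. */
    static Map<String, List<Integer>> positions(List<String> values, int gap) {
        Map<String, List<Integer>> index = new HashMap<>();
        int pos = -1;
        boolean first = true;
        for (String value : values) {
            if (!first) {
                pos += gap; // assumed gap semantics between values
            }
            first = false;
            // Values are assumed to be non-empty, whitespace-separated text.
            for (String token : value.trim().split("\\s+")) {
                pos++;
                index.computeIfAbsent(token, k -> new ArrayList<>()).add(pos);
            }
        }
        return index;
    }

    /** True if some occurrence of a and some occurrence of b fit in maxWidth positions. */
    static boolean withinWidth(Map<String, List<Integer>> index,
                               String a, String b, int maxWidth) {
        for (int pa : index.getOrDefault(a, List.of())) {
            for (int pb : index.getOrDefault(b, List.of())) {
                if (Math.abs(pa - pb) + 1 <= maxWidth) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int gap = 100;
        // foo and bar sit in different values, so the gap keeps them apart.
        System.out.println(withinWidth(positions(List.of("foo", "bar"), gap),
                "foo", "bar", gap - 1));
        // foo and bar share one value, so they fit inside the width limit.
        System.out.println(withinWidth(positions(List.of("foo bar", "baz"), gap),
                "foo", "bar", gap - 1));
    }
}
```

Under these assumptions the trick only holds while every single value is shorter than the gap, which is presumably why the gap has to be "very high".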
On Thu, Sep 10, 2020 at 12:32 PM Dawid Weiss wrote:
> Yeah... I was thinking about adding synthetic b
Yeah... I was thinking about adding synthetic boundaries but this
seems... impure. :) Another quick reflection is that I'd have to
somehow translate the original query (which can be arbitrarily
complex) into an interval query. Tough.
D.
On Thu, Sep 10, 2020 at 12:22 PM Alan Woodward wrote:
>
> I
I’ve solved this sort of thing in the past by indexing boundary tokens, and
wrapping the queries with the equivalent of Intervals.notContaining(query,
boundary-query); you could also put a very large position increment gap and use
a width filter, but that’s a bit more error prone if you could co
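
The boundary-token idea can be illustrated with a similar toy model in plain Java. This is only a sketch of the "not containing a boundary" rule, not Lucene's Intervals.notContaining: the marker string, class name, and method names are all hypothetical, and a real index would emit the boundary token from the analysis chain.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of boundary tokens between field values; hypothetical, not Lucene code. */
final class BoundaryTokenSketch {

    /** Hypothetical marker; assumed never to occur in real text. */
    static final String BOUNDARY = "#boundary#";

    /** Flattens all values into one token list with a boundary between values. */
    static List<String> tokenStream(List<String> values) {
        List<String> tokens = new ArrayList<>();
        for (int v = 0; v < values.size(); v++) {
            if (v > 0) {
                tokens.add(BOUNDARY);
            }
            // Values are assumed to be non-empty, whitespace-separated text.
            for (String token : values.get(v).trim().split("\\s+")) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** True if a and b co-occur in a span that contains no boundary token. */
    static boolean matchesInSingleValue(List<String> values, String a, String b) {
        List<String> tokens = tokenStream(values);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) {
                continue;
            }
            for (int j = 0; j < tokens.size(); j++) {
                if (!tokens.get(j).equals(b)) {
                    continue;
                }
                int lo = Math.min(i, j);
                int hi = Math.max(i, j);
                // Reject any span that crosses from one value into another.
                if (!tokens.subList(lo, hi + 1).contains(BOUNDARY)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesInSingleValue(List.of("foo", "bar"), "foo", "bar"));
        System.out.println(matchesInSingleValue(List.of("foo bar", "baz"), "foo", "bar"));
    }
}
```

Unlike the gap approach, this check does not depend on how long an individual value is, at the cost of one extra indexed token per value boundary.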
Hi Alan,
You're the expert here so I thought I'd ask before I jump in deep. Do
you think it's feasible to solve the following multivalued-field
problem:
doc: field=["foo", "bar"]
query: field:(foo AND bar)
I'd like the above to return zero hits (no single value contains both
foo and bar), but si