Re: Percolate feature?

Jack Krupansky Fri, 09 Aug 2013 08:51:04 -0700

Starting with the presumption that Solr is a "search engine" for user queries, what exactly would a user query look like?

Are you really requiring your users to enter long, carefully constructed, full-length product titles?


What kind of application would force its users to do such a thing?

Put another way, if the user has entered what they consider important terms in their query, why are you so ready to ignore a lot of those terms?

Or is this simply a case where some old software had a feature that, for reasons unknown, behaved this way, and you are merely trying to replicate that feature in the name of compatibility without thinking about whether the feature actually makes sense in a modern software environment? (Or maybe your manager or marketing "invented" this feature and you're just trying to implement it as stated without trying to decide whether it makes sense?) The point is that you are making us try to guess what the actual use case is, rather than simply telling us what it is!

Please clarify what your use case really is. If you would explain the use case (not some proposed solution), maybe we could offer suggestions for solutions.

Put another way, what exactly do you perceive to be wrong with normal, traditional, simple query matching that causes you to go to such great lengths to avoid using it?

IOW, why are you trying to re-invent and re-imagine a wheel that doesn't appear to need to be re-invented or re-imagined?

I'm sure you must have some reason for doing that, but why not disclose that reason so that we can use it in understanding what you are trying to do?


-- Jack Krupansky

-----Original Message-----
From: Mark

Sent: Friday, August 09, 2013 11:29 AM
To: solr-user@lucene.apache.org
Subject: Re: Percolate feature?

*All* of the terms in the field must be matched by the query... not vice-versa.


Exactly. This is why I was trying to explain it as a reverse search.

I just realized I described it as a *large* list of known keywords when really it's small; no more than 1000. Forgetting about performance, how hard do you think this would be to implement? How should I even start?


Thanks for the input

On Aug 9, 2013, at 6:56 AM, Yonik Seeley <yo...@lucidworks.com> wrote:

*All* of the terms in the field must be matched by the query... not vice-versa.

And no, we don't have a query for that out of the box.  To implement,
it seems like it would require the total number of terms indexed for a
field (for each document).
I guess you could also index start and end tokens and then use query
expansion to all possible combinations... messy though.
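
A minimal sketch of the term-count idea in plain Python, not Solr code. It assumes whitespace tokenization and case-insensitive matching, and the names build_index and fully_covered are hypothetical. Each known keyword is treated as a small document; a keyword counts as a match only when the number of its distinct terms found in the query equals the total number of distinct terms indexed for it.

```python
from collections import defaultdict


def build_index(keywords):
    """Build term -> keyword-id postings plus a distinct-term count per keyword."""
    postings = defaultdict(set)
    term_counts = {}
    for doc_id, keyword in enumerate(keywords):
        terms = set(keyword.lower().split())  # assumption: whitespace tokens
        term_counts[doc_id] = len(terms)
        for term in terms:
            postings[term].add(doc_id)
    return postings, term_counts


def fully_covered(title, keywords, postings, term_counts):
    """Return the keywords whose every indexed term appears in the title."""
    hits = defaultdict(int)
    for term in set(title.lower().split()):
        for doc_id in postings.get(term, ()):
            hits[doc_id] += 1
    # A keyword matches only if all of its terms were hit by the title.
    return [keywords[d] for d in sorted(hits) if hits[d] == term_counts[d]]


keywords = ["Sony", "Samsung Galaxy"]
postings, term_counts = build_index(keywords)
print(fully_covered("Samsung Galaxy S4", keywords, postings, term_counts))
print(fully_covered("Samsung 52inch LC", keywords, postings, term_counts))
```

The stored per-keyword count is what makes the "all of the field's terms" check possible; an ordinary OR query over the title terms would also return keywords that are only partially covered.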

-Yonik
http://lucidworks.com

On Fri, Aug 9, 2013 at 8:19 AM, Erick Erickson <erickerick...@gmail.com> wrote:

This _looks_ like simple phrase matching (no slop) and highlighting...

But whenever I think the answer is really simple, it usually means
that I'm missing something....

Best
Erick


On Thu, Aug 8, 2013 at 11:18 PM, Mark <static.void....@gmail.com> wrote:

Ok forget the mention of percolate.

We have a large list of known keywords we would like to match against.

Product keyword: "Sony"
Product keyword: "Samsung Galaxy"
We would like to be able to detect, given a product title, whether or not it
matches any known keywords. For a keyword to be matched, all of its terms
must be present in the product title given.

Product Title: "Sony Experia"
Matches and returns a highlight: "Sony Experia"

Product Title: "Samsung 52inch LC"
Does not match

Product Title: "Samsung Galaxy S4"
Matches and returns a highlight: "Samsung Galaxy"

Product Title: "Galaxy Samsung S4"
Matches and returns a highlight: "Galaxy Samsung"

What would be the best way to approach this?
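
The matching rule described above can be sketched outside of Solr as a simple subset test. This is only an illustration under stated assumptions: tokens are split on whitespace, comparison is case-insensitive, and match_keywords is a hypothetical helper name. The highlight here is built from just the matched terms, in the order they appear in the title, which is one possible reading of the examples.

```python
def match_keywords(title, keywords):
    """Return (keyword, highlight) pairs for keywords fully contained in the title."""
    title_tokens = title.split()  # assumption: whitespace tokenization
    title_terms = {t.lower() for t in title_tokens}
    matches = []
    for keyword in keywords:
        keyword_terms = {t.lower() for t in keyword.split()}
        # Every keyword term must be present in the title; order is ignored.
        if keyword_terms <= title_terms:
            highlight = " ".join(
                t for t in title_tokens if t.lower() in keyword_terms
            )
            matches.append((keyword, highlight))
    return matches


known = ["Sony", "Samsung Galaxy"]
for title in ["Sony Experia", "Samsung 52inch LC",
              "Samsung Galaxy S4", "Galaxy Samsung S4"]:
    print(title, "->", match_keywords(title, known))
```

With a keyword list of this size, a linear scan per title may well be fast enough; whether it is in practice depends on title volume.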




On Aug 5, 2013, at 7:02 PM, Chris Hostetter <hossman_luc...@fucit.org>
wrote:
: Subject: Percolate feature?
can you give a more concrete, realistic example of what you are trying to
do? your synthetic hypothetical example is kind of hard to make sense of.
your Subject line and comment that the "percolate" feature of elastic
search sounds like what you want seems to have led some people down a
path of assuming you want to run these types of queries as documents are
indexed -- but that isn't at all clear to me from the way you worded your
question other than that.
it's also not clear what aspect of the "results" you really care about --
are you only looking for the *number* of documents that "match" according
to your concept of matching, or are you looking for a list of matches?
what if multiple documents have all of their terms in the query string --
how should they score relative to each other? what if a document contains
the same term multiple times, do you expect it to be a match of a query
only if that term appears in the query multiple times as well? do you care
about the ordering of the terms in the query? the ordering of the terms in
the document?

Ideally: describe for us what you want to do, w/o assuming
solr/elasticsearch/anything specific about the implementation -- just
describe your actual use case for us, with several real document/query
examples.



https://people.apache.org/~hossman/#xyproblem
XY Problem
Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue. Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341






-Hoss
