Starting with the presumption that Solr is a "search engine" for user queries, what exactly would a user query look like?

Are you really requiring your users to enter long, carefully constructed, full-length product titles?

What kind of application would force its users to do such a thing?

Put another way, if users have entered what they consider important terms in their query, why are you so ready to ignore many of those terms?

Or is this simply a case where some old software had a feature that, for reasons unknown, behaved this way, and you are trying to replicate it purely in the name of compatibility, without considering whether the feature actually makes sense in a modern software environment? (Or maybe your manager or marketing "invented" this feature and you're just trying to implement it as stated, without deciding whether it makes sense?) The point is that you are making us guess what the actual use case is, rather than simply telling us what it is!

Please clarify what your use case really is. If you would explain the use case (not some proposed solution), maybe we could offer suggestions for solutions.

Put another way, what exactly do you perceive to be wrong with normal, traditional, simple query matching that causes you to go to such great lengths to avoid using it?

IOW, why are you trying to re-invent and re-imagine a wheel that doesn't appear to need to be re-invented or re-imagined?

I'm sure you must have some reason for doing that, but why not disclose that reason so that we can utilize it in understanding what you are trying to do?

-- Jack Krupansky

-----Original Message----- From: Mark
Sent: Friday, August 09, 2013 11:29 AM
To: solr-user@lucene.apache.org
Subject: Re: Percolate feature?

*All* of the terms in the field must be matched by the query....not vice-versa.

Exactly. This is why I was trying to explain it as a reverse search.

I just realized I described it as a *large* list of known keywords when really it's small; no more than 1000. Forgetting about performance, how hard do you think this would be to implement? How should I even start?

Thanks for the input

On Aug 9, 2013, at 6:56 AM, Yonik Seeley <yo...@lucidworks.com> wrote:

*All* of the terms in the field must be matched by the query....not vice-versa.
And no, we don't have a query for that out of the box.  To implement,
it seems like it would require the total number of terms indexed for a
field (for each document).
I guess you could also index start and end tokens and then use query
expansion to all possible combinations... messy though.

-Yonik
http://lucidworks.com
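
The term-count idea above can be sketched outside Solr in plain Python. This is only an illustration under stated assumptions, not a Solr feature: the `KeywordIndex` class, its method names, and the lowercase whitespace tokenizer are all hypothetical. Each keyword is stored with its number of distinct terms, and a keyword matches only when the title hits every one of them.

```python
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real analyzer.
    return text.lower().split()


class KeywordIndex:
    """Hypothetical in-memory index; not part of Solr or Lucene."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of keywords containing it
        self.term_counts = {}             # keyword id -> number of distinct terms
        self.keywords = {}                # keyword id -> original keyword text

    def add(self, keyword_id, keyword):
        terms = set(tokenize(keyword))
        self.keywords[keyword_id] = keyword
        self.term_counts[keyword_id] = len(terms)
        for term in terms:
            self.postings[term].add(keyword_id)

    def match(self, title):
        # Count how many distinct terms of each keyword appear in the title.
        hits = defaultdict(int)
        for term in set(tokenize(title)):
            for keyword_id in self.postings.get(term, ()):
                hits[keyword_id] += 1
        # A keyword matches only if all of its terms were hit.
        return sorted(
            self.keywords[k] for k, n in hits.items() if n == self.term_counts[k]
        )


index = KeywordIndex()
index.add(1, "Sony")
index.add(2, "Samsung Galaxy")
print(index.match("Samsung Galaxy S4"))
print(index.match("Samsung 52inch LC"))
```

Repeated terms inside a keyword are collapsed to one here; whether a duplicate should require a duplicate in the title is a separate decision the sketch does not address.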

On Fri, Aug 9, 2013 at 8:19 AM, Erick Erickson <erickerick...@gmail.com> wrote:
This _looks_ like simple phrase matching (no slop) and highlighting...

But whenever I think the answer is really simple, it usually means
that I'm missing something....

Best
Erick


On Thu, Aug 8, 2013 at 11:18 PM, Mark <static.void....@gmail.com> wrote:

Ok forget the mention of percolate.

We have a large list of known keywords we would like to match against.

Product keyword:  "Sony"
Product keyword:  "Samsung Galaxy"

We would like to be able to detect, given a product title, whether or not it matches any known keywords. For a keyword to be matched, all of its terms
must be present in the given product title.

Product Title: "Sony Experia"
Matches and returns a highlight: "<em>Sony</em> Experia"

Product Title: "Samsung 52inch LC"
Does not match

Product Title: "Samsung Galaxy S4"
Matches and returns a highlight: "<em>Samsung Galaxy</em>"

Product Title: "Galaxy Samsung S4"
Matches and returns a highlight: "<em>Galaxy Samsung</em>"

What would be the best way to approach this?
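
The matching rule described above can be written down as a minimal sketch in plain Python, independent of Solr. The `highlight` function and its whitespace tokenization are assumptions made for illustration. Unlike the examples above, it returns the whole title with the matched terms wrapped in `<em>`, merging adjacent matched terms into a single span.

```python
def highlight(title, keywords):
    """Return the title with matched keyword terms wrapped in <em>, or None.

    Hypothetical helper: a keyword matches when all of its terms occur in
    the title, in any order. Tokenization is lowercase whitespace splitting.
    """
    tokens = title.split()
    title_terms = {t.lower() for t in tokens}

    # Collect the terms of every keyword that is fully contained in the title.
    matched = set()
    for keyword in keywords:
        terms = {t.lower() for t in keyword.split()}
        if terms <= title_terms:
            matched |= terms
    if not matched:
        return None

    # Rebuild the title, merging runs of adjacent matched tokens into one span.
    out, run = [], []
    for token in tokens:
        if token.lower() in matched:
            run.append(token)
        else:
            if run:
                out.append("<em>" + " ".join(run) + "</em>")
                run = []
            out.append(token)
    if run:
        out.append("<em>" + " ".join(run) + "</em>")
    return " ".join(out)


known = ["Sony", "Samsung Galaxy"]
for title in ["Sony Experia", "Samsung 52inch LC", "Galaxy Samsung S4"]:
    print(title, "->", highlight(title, known))
```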




On Aug 5, 2013, at 7:02 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:


: Subject: Percolate feature?

can you give a more concrete, realistic example of what you are trying to do? your synthetic hypothetical example is kind of hard to make sense of.

your Subject line and comment that the "percolate" feature of elastic
search sounds like what you want seem to have led some people down a
path of assuming you want to run these types of queries as documents are
indexed -- but that isn't at all clear to me from the way you worded your
question, other than that.

it's also not clear what aspect of the "results" you really care about --
are you only looking for the *number* of documents that "match" according
to your concept of matching, or are you looking for a list of matches?
what if multiple documents have all of their terms in the query string --
how should they score relative to each other? what if a document contains
the same term multiple times, do you expect it to be a match of a query only
if that term appears in the query multiple times as well?  do you care
about the ordering of the terms in the query? the ordering of the terms
in the document?

Ideally: describe for us what you want to do, w/o assuming
solr/elasticsearch/anything specific about the implementation -- just
describe your actual use case for us, with several real document/query
examples.



https://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about
"Y" without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341






-Hoss

