On Fri, 2012-09-07 at 06:55 +0200, Erick Erickson wrote:
> I may prefer the first, and you may prefer the second. Neither is
> necessarily more "correct" IMO, it depends on the problem
> space. Choosing either one will be unpopular with anyone
> who likes the other....

Sorry, I did not make myself clear: if we decide that there are only a
few "obvious" (that's a loaded word; maybe "common"?) solutions, my idea
was to implement them all, especially if they can be reduced to the same
underlying algorithm with a few tweaks for each case.
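
As a rough sketch of the "same underlying algorithm" idea (not Solr or
Lucene code): each policy only decides which tokens of a document become
sort keys, and the sorting itself is shared. The three policies and all
names below are hypothetical, chosen for illustration; the thread does not
fix which policies count as "common". Documents without tokens are simply
dropped here, which is itself an assumption.

```python
from typing import Callable, Dict, List

# A selector maps a document's tokens to the sort keys it contributes.
Selector = Callable[[List[str]], List[str]]


def first_token(tokens: List[str]) -> List[str]:
    """Sort on the first token as it appears in the field."""
    return tokens[:1]


def smallest_token(tokens: List[str]) -> List[str]:
    """Sort on the lexicographically smallest token."""
    return [min(tokens)] if tokens else []


def every_token(tokens: List[str]) -> List[str]:
    """One entry per token, so a document can appear several times."""
    return list(tokens)


def sort_docs(docs: Dict[str, List[str]], select: Selector) -> List[str]:
    """Shared algorithm: build (key, doc_id) pairs, sort them, keep the ids."""
    pairs = [(key, doc_id)
             for doc_id, tokens in docs.items()
             for key in select(tokens)]
    pairs.sort()
    return [doc_id for _key, doc_id in pairs]


docs = {"a": ["pear", "apple"], "b": ["banana"], "c": ["zebra", "cherry"]}
print(sort_docs(docs, first_token))     # ['b', 'a', 'c']
print(sort_docs(docs, smallest_token))  # ['a', 'b', 'c']
print(sort_docs(docs, every_token))     # ['a', 'b', 'c', 'a', 'c']
```

The last call shows the duplicate-entry case: "a" and "c" each appear
twice, which is where the numFound, faceting and grouping questions come
from.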

> And I suspect that 99 times out of 100, someone wanting to sort on
> fields with multiple tokens hasn't thought the problem through
> carefully.

That might very well be the case. I must admit that I have mostly seen
the issue as "User asks for X, how do we implement X?", instead of "User
asks for X, would user be better off with Y?".

> And duplicate entries in the result set gets ugly. Say a user sorts
> on a field containing 10,000 tokens. Now one doc is repeated
> 10,000 times in the result set. How many docs are set for
> numFound? Faceting? Grouping?

I don't see the difference between 2 and 10,000 tokens for this, but I
concede that there is no clear answer and that choosing by setup would
require the user to have a fairly deep understanding.

I accept that there is no clear need for the functionality at this point
in time and defer hacking on it.

Thank you for your input,
Toke Eskildsen
