[ 
https://issues.apache.org/jira/browse/SOLR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101152#comment-13101152
 ] 

Jan Høydahl commented on SOLR-2366:
-----------------------------------

Hoss: Good comments, which need to be decided upon, including corner cases.

1)
bq.I would suggest define any case where the spec contains absolute value N 
after (effective) value M where N < M as an error and fail fast.
Agree
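
A minimal sketch of that fail-fast check, in Python for brevity rather than Solr's Java. The name check_monotonic and the token rules (a plain number is absolute, +N is relative to the previous effective value) are assumptions for illustration, not something taken from the patch:

```python
def check_monotonic(spec: str) -> list:
    """Return the effective boundary values, or raise ValueError on a decrease.

    Hypothetical helper: a token is either an absolute number ("50") or an
    increment relative to the previous effective value ("+50").
    """
    values = []
    for token in spec.split(","):
        token = token.strip()
        if token.startswith("+"):
            if not values:
                raise ValueError("increment with no preceding value: " + token)
            value = values[-1] + float(token[1:])
        else:
            value = float(token)
            # Absolute value N after effective value M with N < M: fail fast.
            if values and value < values[-1]:
                raise ValueError(
                    f"absolute value {token} follows larger effective value {values[-1]:g}"
                )
        values.append(value)
    return values
```

With these rules "0,+50,40" is rejected, because the effective value before 40 is already 50.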

bq.Still not sure what (if anything) should be done about overlapping ranges 
that appear out of order (ie: 0,100,50..90,150 ... is that "0-100,50-90,90-150" 
?)
If all gaps are specified as explicit ranges there is no ambiguity, so we could 
require all gaps to be explicit ranges if one wants to use that form?

2) 

bq.The first three examples suggest that * will be treated as "-Infinity" and 
"+Infinity" based on position (ie: the first and last ranges will be unbounded 
on one end) but in the last example the wording "...100-200, 200-300 repeating 
until max" seems inconsistent with that.
Agree. The 0,10,50,+50,+100,* example would create an infinite number of 
ranges, which is less than desirable. But 0,10,50,+50,+100,500 would give 
repeating 100-gaps until the upper bound 500, while 0,10,50,+50,+100,500,* 
would in addition give a last range 500-*. That was the intended syntax.
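
To make the intended reading concrete, here is a rough Python sketch. The function expand_spec is hypothetical. It assumes that an increment followed by another increment is applied once, and that an increment followed by an absolute value repeats until that value. It also assumes, as one possible choice, that an increment with no upper bound after it is an error:

```python
from typing import List, Optional, Tuple

Range = Tuple[float, Optional[float]]  # an upper bound of None stands for "*"


def expand_spec(spec: str) -> List[Range]:
    """Expand a spec such as "0,10,50,+50,+100,500,*" into (lower, upper) pairs."""
    tokens = [t.strip() for t in spec.split(",")]
    ranges: List[Range] = []
    current: Optional[float] = None  # effective value reached so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "*":
            # Only a trailing "*" after a concrete value is handled in this sketch.
            if current is None or i != len(tokens) - 1:
                raise ValueError("'*' is only supported as the last item")
            ranges.append((current, None))
        elif tok.startswith("+"):
            if current is None:
                raise ValueError("increment with no preceding value")
            step = float(tok[1:])
            if step <= 0:
                raise ValueError("increment must be positive")
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is None or nxt == "*":
                # Repeating without an upper bound would never terminate.
                raise ValueError("increment must be followed by a bound")
            if nxt.startswith("+"):
                # Followed by another increment: apply this one once.
                ranges.append((current, current + step))
                current += step
            else:
                # Followed by an absolute value: repeat until that bound.
                bound = float(nxt)
                if bound < current:
                    raise ValueError("bound is below the current value")
                while current < bound:
                    upper = min(current + step, bound)
                    ranges.append((current, upper))
                    current = upper
                i += 1  # the bound token has been consumed
        else:
            value = float(tok)
            if current is not None:
                if value < current:
                    raise ValueError("absolute value goes backwards")
                if value > current:
                    ranges.append((current, value))
            current = value
        i += 1
    return ranges
```

Under these assumptions "0,10,50,+50,+100,500" expands to 0-10, 10-50, 50-100 and then 100-wide ranges up to 500, and appending ",*" adds one final open-ended range starting at 500.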

bq.If we want to support the idea of "repeat the last increment continuously" 
that should be with it's own "repeat" syntax such as the "..." (three dots) i 
suggested in comment "17/Feb/11 23:50" above. I would argue that this should 
only be legal after an increment and before a concrete value (ie: 
0,+10,...,100). Requiring it to follow an increment seems like a given 
(otherwise what exactly are you repeating?) requiring that it be followed by an 
absolute value is based on my concern that if it's the last item in the spec 
(or the last item before *) it results in an infinite number of ranges.

Agree. Alternatively, if Solr could compute myField.max(), the useful value of 
"*" could be computed a bit smarter, but that would probably be hard to scale 
in a multi-shard setting.

bq.That seems like it isn't specific enough about what is/isn't going to be 
allowed – particularly since all of the facet.range params can be specified on 
a per field basis.

Didn't really think much about the global params. Silently ignoring the global 
gap, start, end and other params would be one way to go, but then there is no 
explicit error feedback in case of a misunderstanding; the user will just see 
that he does not get back what he expected, and start reading the 
documentation :)

I have no good answer to this, other than inventing some syntax. The default 
could be that facet.range.spec respects the global values for start and end, 
while also allowing start and end to be overridden explicitly as part of the 
spec with a special syntax.
The following params would result in ranges 0-1, 1-2, 2-3, 3-5, 5-10:
{noformat}
facet.range.start=0
facet.range.end=10
facet.range.gap=2
f.bedrooms.facet.range.spec=1,2,3,5
{noformat}

But these params would result in the same ranges, because the spec itself 
gives start and end with a special syntax, N.. for start and ..M for end, 
which overrides the global values:
{noformat}
facet.range.start=100
facet.range.end=200
facet.range.gap=10
f.bedrooms.facet.range.spec=0..,1,2,3,5,..10
{noformat}

This would be equivalent to adding the two params 
f.bedrooms.facet.range.start=0&f.bedrooms.facet.range.end=10, which could then 
still be allowed as an alternative. If the first value of the spec is not an 
N.., we'll require a facet.range.start. If the last value of the spec is not a 
..M, we'll require a facet.range.end.
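
A small Python sketch of how the bounds could be resolved under this proposal. The function resolve_ranges and its arguments are made up for illustration. It only handles absolute values plus the proposed N.. and ..M markers, and it ignores facet.range.gap, as both examples above do:

```python
def resolve_ranges(spec, global_start=None, global_end=None):
    """Turn a spec plus optional global start/end into (lower, upper) pairs."""
    tokens = [t.strip() for t in spec.split(",")]

    # A leading "N.." overrides the global start.
    if tokens and tokens[0].endswith(".."):
        start = float(tokens[0][:-2])
        tokens = tokens[1:]
    elif global_start is not None:
        start = global_start
    else:
        raise ValueError("no N.. in spec and no facet.range.start given")

    # A trailing "..M" overrides the global end.
    if tokens and tokens[-1].startswith(".."):
        end = float(tokens[-1][2:])
        tokens = tokens[:-1]
    elif global_end is not None:
        end = global_end
    else:
        raise ValueError("no ..M in spec and no facet.range.end given")

    bounds = [start] + [float(t) for t in tokens] + [end]
    # Fail fast if any boundary goes backwards.
    for lower, upper in zip(bounds, bounds[1:]):
        if upper < lower:
            raise ValueError(f"boundary {upper:g} follows larger value {lower:g}")
    return list(zip(bounds, bounds[1:]))
```

Calling resolve_ranges("1,2,3,5", 0, 10) and resolve_ranges("0..,1,2,3,5,..10", 100, 200) gives the same five ranges, matching the two param sets above.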

Also, specifying both a global facet.range.gap and a global facet.range.spec 
must not be allowed.

Would this be a good "compromise"? :-) My primary reason for suggesting this is 
to give users a terse, intuitive syntax for ranges.

4)
bq.Should all ranges produced by facet.range.spec be considered "gap" ranges? 
even the ones with no lower/upper bound?
Good question. I think the meaning of facet.range.include=upper/lower is 
clear. Outer/edge would need some more work/definition.

> Facet Range Gaps
> ----------------
>
>                 Key: SOLR-2366
>                 URL: https://issues.apache.org/jira/browse/SOLR-2366
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Priority: Minor
>             Fix For: 3.4, 4.0
>
>         Attachments: SOLR-2366.patch, SOLR-2366.patch
>
>
> There really is no reason why the range gap for date and numeric faceting 
> needs to be evenly spaced.  For instance, if and when SOLR-1581 is completed 
> and one were doing spatial distance calculations, one could facet by function 
> into 3 different sized buckets: walking distance (0-5KM), driving distance 
> (5KM-150KM) and everything else (150KM+), for instance.  We should be able to 
> quantize the results into arbitrarily sized buckets.  I'd propose the syntax 
> to be a comma separated list of sizes for each bucket.  If only one value is 
> specified, then it behaves as it currently does.  Otherwise, it creates the 
> different size buckets.  If the number of buckets doesn't evenly divide up 
> the space, then the size of the last bucket specified is used to fill out the 
> remaining space (not sure on this)
> For instance,
> facet.range.start=0
> facet.range.end=400
> facet.range.gap=5,25,50,100
> would yield buckets of:
> 0-5,5-30,30-80,80-180,180-280,280-380,380-400
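
The bucket arithmetic in the quoted description can be sketched in Python as follows. The function buckets_from_gaps is a hypothetical name. It follows the description's tentative rule of reusing the last gap to fill the remaining space, and it clips the final bucket at facet.range.end:

```python
def buckets_from_gaps(start, end, gaps):
    """Build (lower, upper) buckets from a list of gap sizes."""
    if not gaps or any(g <= 0 for g in gaps):
        raise ValueError("gaps must be a non-empty list of positive sizes")
    buckets = []
    lower = start
    i = 0
    while lower < end:
        # Once the list is exhausted, keep reusing the last gap.
        gap = gaps[min(i, len(gaps) - 1)]
        upper = min(lower + gap, end)  # clip the final bucket at end
        buckets.append((lower, upper))
        lower = upper
        i += 1
    return buckets
```

For start=0, end=400 and gaps 5,25,50,100 this reproduces the buckets listed in the description, and a single gap gives the current evenly spaced behaviour.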

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



