[ 
https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512372
 ] 

Pieter Berkel commented on SOLR-258:
------------------------------------

I've just tried this patch and the results are impressive!

I agree with Ryan regarding the naming of 'pre', 'post' and 'inner': using 
simple, concrete words will make it easier for developers to understand the 
basic concepts.  At first I was a little confused about how the 'gap' 
parameter was used; perhaps a name like 'interval' would be more indicative 
of its purpose?

While on the topic of gaps / intervals, I can imagine a case where one might 
want facet counts over non-linear intervals, for instance obtaining results 
from: "Last 7 days", "Last 30 days", "Last 90 days", "Last 6 months".  
Obviously you can achieve this by setting facet.date.gap=+1DAY and then 
post-processing the results, but a much more elegant solution would be to 
allow "facet.date.gap" (or another suitably named param) to accept a 
comma-delimited set of explicit partition dates:

facet.date.start=NOW-6MONTHS/DAY
facet.date.end=NOW/DAY
facet.date.gap=NOW-90DAYS/DAY,NOW-30DAYS/DAY,NOW-7DAYS/DAY

It would then be trivial to calculate facet counts for the ranges specified 
above.
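
To illustrate what I mean by explicit partition dates, here is a rough Python 
sketch (not code from the patch).  It assumes the boundary dates have already 
been resolved to concrete datetimes, treats every range as lower-inclusive and 
upper-exclusive, and approximates 6 months as 180 days.  The names 
count_by_boundaries and cumulative_from_end are hypothetical, made up for this 
example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def count_by_boundaries(doc_dates, start, end, boundaries):
    """Count dates falling in each [lower, upper) bucket between start and end."""
    # Edges are start, the explicit partition dates inside the window, then end.
    edges = [start] + sorted(b for b in boundaries if start < b < end) + [end]
    counts = [0] * (len(edges) - 1)
    for d in doc_dates:
        if start <= d < end:
            # bisect_right gives the index of the first edge after d,
            # so the bucket holding d is one position to the left.
            counts[bisect_right(edges, d) - 1] += 1
    return edges, counts


def cumulative_from_end(counts):
    """Turn per-bucket counts into 'last N' totals, widest window first."""
    totals, running = [], 0
    for c in reversed(counts):
        running += c
        totals.append(running)
    return list(reversed(totals))


now = datetime(2007, 7, 13)  # stand-in for NOW/DAY
docs = [now - timedelta(days=n) for n in (100, 40, 10, 3, 1)]
edges, counts = count_by_boundaries(
    docs,
    start=now - timedelta(days=180),  # assumption: 6 months ~ 180 days
    end=now,
    boundaries=[now - timedelta(days=n) for n in (90, 30, 7)],
)
print(counts)                       # [1, 1, 1, 2]
print(cumulative_from_end(counts))  # [5, 4, 3, 2]
```

The cumulative totals correspond to "Last 6 months", "Last 90 days", "Last 30 
days" and "Last 7 days" respectively.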

It would be useful to make the 'start' and 'end' parameters optional.  If not 
specified, 'start' should default to the earliest stored date value and 'end' 
should default to the latest stored date value (assuming that's possible).  
The request should probably return a 400 if 'gap' is not set.
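
The defaulting rules I have in mind could look roughly like the Python sketch 
below.  This is only an illustration under several assumptions: the parameter 
values are already parsed, the stored date values are available as a plain 
list, and BadRequest is a hypothetical stand-in for however a 400 would really 
be reported.

```python
from datetime import datetime


class BadRequest(Exception):
    """Hypothetical stand-in for an HTTP 400 response."""
    status = 400


def resolve_date_facet_params(params, stored_dates):
    """Apply the proposed defaults; 'gap' is the only required parameter."""
    gap = params.get("facet.date.gap")
    if gap is None:
        raise BadRequest("facet.date.gap must be set")

    start = params.get("facet.date.start")
    if start is None:
        # Default to the earliest stored date (None if nothing is stored).
        start = min(stored_dates, default=None)

    end = params.get("facet.date.end")
    if end is None:
        # Default to the latest stored date (None if nothing is stored).
        end = max(stored_dates, default=None)

    return start, end, gap
```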

My personal opinion is that 'end' should be a hard limit: the last gap should 
never go past 'end'.  Given that the facet label is always generated from the 
lower value in the range, I don't think truncating the last 'gap' will cause 
problems.  However, it may be helpful to return the actual date value for 
"end" if it was specified as an offset of NOW.
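
As a rough Python sketch of the hard-limit behaviour (not how the patch does 
it): the upper bound of each range is clamped to 'end', so only the last range 
can be shorter than the gap.  A fixed timedelta stands in for Solr's date-math 
gaps here, and gap_ranges is a hypothetical name.

```python
from datetime import datetime, timedelta


def gap_ranges(start, end, gap):
    """Split [start, end) into consecutive ranges, truncating the last at end."""
    if gap <= timedelta(0):
        raise ValueError("gap must be positive")
    ranges = []
    lower = start
    while lower < end:
        # Clamp the upper bound so no range ever extends past 'end'.
        upper = min(lower + gap, end)
        ranges.append((lower, upper))
        lower = upper
    return ranges


ranges = gap_ranges(datetime(2007, 7, 1), datetime(2007, 7, 10), timedelta(days=4))
for lower, upper in ranges:
    print(lower.date(), "to", upper.date())
```

Each range is still labelled by its lower value, and the upper bound of the 
last range is the resolved 'end' that could be reported back to the client.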

A possible problem arises when both start and end dates are specified as 
offsets of NOW: the value of NOW may not be the same for both.  In one 
of my tests, I set:

facet.date.start=NOW-12MONTHS
facet.date.end=NOW
facet.date.gap=+1MONTH

With some extra debugging output I can see that mostly the value of NOW is the 
same:

<str name="start">2006-07-13T06:06:07.397</str>
<str name="end">2007-07-13T06:06:07.397</str>

However occasionally there is a difference:

<str name="start">2006-07-13T05:48:23.014</str>
<str name="end">2007-07-13T05:48:23.015</str>

This difference alters the number of gaps calculated (one extra gap when the 
NOW values differ between start and end).  I'm not sure how this could be 
fixed, but as you mentioned above, it will probably involve changing 
"ft.toExternal(ft.toInternal(...))".
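
To show why one millisecond matters, here is a small Python sketch that 
reproduces the gap count for the values above.  It is an illustration only, 
not the patch's logic: add_months and count_gaps are hypothetical helpers, and 
the loop assumes gaps are generated until the lower bound reaches 'end'.  One 
possible direction (an assumption on my part) would be to capture NOW once per 
request and resolve both 'start' and 'end' against that single value.

```python
import calendar
from datetime import datetime, timedelta


def add_months(dt, months):
    """Shift dt by a number of calendar months, clamping the day if needed."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)


def count_gaps(start, end, gap_months=1):
    """Count how many gaps are needed to cover start..end."""
    n = 0
    lower = start
    while lower < end:
        lower = add_months(lower, gap_months)
        n += 1
    return n


# NOW captured once and used for both start and end.
now = datetime(2007, 7, 13, 5, 48, 23, 14000)
start = add_months(now, -12)
print(count_gaps(start, now))                              # 12

# NOW evaluated a millisecond later when resolving end.
print(count_gaps(start, now + timedelta(milliseconds=1)))  # 13
```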

Thanks again for creating this useful addition, I'll try to test it a bit more 
and see if I can find anything else.


> Date based Facets
> -----------------
>
>                 Key: SOLR-258
>                 URL: https://issues.apache.org/jira/browse/SOLR-258
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: date_facets.patch, date_facets.patch, date_facets.patch, 
> date_facets.patch, date_facets.patch
>
>
> 1) Allow clients to express concepts like...
>     * "give me facet counts per day for every day this month."
>     * "give me facet counts per hour for every hour of today."
>     * "give me facet counts per hour for every hour of a specific day."
>     * "give me facet counts per hour for every hour of a specific day and 
>        give me facet counts for the number of matches before that day, or 
>        after that day." 
> 2) Return all data in a way that makes it easy to use to build filter queries 
> on those date ranges.
