[ 
https://issues.apache.org/jira/browse/LUCENE-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012870#comment-14012870
 ] 

David Smiley commented on LUCENE-5648:
--------------------------------------

Some possible features to add, maybe before committing or maybe later:
* Make the TimeZone and/or Locale configurable.  It's fixed to UTC/ROOT right 
now.
* Cap the precision at configuration/setup time to a specified level. New data 
would get truncated.  It's not needed if you never provide data beyond the 
desired precision, but it's likely easier to do the truncation here, and some 
RPT algorithms can work better if they know where the bottom level is (e.g. 
prefixGridScanLevel).
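
To illustrate the truncation idea in the second bullet, here is a minimal 
sketch using only java.util.Calendar. It is not code from the patch: the class 
name PrecisionCap, the method truncate, and the FIELDS ordering are all 
hypothetical, and it assumes the input Calendar is already in UTC.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the LUCENE-5648 patch.
final class PrecisionCap {
    // Calendar fields ordered from coarsest to finest (an assumed level list).
    static final int[] FIELDS = {
        Calendar.YEAR, Calendar.MONTH, Calendar.DAY_OF_MONTH,
        Calendar.HOUR_OF_DAY, Calendar.MINUTE, Calendar.SECOND,
        Calendar.MILLISECOND
    };

    // Returns a new UTC calendar holding only the fields of 'in' down to and
    // including 'maxField'; finer fields are left at their cleared defaults.
    // Assumes 'in' is already a UTC calendar.
    static Calendar truncate(Calendar in, int maxField) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ROOT);
        cal.clear();
        for (int field : FIELDS) {
            cal.set(field, in.get(field));
            if (field == maxField) {
                break;
            }
        }
        return cal;
    }
}
```

With a cap of Calendar.HOUR_OF_DAY, a value of 2014-05-29 13:45:30 would come 
back as 2014-05-29 13:00:00.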

Hmmm; I'm not even checking/asserting that the TimeZone is what I expect it to 
be when a Calendar is passed in; I should do that and either fail an assertion 
or use the ZONE_OFFSET value to correct it to UTC.
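
The two options mentioned above could look roughly like the following sketch. 
The class and method names (UtcCalendars, requireUtc, toUtc) are hypothetical 
and not taken from the patch. Note that toUtc swaps in a different technique 
from adding ZONE_OFFSET by hand: it copies the instant into a fresh UTC 
calendar, which also accounts for any DST offset.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the LUCENE-5648 patch.
final class UtcCalendars {
    static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Option 1: fail if the calendar's total offset from UTC is non-zero
    // at the instant it represents.
    static void requireUtc(Calendar cal) {
        int offsetMillis = cal.get(Calendar.ZONE_OFFSET) + cal.get(Calendar.DST_OFFSET);
        if (offsetMillis != 0) {
            throw new IllegalArgumentException(
                "Expected a UTC calendar but offset is " + offsetMillis + " ms");
        }
    }

    // Option 2: return a UTC calendar representing the same instant.
    static Calendar toUtc(Calendar cal) {
        Calendar utc = Calendar.getInstance(UTC, Locale.ROOT);
        utc.setTimeInMillis(cal.getTimeInMillis());
        return utc;
    }
}
```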

Some possible refactorings or performance improvements I see:
* Optimize Calendar formatting -- don't use SimpleDateFormat.
* Break out NRCell as a separate class (package access) to keep the line count 
more manageable.
* Use a private inner class for the shared state between NRCell instances in 
the same stack (currently 3 fields: the stack, BytesRef, and something else)
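
For the first refactoring item, formatting a Calendar without SimpleDateFormat 
can be done by appending zero-padded fields to a StringBuilder. This is only a 
sketch under stated assumptions: the names CalendarFormat, toIsoString and 
appendPadded are hypothetical, the output pattern (yyyy-MM-dd'T'HH:mm:ss.SSS'Z') 
is assumed rather than taken from the patch, and the calendar is assumed to be 
in UTC with a non-negative AD year.

```java
import java.util.Calendar;

// Hypothetical helper, not part of the LUCENE-5648 patch.
final class CalendarFormat {
    // Formats a UTC calendar as yyyy-MM-dd'T'HH:mm:ss.SSS'Z' without
    // allocating a SimpleDateFormat.
    static String toIsoString(Calendar cal) {
        StringBuilder sb = new StringBuilder(24);
        appendPadded(sb, cal.get(Calendar.YEAR), 4).append('-');
        appendPadded(sb, cal.get(Calendar.MONTH) + 1, 2).append('-'); // MONTH is 0-based
        appendPadded(sb, cal.get(Calendar.DAY_OF_MONTH), 2).append('T');
        appendPadded(sb, cal.get(Calendar.HOUR_OF_DAY), 2).append(':');
        appendPadded(sb, cal.get(Calendar.MINUTE), 2).append(':');
        appendPadded(sb, cal.get(Calendar.SECOND), 2).append('.');
        appendPadded(sb, cal.get(Calendar.MILLISECOND), 3).append('Z');
        return sb.toString();
    }

    // Appends 'value' left-padded with zeros to at least 'width' digits.
    static StringBuilder appendPadded(StringBuilder sb, int value, int width) {
        String digits = Integer.toString(value);
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits);
    }
}
```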

> Index/search multi-valued time durations
> ----------------------------------------
>
>                 Key: LUCENE-5648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5648
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>         Attachments: LUCENE-5648.patch, LUCENE-5648.patch, LUCENE-5648.patch, 
> LUCENE-5648.patch
>
>
> If you need to index a date/time duration, then the way to do that is to have 
> a pair of date fields, one for the start and one for the end -- pretty 
> straightforward. But if you need to index a variable number of durations per 
> document, then the options aren't pretty, ranging from denormalization, to 
> joins, to using Lucene spatial with 2D as described 
> [here|http://wiki.apache.org/solr/SpatialForTimeDurations].  Ideally it would 
> be easier to index durations, and it would work more efficiently.
> This issue implements the aforementioned feature using Lucene-spatial with a 
> new single-dimensional SpatialPrefixTree implementation. Unlike the other two 
> SPT implementations, it's not based on floating point numbers. It will have a 
> Date based customization that indexes levels at meaningful quantities like 
> seconds, minutes, hours, etc.  The point of that alignment is to make it 
> faster to query across meaningful ranges (e.g. [2000 TO 2014]) and to enable 
> a follow-on issue to facet on the data in a really fast way.
> I expect to have a working patch up this week.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
