[ 
https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225259#comment-13225259
 ] 

John Vines commented on ACCUMULO-452:
-------------------------------------

This feature by itself is nice, but it doesn't really do a whole lot without
supporting expressions (https://issues.apache.org/jira/browse/ACCUMULO-164).
But once you start allowing expressions, you then need to support orders of
locality, because data can only exist in a single locality group. This can
increase complexity for the user, or at the very least will make the API more
kludgy.

I'm all for making things pluggable, but we need to design it so that things
are not easily broken by the user. This means either forcing the interface to
handle only one locality group at a time or redesigning RFile to allow writing
to different locality groups. We also need to make sure we can optimize scans
of data that pre-dates the locality group. Right now, it's really nice that
data can only belong to one locality group. With this change, all data must be
checked to see if it belongs in the new locality group, which will be a big
performance hit. Perhaps we should encourage a major compaction (majc) after
applying a new locality group.
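
To make the cost concrete, here is a rough sketch of the scan-side check
described above. It is not Accumulo code: the file layout (a dict with a
"config" version and per-group entry lists), scan_group, major_compact and
belongs_to are all hypothetical names, and Python is used only for brevity.
The assumption is that a file written under the current group definition can
be read group-wise, while an older file has to be filtered entry by entry.

```python
def scan_group(files, wanted, belongs_to, current_config):
    """Return (matching entries, number of entries examined).

    Hypothetical model: each file is {"config": int, "groups": {name: [entry]}}.
    """
    results, examined = [], 0
    for f in files:
        if f["config"] == current_config:
            # File was written under the current definition: read one group.
            candidates = f["groups"].get(wanted, [])
            examined += len(candidates)
            results.extend(candidates)
        else:
            # File pre-dates the definition: every entry must be checked.
            for entries in f["groups"].values():
                for entry in entries:
                    examined += 1
                    if belongs_to(entry) == wanted:
                        results.append(entry)
    return results, examined


def major_compact(f, belongs_to, current_config):
    """Rewrite a file so its groups match the current definition."""
    groups = {}
    for entries in f["groups"].values():
        for entry in entries:
            groups.setdefault(belongs_to(entry), []).append(entry)
    return {"config": current_config, "groups": groups}
```

In this model, compacting the old files first means a later scan of the new
group touches only that group's entries, which is the motivation for
encouraging a majc after the group is defined.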
                
> Generalize locality groups
> --------------------------
>
>                 Key: ACCUMULO-452
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-452
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
>
>
> Locality groups are a neat feature, but there is no reason to limit
> partitioning to column families.  Data could be partitioned based on any
> criteria.  For example, if a user is interested in querying recent data and
> ageing off old data, partitioning locality groups by timestamp would be
> useful.  This could be accomplished by letting users specify a partitioner
> plugin that is used at compaction and scan time.  Scans would need an ability
> to pass options to the partitioner.
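
The partitioner idea in the description could look roughly like the sketch
below. This is only an illustration, not an Accumulo API: Entry,
TimestampPartitioner, group_for, groups_for_scan, compact, scan and the
"recent_only" option are all made-up names, and Python is used for brevity.
It assumes each entry lands in exactly one group at compaction time and that
scan options select which groups get read.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    row: str
    family: str
    timestamp: int  # milliseconds since the epoch
    value: str


class TimestampPartitioner:
    """Hypothetical plugin: places every entry in exactly one group."""

    def __init__(self, cutoff_ms: int):
        self.cutoff_ms = cutoff_ms

    def group_for(self, entry: Entry) -> str:
        # Used at compaction time to choose the group an entry is written to.
        return "recent" if entry.timestamp >= self.cutoff_ms else "old"

    def groups_for_scan(self, options: dict) -> set:
        # Used at scan time: options passed by the scan pick the groups to read.
        return {"recent"} if options.get("recent_only") else {"recent", "old"}


def compact(entries, partitioner):
    """Write sorted entries into per-group lists, as a compaction might."""
    groups = defaultdict(list)
    for entry in sorted(entries):
        groups[partitioner.group_for(entry)].append(entry)
    return dict(groups)


def scan(file_groups, partitioner, options):
    """Read only the groups the partitioner selects for these options."""
    selected = []
    for name in partitioner.groups_for_scan(options):
        selected.extend(file_groups.get(name, []))
    return sorted(selected)
```

A scan that passes {"recent_only": True} would then skip the "old" group
entirely, which is the benefit the description is after for recent-data
queries.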


        
