[ 
https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225204#comment-13225204
 ] 

Aaron Cordova commented on ACCUMULO-452:
----------------------------------------

I know this might sound crazy, but 'more general' doesn't always mean better. 
There is a cost to generality, and it is complexity. For example, several people 
have read the MapReduce paper and said 'oh, you can easily create a more general 
computation framework than that, one that includes message passing, etc.' Those 
people miss the point of why MapReduce is so widely adopted and why more 
general systems like MPI are not: its simplicity.

In this case, the cost is that the user now has to decide at what level to 
group data to get the locality they desire, and pass options where they 
normally might not. Users already have the ability to use the column family to 
store whatever data they want, knowing that the data they store in column 
families can be used for physical partitioning.

So I suppose I'm looking for more justification before adding more complexity 
to an admittedly already more general/complex implementation of BigTable.
                
> Generalize locality groups
> --------------------------
>
>                 Key: ACCUMULO-452
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-452
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Keith Turner
>             Fix For: 1.5.0
>
>
> Locality groups are a neat feature, but there is no reason to limit 
> partitioning to column families.  Data could be partitioned based on any 
> criteria.  For example, if a user is interested in querying recent data and 
> ageing off old data, partitioning locality groups based on timestamp would be 
> useful.  This could be accomplished by letting users specify a partitioner 
> plugin that is used at compaction and scan time.  Scans would need the ability 
> to pass options to the partitioner.


        
