[
https://issues.apache.org/jira/browse/ACCUMULO-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225521#comment-13225521
]
Aaron Cordova commented on ACCUMULO-452:
----------------------------------------
Whether users rely on server-provided timestamps or supply their own, the
timestamp still falls after the row and column, and is used the same way: to
limit values after the rows and columns have been identified.
To me it seems as if something like this happened, told as a little play:
BigTable Guys: look you can physically partition your data automatically using
the rows!
Users: Great! That works, but maybe I want an additional, secondary
partitioning?
BG: hmm. ok, how about you can also partition on the column family? It's the
next item in the hierarchy, doesn't add too much complexity, pretty
straightforward. Just assign them to groups called locality groups and I
think we can keep this under control.
Users: Yay! You guys rock!
BG: You're welcome.
Other users: Hey, locality groups are cool, but can I partition on column
qualifiers?
BG: why are rows and column families insufficient?
OU: well, I don't know, I just really like to slice things every way possible.
BG: sigh ..
Yet other users: Wait, what about timestamps? You know what's more general than
partitioning on a few elements of the data model? Partitioning on ALL the
elements of the data model! So sweet. More general means more better!
BG: I'm quitting to go work at Facebook.
> Generalize locality groups
> --------------------------
>
> Key: ACCUMULO-452
> URL: https://issues.apache.org/jira/browse/ACCUMULO-452
> Project: Accumulo
> Issue Type: New Feature
> Reporter: Keith Turner
> Fix For: 1.5.0
>
> Attachments: PartitionerDesign.txt
>
>
> Locality groups are a neat feature, but there is no reason to limit
> partitioning to column families. Data could be partitioned based on any
> criteria. For example, if a user is interested in querying recent data and
> aging off old data, partitioning locality groups based on timestamp would be
> useful. This could be accomplished by letting users specify a partitioner
> plugin that is used at compaction and scan time. Scans would need the ability
> to pass options to the partitioner.
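
To make the proposal above concrete, here is a minimal sketch of what a
timestamp-based partitioner plugin might look like. Everything in it is an
assumption made for illustration: `Cell`, `Partitioner`,
`TimestampPartitioner`, `PartitionerSketch`, and the `recentOnly` scan option
are hypothetical names, not Accumulo APIs, and the sketch is not taken from the
attached PartitionerDesign.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a key/value entry; not an Accumulo class.
final class Cell {
    final String row, family, qualifier, value;
    final long timestamp;

    Cell(String row, String family, String qualifier, long timestamp, String value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.value = value;
    }
}

// Hypothetical plugin interface: consulted at compaction time and at scan time.
interface Partitioner {
    // Compaction time: name the group an entry should be written to.
    String groupFor(Cell cell);

    // Scan time: decide from the scan's options whether a group must be read.
    boolean groupMayMatch(String group, Map<String, String> scanOptions);
}

// Example plugin: splits data into "recent" and "old" around a fixed cutoff.
final class TimestampPartitioner implements Partitioner {
    private final long cutoff;

    TimestampPartitioner(long cutoff) {
        this.cutoff = cutoff;
    }

    @Override
    public String groupFor(Cell cell) {
        return cell.timestamp >= cutoff ? "recent" : "old";
    }

    @Override
    public boolean groupMayMatch(String group, Map<String, String> scanOptions) {
        // "recentOnly" is a made-up option a scan could pass to skip old data.
        boolean recentOnly =
            Boolean.parseBoolean(scanOptions.getOrDefault("recentOnly", "false"));
        return !recentOnly || group.equals("recent");
    }
}

final class PartitionerSketch {
    // Compaction-time step: route every entry to the group the plugin picks.
    static Map<String, List<Cell>> compact(List<Cell> cells, Partitioner p) {
        Map<String, List<Cell>> groups = new LinkedHashMap<>();
        for (Cell c : cells) {
            groups.computeIfAbsent(p.groupFor(c), k -> new ArrayList<>()).add(c);
        }
        return groups;
    }

    // Scan-time step: read only the groups the plugin says may match.
    static List<Cell> scan(Map<String, List<Cell>> groups, Partitioner p,
                           Map<String, String> scanOptions) {
        List<Cell> out = new ArrayList<>();
        for (Map.Entry<String, List<Cell>> g : groups.entrySet()) {
            if (p.groupMayMatch(g.getKey(), scanOptions)) {
                out.addAll(g.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Partitioner p = new TimestampPartitioner(100L);
        List<Cell> cells = Arrays.asList(
            new Cell("r1", "f", "q", 50L, "a"),
            new Cell("r2", "f", "q", 150L, "b"));
        Map<String, List<Cell>> groups = compact(cells, p);
        List<Cell> recent =
            scan(groups, p, Collections.singletonMap("recentOnly", "true"));
        System.out.println(recent.size() + " recent entry, row " + recent.get(0).row);
    }
}
```

In this sketch a scan that passes `recentOnly=true` never touches the "old"
group, which is the kind of benefit the description has in mind; how a real
implementation would name, store, and age off such groups is left open here.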