If it's possible to make the timestamp a suffix of your rowkey (assuming the
rowkey is composite), then you would not run into read/write hotspots.
Have a look at the OpenTSDB data model, which scales really well.
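
To make the suffix idea concrete, here is a minimal sketch in plain Java with no
HBase dependency. The layout (series id, a zero separator byte, then an 8-byte
big-endian timestamp) and the names CompositeRowKey, rowKey and compare are
assumptions for illustration only; this is not OpenTSDB's actual schema. The
compare method mimics the unsigned byte-wise ordering HBase uses for row keys.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CompositeRowKey {
    // Hypothetical layout: <series id bytes><0x00 separator><8-byte big-endian timestamp>.
    // The leading series id spreads writes across the key space; the timestamp
    // suffix keeps each series' rows in chronological order.
    static byte[] rowKey(String seriesId, long timestampMillis) {
        byte[] id = seriesId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(id.length + 1 + Long.BYTES)
                .put(id)
                .put((byte) 0)
                .putLong(timestampMillis) // ByteBuffer is big-endian by default
                .array();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey("sensor-a", 1000L);
        byte[] newer = rowKey("sensor-a", 2000L);
        System.out.println(compare(older, newer) < 0); // older row sorts first
    }
}
```

With this layout, all rows for one series are contiguous, so a scan bounded by
the series prefix plus a start and stop timestamp only touches that series.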
> On Feb 21, 2016, at 10:28 AM, Stephen Durfey
I personally don't deal with time series data, so I'm not going to make a
statement on which is better. I would think from a scanning viewpoint putting
the time stamp in the row key is easier, but that will introduce scanning
performance bottlenecks due to the row keys being stored
Thanks for sharing, Stephen and Ted. The reference guide recommends "rows"
over "versions" concerning time series data. Are there advantages of using
"reversed timestamps" in row keys over the built-in "versions" with regard to
scanning performance?
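
For readers unfamiliar with the term, a reversed timestamp is usually computed
as Long.MAX_VALUE - timestamp, so that the newest row for a series sorts first.
Below is a rough sketch in plain Java; the class name ReversedTimestampKey, the
key layout and the helper names are assumptions for illustration, not an HBase
API. It only shows the resulting sort order and says nothing about how this
compares with "versions" in scan cost.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ReversedTimestampKey {
    // Larger (newer) timestamps map to smaller values, so they sort first.
    static long reverse(long timestampMillis) {
        return Long.MAX_VALUE - timestampMillis;
    }

    // Hypothetical layout: <series id bytes><0x00 separator><8-byte reversed timestamp>.
    static byte[] rowKey(String seriesId, long timestampMillis) {
        byte[] id = seriesId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(id.length + 1 + Long.BYTES)
                .put(id)
                .put((byte) 0)
                .putLong(reverse(timestampMillis))
                .array();
    }

    // Unsigned lexicographic comparison, mimicking row key ordering.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey("sensor-a", 1000L);
        byte[] newer = rowKey("sensor-a", 2000L);
        System.out.println(compare(newer, older) < 0); // newer row sorts first
    }
}
```

Under these assumptions, a scan that starts at the series prefix returns the
most recent row first, so fetching the latest value can stop after one row.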
-- Original
Thanks for sharing, Stephen.
bq. scan performance on the region servers needing to scan over all that
data you may not need
When the number of versions is large, try to utilize Filters (where
appropriate) which implement:
public Cell getNextCellHint(Cell currentKV) {
See MultiRowRangeFilter for
Someone please correct me if I am wrong.
I've looked into this recently due to some performance reasons with my tables
in a production environment. Like the book says, I don't recommend keeping
this many versions around unless you really need them. Telling HBase to keep
around a very large
Hi, I have two questions about the maximum number of versions of a column
family:
(1) Is it OK to set a very large (>100,000) maximum number of versions for a
column family?
The reference guide says "It is not recommended setting the number of max
versions to an exceedingly high level (e.g.,