I pushed https://issues.apache.org/jira/browse/ACCUMULO-3946 tonight
(thanks again, James).
The merge to 1.6 and onwards wasn't easy, so I would appreciate it if
someone could spot-check this for me.
I _think_ this (and the previous audit issue) were the only things we
wanted to get into 1.5? A
That being said, what is the use case that you feel you need a NoSQL solution
for?
On Aug 19, 2015 6:54 PM, "Ted Malaska" wrote:
> I'm on the side of benchmarking for the use case and with an expert.
> There are so many ways to cheat a benchmark. And the benchmark may not be
> anything like your use
I'm on the side of benchmarking for the use case and with an expert. There
are so many ways to cheat a benchmark. And the benchmark may not be
anything like your use case.
On Aug 19, 2015 5:43 PM, "Andrew Purtell" wrote:
> I think someone who uses third party benchmarks to assess a system like
>
Ah right, I did forget about that paper. Thanks for clarifying.
Big +1 to Andy's comments, too.
Jeremy Kepner wrote:
Turning off the walog was mostly to shorten the benchmarking cycle
(it allowed us to go from zero to peak ingest in a few seconds). BAH got
pretty much the same performance resu
I think someone who uses third party benchmarks to assess a system like
HBase or Accumulo (or Cassandra...) is taking a foolish shortcut, so
perhaps we must agree to disagree.
On Wed, Aug 19, 2015 at 2:34 PM, Jeremy Kepner wrote:
> I agree, that performance on real apps is the most important fo
I agree that performance on real apps is the most important for
any particular organization, but as technologists how do we measure ourselves?
Hence imperfect benchmarking remains our only recourse.
On Wed, Aug 19, 2015 at 12:34:44PM -0700, Andrew Purtell wrote:
> I can't speak for anyone other t
Turning off the walog was mostly to shorten the benchmarking cycle
(it allowed us to go from zero to peak ingest in a few seconds). BAH got
pretty much the same performance results in their paper,
it just took longer for their experiments to run.
So, in this case, we had two different teams doing
HBase region splits can be done through a variety of strategies. Data size
can be a component in those strategies. There's no hard and fast rule of
how large a region can be. There are some tradeoffs with larger or smaller
region sizes. A region split strategy will depend upon a number of factors.
Me
I can't speak for anyone other than myself in the HBase community, but I'm
much more interested and focused on performance analysis and
developing/deploying for the use cases of my employer than participating in
generic bench-marketing to make weapons for happy OSS warriors. Perhaps
this does a dis
Alright, I have to ask... are you referring to the paper that cites
Accumulo performance without write-ahead logs enabled? I have some
serious reservations about the relevance of that paper to this
conversation and just want to make sure people aren't led astray by what
the actual takeaway shou
Forgive my ignorance about HBase, but wouldn't size of records count,
also? Your response seems to imply that number of records is what
matters for how many regions are needed. For what it's worth,
Accumulo's tablets are split based on storage size, not number of
records. I assumed the same was tru
A big difference between Accumulo and HBase is the published performance
numbers.
The Accumulo community has done a good job of continuing to publish
up-to-date performance numbers in peer-reviewed venues which allow Accumulo
to claim best in the world performance.
The HBase community hasn't be
Sorry, typo.
So there might be issues when you pass a quadrillion. But like I said, I
never ran into that issue of region limits.
On Wed, Aug 19, 2015 at 2:29 PM, Ted Malaska wrote:
> Sorry, 10 billion a day, so that is 7 trillion records. So maybe issues
> around 1000 trillion
>
> On Wed, Aug 19
Yeah, if you have more than a quadrillion records in your design, let me know.
I would love to help out.
Ted Malaska
On Wed, Aug 19, 2015 at 2:30 PM, Josh Elser wrote:
> Like I've said many times now, it's relative to your actual problem. If
> you don't have that much data (or intend to grow into t
He didn't ask just about security, FWIW
"I am looking for real gap comparing HBase to Accumulo if there is any
so that I can be prepared to address them. This is not limited to the
security area."
Sean Busbey wrote:
Let's please stick to the topic Jerry asked about: security features.
We ca
"I am looking for real gap comparing HBase to Accumulo if there is any so that
I can be prepared to address them. This is not limited to the security area.
There are differences in some features and implementations. But they don't seem
like real 'gaps'."
He asked about gaps, but not feature and
If you drew a Venn diagram of HBase features compared to Accumulo features,
it's pretty much going to be a single circle.
If you want performance anecdotes, the most succinct summary I've seen is
that Accumulo can handle heavier write loads whereas HBase will handle
heavier read loads. From these
Let's please stick to the topic Jerry asked about: security features.
We can get into all sorts of discussions around scalability and read/write
performance in a different joint thread if folks want. We all have lots of
Opinions (and the YCSB community would love to see more of y'all show up to
im
I've been doing HBase for a long time and never had an issue with region
count limits and I have clusters with 10s of billions of records. Maybe
there would be issues around a couple trillion records, but never got that
high yet.
Ted Malaska
On Wed, Aug 19, 2015 at 2:24 PM, Josh Elser wrote:
>
Sorry, 10 billion a day, so that is 7 trillion records. So maybe issues
around 1000 trillion
On Wed, Aug 19, 2015 at 2:28 PM, Ted Malaska wrote:
> I've been doing HBase for a long time and never had an issue with region
> count limits and I have clusters with 10s of billions of records. Maybe
> th
Like I've said many times now, it's relative to your actual problem. If
you don't have that much data (or intend to grow into that much data),
it's not an issue. Obviously, this is the case for you.
However, it is an architectural difference between the two projects with
known limitations for
Oh, one other thing that I should mention (was prompted off-list).
(definition time since cross-list now: HBase regions == Accumulo tablets)
Accumulo will handle many more regions than HBase does now due to a
splittable metadata table. While I was told this was a very long and
arduous journey
+dev@accumulo (Though Josh and I are on this list, some other folks on
dev@accumulo might have opinions)
Hi Jerry!
Do you have constraints on which version(s) of HBase and Accumulo you're
comparing?
Are you looking for currently shipping or for some expected future date?
In very broad strokes: