No, I never asserted that Snappy is *always* the better choice. I would say that I believe Snappy is better in *most cases*.

Most users I talk to (with and without Accumulo involved) have plenty of disk space available to them. It is rare that space on disk is actually a concern. Instead, performance is usually the primary metric of concern. To be crystal clear, this is only my impression of the users I've talked to, not an assertion about everyone.

I do not believe I need a better argument than "on average, we can make out of the box performance better for most users". I suppose we'll have to disagree on that point. Thanks for clarifying your opinions on the topic.

Adam Fuchs wrote:
If the crux of your argument was that snappy is always a better choice,
then my retort was to say it is not, since sometimes compression ratio can
be a dominant factor. Changes to defaults are disruptive for existing
users, so you need a better argument. I don't mean that you shouldn't
continue to debate the merits. By all means, do continue the conversation.

Adam

On Aug 13, 2016 8:39 PM, "Josh Elser" <josh.el...@gmail.com> wrote:
Your argument fails to address the performance benefits. I could pose the
same question back to you: you need to prove why we shouldn't use the
faster compression algorithm.

I don't mean to be snarky, but your argument is shutting down
conversation. I appreciate you sharing the opinion but don't feel like
it's encouraging discussion.

On Aug 13, 2016 11:18 PM, "Adam Fuchs" <afu...@apache.org> wrote:

In my experience gz gets roughly 1.5x to 2x better compression than
snappy. Snappy is definitely not a Pareto improvement (although we tend
to use snappy by default). Since it's not always better I think you
would need a more solid argument to change the default.

Adam

On Aug 13, 2016 8:06 PM, "Josh Elser" <josh.el...@gmail.com> wrote:

Same motivation of using it as for making it the default. I am not aware
of any downside to it. It's become pretty standard across all
installations I've worked with for years.

Asking because I am no oracle on the matter. I could just be ignorant of
some issue, but, given my current understanding, there is no downside
for the average case.

Christopher wrote:

Sorry. I wasn't clear. I understand the motivation for using it... I'm
asking about the motivation for making it the default.

Since both are available, I'm not sure the default matters *that* much,
but it could be an unexpected change for those preferring GZ.

Also, are there any risks regarding library availability of snappy? GZ
is pretty ubiquitous.

On Sat, Aug 13, 2016 at 10:59 PM Josh Elser <josh.el...@gmail.com> wrote:
Uhh, besides what I already mentioned? (close in compressed size but
"much" faster)

Christopher wrote:

What's the motivation for changing it?

On Sat, Aug 13, 2016 at 10:47 PM Josh Elser <josh.el...@gmail.com> wrote:

Any reason we don't want to do this? Last rule-of-thumb I heard was that
snappy is often close enough in compression to GZ but quite a bit faster
(I don't remember exactly how much).

- Josh


