Re: WG: Choosing a Hadoop distribution

Steve Loughran Mon, 24 Sep 2012 12:32:07 -0700

On 24 September 2012 14:41, Marcos Ortiz <mlor...@uci.cu> wrote:

>
> On 09/24/2012 06:29 AM, Christian Schäfer wrote:
>
>> I think a good starting point for that distribution guide would be a
>> feature matrix where all reasonable distributions could be compared.
>>
> +1 for this idea
> I think that this feature matrix will be on the Hadoop wiki.
>
>
That gets too controversial.


I wouldn't be completely dismissive of Apache Hadoop 1.0.3; it went through
large-cluster QA by the QA team at Hortonworks (disclaimer: my colleagues);
the 1.x branch is going to be long-lived and is in use in production.


>
>>
>> There could be metrics for cross-cutting concerns like performance,
>> security, etc., referring to real benchmarks.
>> From this one could derive (maybe with additional explanations) which
>> distribution best fits a certain use case.
>>
> Umm, this is tricky. How can we decide which is the best fit for a certain
> type of problem?
> My suggestion is to avoid this, because it will bring some heated
> discussions and that's not the idea.
> It's my personal opinion.
>

What would be good would be more traces of real-world cluster use, stuff
that can be fed into the gridmix 3 benchmarker [
http://developer.yahoo.com/blogs/hadoop/posts/2010/04/gridmix3_emulating_production/].
If your workload gets pulled into the performance tests used by the
Hadoop development teams.
