Re: Spark vs Tez

2014-10-18 Thread Jeff Zhang
Tez has a feature called pre-warm, which launches JVMs before you use them, and you can reuse the containers afterwards. So it is also suitable for interactive queries, and IMO it is more stable and scalable than Spark. On Sat, Oct 18, 2014 at 4:22 PM, Niels Basjes wrote: > It is my understanding that

Re: hadoop 2.4 using Protobuf - How does downgrade back to 2.3 works ?

2014-10-18 Thread Harsh J
Until you issue a finalize command, the pre-upgrade metadata is kept aside for rolling back. When you issue the rollback command, it puts the pre-upgrade metadata files back in place. On Fri, Oct 17, 2014 at 5:45 AM, Manoj Samel wrote: > Hadoop 2.4.0 mentions that FSImage is stored using protobuf. So upgrade

Re: Spark vs Tez

2014-10-18 Thread Russell Jurney
Check out PySpark. No Scala required. On Friday, October 17, 2014, Adaryl "Bob" Wakefield, MBA < adaryl.wakefi...@hotmail.com> wrote: > “The only problem with Spark adoption is the steep learning curve of > Scala , and understanding the API properly.” > > This is why I’m looking for reasons to

Re: Spark vs Tez

2014-10-18 Thread Mohan Radhakrishnan
I remember Spark uses Akka clusters. Isn't that totally different from other distributed technologies? Thanks, Mohan On Sat, Oct 18, 2014 at 1:52 PM, Niels Basjes wrote: > It is my understanding that one of the big differences between Tez and > Spark is that a Tez-based query still has the

Re: Spark vs Tez

2014-10-18 Thread Niels Basjes
It is my understanding that one of the big differences between Tez and Spark is that a Tez-based query still has the startup overhead of starting JVMs on the YARN cluster. Spark-based queries are executed immediately on "already running JVMs". So for interactive dashboards Spark seems more suit