Tez has a feature called pre-warm, which launches JVMs before you need
them, and you can reuse the containers afterwards. So it is also suitable
for interactive queries, and IMO it is more stable and scalable than Spark.
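To make the pre-warm and reuse idea concrete, here is a rough sketch in plain Python. It is only a toy model and not Tez or YARN code: the names Container, run_cold and WarmPool are made up for illustration, and "launching a JVM" is reduced to bumping a counter so the difference in start-ups is easy to see.

```python
class Container:
    """Toy stand-in for a container running a JVM (hypothetical, not a Tez API)."""

    launches = 0  # how many times a "JVM" has been started

    def __init__(self):
        Container.launches += 1

    def run(self, task):
        # Execute the task inside this already-started container.
        return task()


def run_cold(tasks):
    """Start a fresh container for every task: one launch per task."""
    return [Container().run(task) for task in tasks]


class WarmPool:
    """Start containers up front (pre-warm) and reuse them for later tasks."""

    def __init__(self, size):
        self.containers = [Container() for _ in range(size)]

    def run_all(self, tasks):
        # Round-robin over the containers that are already running.
        n = len(self.containers)
        return [self.containers[i % n].run(task) for i, task in enumerate(tasks)]
```

With six tasks, run_cold pays for six launches, while WarmPool(2) pays for two launches when it is created and none when the tasks arrive. How much that saves in practice depends on the real JVM start-up cost, which this sketch does not measure.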
On Sat, Oct 18, 2014 at 4:22 PM, Niels Basjes wrote:
> It is my understanding that
Until you issue a finalize command, the pre-upgrade metadata is kept
aside for rolling back. When you issue the rollback command, it
restores the previous metadata files.
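As a rough illustration of that lifecycle, here is a toy Python model. It is not HDFS code: the class NameNodeMetadata and its methods are hypothetical names, the metadata is just a string, and refusing a second upgrade before finalize or rollback is a choice made for this sketch.

```python
class NameNodeMetadata:
    """Toy model of upgrade / finalize / rollback (hypothetical, not an HDFS API)."""

    def __init__(self, current):
        self.current = current
        self.previous = None  # pre-upgrade copy, kept aside until finalize

    def upgrade(self, new_layout):
        # Sketch assumption: only one un-finalized upgrade at a time.
        if self.previous is not None:
            raise RuntimeError("finalize or roll back the earlier upgrade first")
        self.previous = self.current
        self.current = new_layout

    def finalize(self):
        # Drop the pre-upgrade copy; rollback is no longer possible.
        self.previous = None

    def rollback(self):
        if self.previous is None:
            raise RuntimeError("no pre-upgrade metadata left to roll back to")
        # Put the pre-upgrade metadata back in place.
        self.current = self.previous
        self.previous = None
```

The point of the sketch is only that rollback works while the old copy is still around, and stops working once finalize has discarded it.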
On Fri, Oct 17, 2014 at 5:45 AM, Manoj Samel wrote:
> Hadoop 2.4.0 mentions that FSImage is stored using protobuf. So upgrade
Check out PySpark. No Scala required.
On Friday, October 17, 2014, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefi...@hotmail.com> wrote:
> “The only problem with Spark adoption is the steep learning curve of
> Scala, and understanding the API properly.”
>
> This is why I’m looking for reasons to
I remember Spark uses Akka clusters. Isn't that totally different from
other distributed technologies?
Thanks,
Mohan
On Sat, Oct 18, 2014 at 1:52 PM, Niels Basjes wrote:
> It is my understanding that one of the big differences between Tez and
> Spark is that a Tez-based query still has the
It is my understanding that one of the big differences between Tez and
Spark is that a Tez-based query still has the startup overhead of
starting JVMs on the YARN cluster. Spark-based queries are immediately
executed on "already running JVMs".
So for interactive dashboards Spark seems more suit