It is my understanding that one of the big differences between Tez and
Spark is that a Tez-based query still has the startup overhead of
starting JVMs on the YARN cluster. Spark-based queries are executed
immediately on already running JVMs.
So for interactive dashboards Spark seems more
I remember Spark uses Akka clusters. Isn't that totally different from
other distributed technologies?
Thanks,
Mohan
On Sat, Oct 18, 2014 at 1:52 PM, Niels Basjes ni...@basjes.nl wrote:
It is my understanding that one of the big differences between Tez and
Spark is that a Tez-based query
Until you issue a finalize command, the pre-upgrade metadata is kept
aside for rolling back. When you issue the rollback command, it
restores those pre-upgrade metadata files.
On Fri, Oct 17, 2014 at 5:45 AM, Manoj Samel manojsamelt...@gmail.com wrote:
Hadoop 2.4.0 mentions that FSImage is stored
Tez has a feature called pre-warm, which launches JVMs before you use them,
and you can reuse the containers afterwards. So it is also suitable for
interactive queries, and is more stable and scalable than Spark IMO.
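As a rough sketch only, and assuming the queries are run through Hive on
Tez (the thread does not say which front end is in use), pre-warm and
container reuse are usually switched on with settings along these lines.
The container count is an illustrative value, not a recommendation:

    # Assumption: Hive is the query front end and Tez is its engine
    hive.execution.engine=tez
    # Ask Hive to pre-launch Tez containers when a session starts
    hive.prewarm.enabled=true
    # Illustrative number of containers to pre-warm; tune per cluster
    hive.prewarm.numcontainers=10
    # Let the Tez application master reuse containers across tasks
    tez.am.container.reuse.enabled=true

Pre-warmed containers hold cluster resources while idle, so this trades
capacity for lower latency on the first query.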
On Sat, Oct 18, 2014 at 4:22 PM, Niels Basjes ni...@basjes.nl wrote:
It is my