Is Tez's architecture similar to Akka's distributed architecture? I think I remember Jonas Bonér mentioning, during a presentation on distributed computing, Akka's support for protocols like Raft. What makes Tez more scalable in this regard?

Thanks,
Mohan

On Sun, Oct 19, 2014 at 5:26 PM, Niels Basjes <ni...@basjes.nl> wrote:

> Very interesting!
> What makes Tez more scalable than Spark?
> What architectural "thing" makes the difference?
>
> Niels Basjes
> On Oct 19, 2014 3:07 AM, "Jeff Zhang" <zjf...@gmail.com> wrote:
>
>> Tez has a feature called pre-warm which will launch JVMs before you use
>> them, and you can reuse the containers afterwards. So it is also suitable
>> for interactive queries and is more stable and scalable than Spark IMO.
>>
>> On Sat, Oct 18, 2014 at 4:22 PM, Niels Basjes <ni...@basjes.nl> wrote:
>>
>>> It is my understanding that one of the big differences between Tez and
>>> Spark is that a Tez based query still has the startup overhead of
>>> starting JVMs on the YARN cluster. Spark based queries are immediately
>>> executed on "already running JVMs".
>>>
>>> So for interactive dashboards Spark seems more suitable.
>>>
>>> Did I understand correctly?
>>>
>>> Niels Basjes
>>> On Oct 17, 2014 8:30 PM, "Gavin Yue" <yue.yuany...@gmail.com> wrote:
>>>
>>>> Spark and Tez both make MR faster; there is no doubt about that.
>>>>
>>>> They also provide new features like DAG, which is quite important for
>>>> interactive query processing. From this perspective, you could view them
>>>> as a wrapper around MR that tries to handle the intermediate buffers
>>>> (files) more efficiently. That is a big pain in MR.
>>>>
>>>> Also, they both try to use memory as the buffer instead of only
>>>> filesystems. Spark has the RDD concept, which is quite interesting but
>>>> also limited.
>>>>
>>>> On Fri, Oct 17, 2014 at 11:23 AM, Adaryl "Bob" Wakefield, MBA
>>>> <adaryl.wakefi...@hotmail.com> wrote:
>>>>
>>>>> It was my understanding that Spark is faster batch processing. Tez
>>>>> is the new execution engine that replaces MapReduce and is also
>>>>> supposed to speed up batch processing. Is that not correct?
>>>>> B.
>>>>>
>>>>> *From:* Shahab Yunus <shahab.yu...@gmail.com>
>>>>> *Sent:* Friday, October 17, 2014 1:12 PM
>>>>> *To:* user@hadoop.apache.org
>>>>> *Subject:* Re: Spark vs Tez
>>>>>
>>>>> What aspects of Tez and Spark are you comparing? They have different
>>>>> purposes and thus are not directly comparable, as far as I understand.
>>>>>
>>>>> Regards,
>>>>> Shahab
>>>>>
>>>>> On Fri, Oct 17, 2014 at 2:06 PM, Adaryl "Bob" Wakefield, MBA
>>>>> <adaryl.wakefi...@hotmail.com> wrote:
>>>>>
>>>>>> Does anybody have any performance figures on how Spark stacks up
>>>>>> against Tez? If you don’t have figures, does anybody have an opinion?
>>>>>> Spark seems so popular but I’m not really seeing why.
>>>>>> B.
>>
>> --
>> Best Regards
>>
>> Jeff Zhang