Matei,

Thank you for the concise explanation.

I use Python and will definitely add my vote of interest to seeing more of Spark's functionality (especially Spark Streaming) exposed via Python. Scala seems like an interesting language to learn, if only to unlock more of Spark's functionality for use. I am a total n00b in general, so I'm still learning about the things that distinguish programming languages from one another (e.g. type inference, lambda expressions, etc.).

Benjamin,

HN does come off as a "Reddit for nerds", and discussions do seem to descend sometimes into "nerd slapfights", as one person put it. :)

Nick

On Thu, May 29, 2014 at 5:19 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:

> Quite a few people ask this question and the answer is pretty simple.
> When we started Spark, we had two goals: we wanted to work with the
> Hadoop ecosystem, which is JVM-based, and we wanted a concise
> programming interface similar to Microsoft’s DryadLINQ (the first
> language-integrated big data framework I know of, that begat things
> like FlumeJava and Crunch). On the JVM, the only language that would
> offer that kind of API was Scala, due to its ability to capture
> functions and ship them across the network. Scala’s static typing also
> made it much easier to control performance compared to, say, Jython or
> Groovy.
>
> In terms of usage, however, we see substantial usage of our other
> languages (Java and Python), and we’re continuing to invest in both.
> In a user survey we did last fall, about 25% of users used Java and
> 30% used Python, and I imagine these numbers are growing. With lambda
> expressions now added to Java 8
> (http://databricks.com/blog/2014/04/14/Spark-with-Java-8.html), I
> think we’ll see a lot more Java. And at Databricks I’ve seen a lot of
> interest in Python, which is very exciting to us in terms of ease of
> use.
>
> Matei
>
> On May 29, 2014, at 1:57 PM, Benjamin Black <b...@b3k.us> wrote:
>
> HN is a cesspool safely ignored.
>
> On Thu, May 29, 2014 at 1:55 PM, Nick Chammas
> <nicholas.cham...@gmail.com> wrote:
>
>> I recently discovered Hacker News and started reading through older
>> posts about Scala
>> <https://hn.algolia.com/?q=scala#!/story/forever/0/scala>. It looks
>> like the language is fairly controversial on there, and it got me
>> thinking.
>>
>> Scala appears to be the preferred language to work with in Spark, and
>> Spark itself is written in Scala, right?
>>
>> I know that often times a successful project evolves gradually out of
>> something small, and that the choice of programming language may not
>> always have been made consciously at the outset.
>>
>> But pretending that it was, why is Scala the preferred language of
>> Spark?
>>
>> Nick
>>
>> ------------------------------
>> View this message in context: Why Scala?
>> <http://apache-spark-user-list.1001560.n3.nabble.com/Why-Scala-tp6536.html>
>> Sent from the Apache Spark User List mailing list archive
>> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.