Tal: I think that, by the nature of the project itself, the Python APIs are developed after the Scala and Java ones, and that is a fair trade-off for getting features to market quickly. The further this discussion progresses, the less of an issue I see in terms of feature parity.
Coming back to performance, Darren raised a good point: if I can scale out, individual VM performance should not matter much. But performance is often stated as a definitive downside of using Python over Scala/Java. I am trying to understand the truth and the myth behind this claim. Any pointers would be great.

best
Ayan

On Fri, Sep 2, 2016 at 4:10 PM, Tal Grynbaum <tal.grynb...@gmail.com> wrote:

>
> On Fri, Sep 2, 2016 at 1:15 AM, darren <dar...@ontrenet.com> wrote:
>
>> This topic is a concern for us as well. In the data science world no one
>> uses native Scala or Java by choice. It's R and Python. And Python is
>> growing. Yet in Spark, Python is 3rd in line for feature support, if at all.
>>
>> This is why we have decoupled from Spark in our project. It's really
>> unfortunate the Spark team have invested so heavily in Scala.
>>
>> As for speed, it comes from horizontal scaling and throughput. When you
>> can scale outward, individual VM performance is less of an issue. Basic HPC
>> principles.
>>
>
> Darren,
>
> My guess is that data scientists who decouple themselves from Spark
> will eventually be left with more or less nothing (single-process
> capabilities, or purely performing HPC), unless some good
> Spark competitor emerges, which is unlikely simply because there is no need
> for one.
> But putting guessing aside, the reason Python is 3rd in line for feature
> support is not that the Spark developers were busy with Scala; it's
> that the missing features are those that support strong typing,
> which is not relevant to Python. In other words, even if Spark were
> rewritten in Python and focused on Python only, you would still not
> get those features.
>
>
>
> --
> *Tal Grynbaum* / *CTO & co-founder*
>
> m# +972-54-7875797
>
> mobile retention done right
>

--
Best Regards,
Ayan Guha