On Fri, Sep 2, 2016 at 1:15 AM, darren <dar...@ontrenet.com> wrote:

> This topic is a concern for us as well. In the data science world no one
> uses native scala or java by choice. It's R and Python. And python is
> growing. Yet in spark, python is 3rd in line for feature support, if at all.
>
> This is why we have decoupled from spark in our project. It's really
> unfortunate that the spark team has invested so heavily in scala.
>
> As for speed, it comes from horizontal scaling and throughput. When you can
> scale outward, individual VM performance is less of an issue. Basic HPC
> principles.
>

Darren,

My guess is that data scientists who decouple themselves from Spark will
eventually be left with more or less nothing (single-process capabilities,
or pure HPC), unless some good Spark competitor emerges. That is unlikely,
simply because there is no need for one.
But putting guessing aside: the reason Python is 3rd in line for feature
support is not that the Spark developers were busy with Scala. It is that
the missing features are those that support strong typing, which is not
relevant to Python. In other words, even if Spark were rewritten in Python
and focused on Python only, you would still not get those features.



-- 
*Tal Grynbaum* / *CTO & co-founder*

m# +972-54-7875797

        mobile retention done right
