> Python, just for the sake of sheer fun to code with it :)
>
> best
> Ayan
>
> On Wed, Sep 6, 2017 at 1:46 PM, Adaryl Wakefield
> <adaryl.wakefi...@hotmail.com> wrote:
>
> Is there any performance difference in writing your application in python
> vs. scala? I’ve resisted learning Python because it’s an interpreted
> scripting language, but the market seems to be demanding Python skills.
>
>
>
> Adaryl "Bob" Wakefield, MBA
> Principal
> Mass Street Analytics
Is there any performance difference in writing your application in python vs.
scala? I’ve resisted learning Python because it’s an interpreted scripting
language, but the market seems to be demanding Python skills.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
91
Sorry, there is not. You can try cloning it from GitHub and building it from
scratch, see [1].

[1] https://github.com/apache/spark

Davies
On Wed, Oct 22, 2014 at 2:31 PM, Marius Soutier wrote:
> Can’t install that on our cluster, but I can try locally. Is there a
> pre-built binary available?
>
On 22.10.2014, at 19:01, Davies Liu wrote:
> In the master, you can easily profile your job and find the bottlenecks,
> see https://github.com/apache/spark/pull/2556
>
> Could you try it and show the s…
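The task profiling added in that pull request builds on Python's standard
cProfile module (later Spark releases expose it through a configuration
setting and a method for printing accumulated profiles; the exact names are
not given in this thread). A stdlib-only sketch of the underlying mechanism,
with an invented workload function standing in for a task body:

```python
import cProfile
import io
import pstats

def decode_records(n):
    # Invented stand-in for a Spark task body: build and format some strings.
    return [str(i).rjust(8, "0") for i in range(n)]

# Profile the workload, then render the top entries sorted by cumulative time,
# which is roughly what a per-task profile report looks like.
pr = cProfile.Profile()
pr.enable()
decode_records(100000)
pr.disable()

out = io.StringIO()
pstats.Stats(pr, stream=out).sort_stats("cumulative").print_stats(5)
report = out.getvalue()
print("decode_records" in report)  # → True
```

The report lists each function with call counts and cumulative time, which is
what makes bottlenecks like slow JSON decoding visible.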
Yeah we’re using Python 2.7.3.
On 22.10.2014, at 20:06, Nicholas Chammas wrote:
On Wed, Oct 22, 2014 at 11:34 AM, Eustache DIEMERT wrote:

> Wild guess maybe, but do you decode the JSON records in Python? It could
> be much slower, as the default lib is quite slow.

Oh yeah, this is a good place to look. Also, just upgrading to Python 2.7
may be enough of a performance improvement.
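The decoding cost Eustache is pointing at sits in the per-record `json.loads`
call. A minimal sketch with invented toy records (a faster third-party
decoder, if one is available, could be swapped in at the `loads` binding,
since the call signature is the same):

```python
import json

# Toy records standing in for the Gzip-compressed JSON lines in the thread.
lines = ['{"user": %d, "value": %d}' % (i, i * 2) for i in range(10000)]

# The stdlib decoder; this binding is the single point where a faster
# drop-in replacement would go.
loads = json.loads

records = [loads(line) for line in lines]
total = sum(r["value"] for r in records)
print(len(records), total)  # → 10000 99990000
```

Because the decode runs once per record, even a modest per-call speedup
multiplies across tens of gigabytes of daily input.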
>>>> […] the JVM. The Python server bit "translates" the Python calls to
>>>> those in the JVM. The Python spark context is like an adapter to the
>>>> JVM spark context. If you're seeing performance discrepancies, this
>>>> might be the reason why. If the code can be organised to require fewer
>>>> interactions with the adapter, that may improve things. Take this with
>>>> a pinch of salt...I might be way off on this :)
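Ashic's "fewer interactions with the adapter" point can be modelled without
Spark at all. In this hypothetical sketch (all names invented), each call
across the adapter stands in for a Python-to-JVM round trip, so one batched
call replaces a thousand chatty ones while producing the same result:

```python
# Hypothetical model of the adapter Ashic describes: each call that crosses
# from Python to the JVM pays a round-trip cost, counted here as a crossing.
class JvmAdapter:
    def __init__(self):
        self.crossings = 0

    def call(self, x):
        self.crossings += 1          # one round trip per element
        return x * 2

    def call_batch(self, xs):
        self.crossings += 1          # one round trip for the whole batch
        return [x * 2 for x in xs]

chatty = JvmAdapter()
per_element = [chatty.call(x) for x in range(1000)]

batched_adapter = JvmAdapter()
batched = batched_adapter.call_batch(list(range(1000)))

assert per_element == batched
print(chatty.crossings, batched_adapter.crossings)  # → 1000 1
```

Same output, three orders of magnitude fewer crossings; the real cost per
crossing (serialization plus socket traffic) is what the model abstracts away.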
Didn’t seem to help:

conf = SparkConf().set("spark.shuffle.spill", "false") \
    .set("spark.default.parallelism", "12")
sc = SparkContext(appName='app_name', conf=conf)

but still taking as much time.
On 22.10.2014, at 14:17, Nicholas Chammas wrote:
> Total guess without knowing anything about you…
[…] Take this with a pinch of salt...I might be way off on this :)
Cheers,
Ashic.
> From: mps....@gmail.com
> Subject: Python vs Scala performance
> Date: Wed, 22 Oct 2014 12:00:41 +0200
> To: user@spark.apache.org
>
Hi there,

we have a small Spark cluster running and are processing around 40 GB of
Gzip-compressed JSON data per day. I have written a couple of word-count-like
Scala jobs that essentially pull in all the data, do some joins, group-bys,
and aggregations. A job takes around 40 minutes to complete.
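The join / group-by / aggregate shape Marius describes can be sketched in
plain Python with invented toy data (events and a user table are stand-ins
for the JSON records; the real jobs run the same shape over RDDs):

```python
from collections import defaultdict

# Invented stand-ins for the daily JSON events and a dimension table.
events = [("u1", "spark"), ("u1", "python"), ("u2", "spark"), ("u2", "spark")]
users = {"u1": "alice", "u2": "bob"}

# Group by user and aggregate: count events per user.
counts = defaultdict(int)
for user, _word in events:
    counts[user] += 1

# Join the aggregated counts with the users table.
report = sorted((users[u], n) for u, n in counts.items())
print(report)  # → [('alice', 2), ('bob', 2)]
```

In Spark each step maps onto a distributed operation, and the shuffles behind
the join and group-by are where most of the 40 minutes is typically spent.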