hour.
We want to keep the labels and the sample IDs around for the next iteration
(N+1), where we want to join with the new sample window to inherit the
labels of samples that existed in the previous iteration (N).
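For illustration, a minimal PySpark sketch of that label-inheriting join
(the variable names and sample values below are hypothetical, just to show the shape):

# Hypothetical sketch: inherit labels from the previous window via a join.
# prev_labeled: (sample_id, label) pairs kept from iteration N
# new_window:   (sample_id, features) pairs from iteration N+1
prev_labeled = sc.parallelize([("s1", "spam"), ("s2", "ham")])
new_window = sc.parallelize([("s2", {"len": 10}), ("s3", {"len": 7})])

# Left outer join: samples seen in iteration N keep their label, new ones get None.
inherited = new_window.leftOuterJoin(prev_labeled) \
    .map(lambda kv: (kv[0], kv[1][0], kv[1][1]))  # (sample_id, features, label-or-None)
print(inherited.collect())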
--
Regards,
Ofer Eliassaf
Anyone? Please? Is this getting any priority?
On Tue, Sep 27, 2016 at 3:38 PM, Ofer Eliassaf
wrote:
> Is there any plan to support PySpark running in "cluster mode" on a
> standalone deployment?
>
> There is this famous survey mentioning that more than 50% of the
back with the JIRA number once I've got it
>> created - will probably take a while before it lands in a Spark release
>> (since 2.1 has already branched) but better debugging information for
>> Python users is certainly important/useful.
>>
>> On Thu, Nov 24, 20
isn't currently exposed in PySpark, but I've been meaning
> to look at exposing at least some of TaskContext for parity in PySpark. Is
> there a particular use case which you want this for? Would help with
> crafting the JIRA :)
>
> Cheers,
>
> Holden :)
>
> On
Hi,
Is there a way to get something like Scala Spark's TaskContext in PySpark,
from code running on an executor?
If not, how can I find my task ID from inside the executors?
Thanks!
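For context, one workaround that already works in PySpark is
mapPartitionsWithIndex, which at least exposes the partition index inside the
executor (a minimal sketch; the RDD and function names are made up):

# Minimal sketch: tag each element with its partition index on the executor.
# Not a full TaskContext, but often enough for logging/debugging.
rdd = sc.parallelize(range(100), 4)

def tag_with_partition(index, iterator):
    for value in iterator:
        yield (index, value)

tagged = rdd.mapPartitionsWithIndex(tag_with_partition)
print(tagged.take(5))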
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/PySpark-TaskContext-t
applications will get the total
number of cores until a new application arrives...
--
Regards,
Ofer Eliassaf
; Just want some ideas.
>
> Thanks,
> Ben
>
>
--
Regards,
Ofer Eliassaf
I start a cluster of 3? SPARK_WORKER_INSTANCES is the only
> way I see to start the standalone cluster, and the only way I see to define
> it is in spark-env.sh. The spark-submit options, SPARK_EXECUTOR_INSTANCES
> and spark.executor.instances, are all related to submitting the job.
>
>
>
> Any ideas?
>
> Thanks
>
> Assaf
>
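For reference, a sketch of the relevant spark-env.sh settings for starting
several workers per machine in standalone mode (the values below are
illustrative only):

# spark-env.sh on each worker machine (illustrative values)
SPARK_WORKER_INSTANCES=3   # worker processes started per machine
SPARK_WORKER_CORES=2       # cores each worker may use
SPARK_WORKER_MEMORY=4g     # memory each worker may use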
--
Regards,
Ofer Eliassaf
I advise you to use Livy for this purpose.
Livy works well with YARN, and it will decouple Spark from your web app.
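As a rough illustration of that decoupling, a web app can hand the job to
Livy over its batch REST API instead of talking to Spark directly (a sketch;
the host, port, and application path are hypothetical):

# Sketch: submit a PySpark application through Livy's /batches endpoint.
import json
import requests

livy_url = "http://livy-host:8998/batches"  # hypothetical Livy endpoint
payload = {"file": "hdfs:///apps/my_job.py", "args": ["2016-09-27"]}

resp = requests.post(livy_url, data=json.dumps(payload),
                     headers={"Content-Type": "application/json"})
print(resp.json())  # batch id and initial state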
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Pyspark-not-working-on-yarn-cluster-mode-tp23755p27799.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
availability in Python Spark.
Currently only the YARN deployment supports it. Bringing in the huge YARN
installation just for this feature is not fun at all.
Does someone have a time estimate for this?
--
Regards,
Ofer Eliassaf
with something that includes a JOIN() it won’t work due to
this same issue.
Maybe worth mentioning in the docs then?
Ofer
> On Mar 23, 2015, at 11:40 AM, Sean Owen wrote:
>
> I think the explanation is that the join does not guarantee any order,
> since it causes a shuffle in general.
# sc is an existing SparkContext (e.g. from the pyspark shell)
data = sc.parallelize('v' + str(x) + ',v' + str(x) for x in range(100))
d1 = data.map(lambda s: s.split(',')[0])
d2 = data.map(lambda s: s.split(',')[1])
x = d1.zip(d2)
print(x.take(10))
The output is:
[('v0', 'v0'), ('v1', 'v1'), ('v2', 'v2'), ('v3', 'v3'), ('v4', 'v4'), ('v5',
'v5'), ('v6', 'v6'), ('v7', 'v7'), ('v8', 'v8'), ('v9', 'v9')]
As expected.
Anyone run into this or a similar issue?
Ofer
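For what it's worth, a minimal sketch of a way to avoid depending on ordering
at all: build the pair in a single map instead of zipping two RDDs derived
from the same parent (variable names follow the example above):

# Sketch: keep both fields together in one map so no zip ordering is needed.
pairs = data.map(lambda s: tuple(s.split(',')[:2]))
print(pairs.take(10))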