Hi Neal,

Yes, my question is: how can I run the Plasma Store on each worker node of a Spark
cluster?

Suppose my cluster consists of 6 nodes (1 master plus 5 workers); I want to run the
Plasma Store on all 5 worker nodes. Thanks.
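What I have in mind is something like the helper below, run once per worker from a
PySpark task. This is only a sketch of my intent, not a tested recipe: the
ensure_plasma_store name and the one-task-per-worker trick are my own assumptions,
and it presumes plasma_store_server is on each node's PATH.

```python
import os
import subprocess

def ensure_plasma_store(socket_path="/tmp/store0", mem_bytes=3_000_000_000,
                        executable="plasma_store_server"):
    """Start a Plasma store on this node unless its socket already exists.

    Returns True if a new store process was launched, False if one
    appears to be serving the socket already.
    """
    if os.path.exists(socket_path):
        return False
    # Launch detached so the store keeps serving after this task finishes.
    subprocess.Popen([executable, "-m", str(mem_bytes), "-s", socket_path],
                     stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return True
```

On a 5-worker cluster one could then run
`sc.parallelize(range(5), 5).foreachPartition(lambda _: ensure_plasma_store())`,
hoping that Spark places one task on each executor; with more than one executor per
node, a per-node startup script (e.g. via SLURM) may be the more robust route.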


Regards,
Tanveer Ahmad
________________________________
From: Neal Richardson <[email protected]>
Sent: Thursday, June 11, 2020 12:40:47 AM
To: [email protected]
Subject: Re: Running plasma_store_server (in background) on each Spark worker 
node

Hi Tanveer,
Do you have any specific questions, or have you encountered trouble with your 
setup?

Neal

On Wed, Jun 10, 2020 at 2:23 PM Tanveer Ahmad - EWI 
<[email protected]> wrote:

Hi all,

I want to run an external command (plasma_store_server -m 3000000000 -s
/tmp/store0 &) in the background on each worker node of my Spark
cluster<https://userinfo.surfsara.nl/systems/cartesius/software/spark>, so that
the external process keeps running for the duration of the whole Spark job.

The plasma_store_server process is used for storing and retrieving Apache Arrow 
data in Apache Spark.

I am using PySpark for Spark programming and SLURM for Spark
cluster<https://userinfo.surfsara.nl/systems/cartesius/software/spark> creation.

Any help will be highly appreciated!

Regards,

Tanveer Ahmad
