[jira] [Commented] (SPARK-24092) spark.python.worker.reuse does not work?

2018-04-27 Thread David Figueroa (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456464#comment-16456464
 ] 

David Figueroa commented on SPARK-24092:


Still no answer from the community:

[https://stackoverflow.com/questions/50043684/spark-python-worker-reuse-not-working-as-expected]

[http://apache-spark-user-list.1001560.n3.nabble.com/spark-python-worker-reuse-not-working-as-expected-td31976.html]

Can anyone take a look at this problem? It seems like a bug.
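
For reference, a minimal, self-contained variant of the repro (a sketch: the explicit {{spark.python.worker.reuse}} setting, the local master, the file name, and the imports are additions here, not part of the original report) that makes the reuse expectation explicit:

{code:python|title=reuse_check.py|borderStyle=solid}
import os
from pyspark.sql import SparkSession

# Explicitly enable worker reuse before the session is created (it defaults to true).
spark = (SparkSession.builder
         .master("local[*]")
         .config("spark.python.worker.reuse", "true")
         .getOrCreate())

def return_pid(_):
    yield os.getpid()

# Run two identical jobs and compare the Python worker PIDs they report.
pids1 = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
pids2 = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids1)
print(pids2)
# If workers are reused across jobs, the two sets should share PIDs.
print("workers shared across jobs:", bool(pids1 & pids2))
{code}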

> spark.python.worker.reuse does not work?
> 
>
> Key: SPARK-24092
> URL: https://issues.apache.org/jira/browse/SPARK-24092
> Project: Spark
>  Issue Type: Question
>  Components: PySpark
>Affects Versions: 2.3.0
>Reporter: David Figueroa
>Priority: Minor
>
> {{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
> setting it to {{true}}, the code below does not print the same Python worker 
> process IDs.
> {code:python|title=procid.py|borderStyle=solid}
> import os
> from pyspark.sql import SparkSession
> 
> def return_pid(_):
>     yield os.getpid()
> 
> spark = SparkSession.builder.getOrCreate()
> # With worker reuse, both jobs should report the same set of Python worker PIDs.
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids)
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids){code}






[jira] [Updated] (SPARK-24092) spark.python.worker.reuse does not work?

2018-04-25 Thread David Figueroa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Figueroa updated SPARK-24092:
---
Description: 
{{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
setting it to {{true}}, the code below does not print the same Python worker 
process IDs.
{code:python|title=procid.py|borderStyle=solid}
import os
from pyspark.sql import SparkSession

def return_pid(_):
    yield os.getpid()

spark = SparkSession.builder.getOrCreate()
# With worker reuse, both jobs should report the same set of Python worker PIDs.
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids){code}

  was:
{{spark.python.worker.reuse is true by default but even after explicitly 
setting to true the code below does not print the same python worker process 
ids.}}

{code:title=procid.py|borderStyle=solid} 

def return_pid(_): yield os.getpid()
spark = SparkSession.builder.getOrCreate()
 pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)

{code}


> spark.python.worker.reuse does not work?
> 
>
> Key: SPARK-24092
> URL: https://issues.apache.org/jira/browse/SPARK-24092
> Project: Spark
>  Issue Type: Question
>  Components: PySpark
>Affects Versions: 2.3.0
>Reporter: David Figueroa
>Priority: Minor
>
> {{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
> setting it to {{true}}, the code below does not print the same Python worker 
> process IDs.
> {code:python|title=procid.py|borderStyle=solid}
> import os
> from pyspark.sql import SparkSession
> 
> def return_pid(_):
>     yield os.getpid()
> 
> spark = SparkSession.builder.getOrCreate()
> # With worker reuse, both jobs should report the same set of Python worker PIDs.
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids)
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids){code}






[jira] [Updated] (SPARK-24092) spark.python.worker.reuse does not work?

2018-04-25 Thread David Figueroa (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Figueroa updated SPARK-24092:
---
Description: 
{{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
setting it to {{true}}, the code below does not print the same Python worker 
process IDs.
{code:python|title=procid.py|borderStyle=solid}
import os
from pyspark.sql import SparkSession

def return_pid(_):
    yield os.getpid()

spark = SparkSession.builder.getOrCreate()
# With worker reuse, both jobs should report the same set of Python worker PIDs.
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids){code}

  was:
{{spark.python.worker.reuse is true by default but even after explicitly 
setting to true the code below does not print the same python worker process 
ids.}}
{code:java|title=procid.py|borderStyle=solid}
def return_pid(_): yield os.getpid()
spark = SparkSession.builder.getOrCreate()
 pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids){code}


> spark.python.worker.reuse does not work?
> 
>
> Key: SPARK-24092
> URL: https://issues.apache.org/jira/browse/SPARK-24092
> Project: Spark
>  Issue Type: Question
>  Components: PySpark
>Affects Versions: 2.3.0
>Reporter: David Figueroa
>Priority: Minor
>
> {{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
> setting it to {{true}}, the code below does not print the same Python worker 
> process IDs.
> {code:python|title=procid.py|borderStyle=solid}
> import os
> from pyspark.sql import SparkSession
> 
> def return_pid(_):
>     yield os.getpid()
> 
> spark = SparkSession.builder.getOrCreate()
> # With worker reuse, both jobs should report the same set of Python worker PIDs.
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids)
> pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
> print(pids){code}






[jira] [Created] (SPARK-24092) spark.python.worker.reuse does not work?

2018-04-25 Thread David Figueroa (JIRA)
David Figueroa created SPARK-24092:
--

 Summary: spark.python.worker.reuse does not work?
 Key: SPARK-24092
 URL: https://issues.apache.org/jira/browse/SPARK-24092
 Project: Spark
  Issue Type: Question
  Components: PySpark
Affects Versions: 2.3.0
Reporter: David Figueroa


{{spark.python.worker.reuse}} is {{true}} by default, but even after explicitly 
setting it to {{true}}, the code below does not print the same Python worker 
process IDs.

{code:python|title=procid.py|borderStyle=solid}
import os
from pyspark.sql import SparkSession

def return_pid(_):
    yield os.getpid()

spark = SparkSession.builder.getOrCreate()
# With worker reuse, both jobs should report the same set of Python worker PIDs.
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
pids = set(spark.sparkContext.range(32).mapPartitions(return_pid).collect())
print(pids)
{code}
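
One thing worth verifying when reproducing this (a sketch added for illustration, not part of the original report; it reuses the {{spark}} session from the snippet above) is that the setting is actually in effect on the driver before comparing PIDs:

{code:python|title=check_conf.py|borderStyle=solid}
# Sanity check: read the effective value of the setting from the running context.
reuse = spark.sparkContext.getConf().get("spark.python.worker.reuse", "<unset>")
print("spark.python.worker.reuse =", reuse)
{code}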


