[jira] [Resolved] (SPARK-20352) PySpark SparkSession initialization takes longer every iteration in a single application

2017-04-17 Thread Sean Owen (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-20352.
---
Resolution: Not A Problem

"My code takes too long to run" is not a JIRA. You haven't addressed the 
underlying problem here, which is that you're reopening contexts.
Let committers reopen JIRAs. This should _not_ be reopened.
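
For reference, a minimal sketch of the pattern the resolution suggests instead: build the SparkSession once, reuse it on every iteration, and stop it only when the application exits. The "sync_task" name and maxResultSize setting are taken from the report; the try/finally shape and the 10-second sleep are illustrative assumptions.

{code}
import time
from pyspark.sql import SparkSession

# Build the session once, outside the loop.
spark = SparkSession.builder \
    .appName("sync_task") \
    .config('spark.driver.maxResultSize', '5g') \
    .getOrCreate()
sc = spark.sparkContext

try:
    while True:
        # some processing and analysis here, reusing the same session
        time.sleep(10)  # illustrative 10-second interval from the report
finally:
    spark.stop()  # stop the context once, at application shutdown
{code}

Because getOrCreate() is called only once, the driver does not pay context startup costs on every pass through the loop.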

> PySpark SparkSession initialization takes longer every iteration in a single application
> ---
>
> Key: SPARK-20352
> URL: https://issues.apache.org/jira/browse/SPARK-20352
> Project: Spark
> Issue Type: Question
> Components: PySpark
> Affects Versions: 2.1.0
> Environment: Ubuntu 12
> Spark 2.1
> JRE 8.0
> Python 2.7
> Reporter: hosein
>
> I run Spark on a standalone Ubuntu server with 128 GB of memory and a 32-core CPU,
> and run spark-submit my_code.py without any additional configuration parameters.
> In a while loop I start a SparkSession, analyze data, and then stop the context;
> this process repeats every 10 seconds.
> {code}
> from pyspark.sql import SparkSession
>
> while True:
>     # recreate the session on every iteration
>     spark = SparkSession.builder \
>         .appName("sync_task") \
>         .config('spark.driver.maxResultSize', '5g') \
>         .getOrCreate()
>     sc = spark.sparkContext
>     # some processing and analysis
>     spark.stop()
> {code}
> When the program starts, it works perfectly, but after it has been running for
> many hours, Spark initialization takes a long time: 10 or 20 seconds just to
> initialize Spark.
> So what is the problem?





[jira] [Resolved] (SPARK-20352) PySpark SparkSession initialization takes longer every iteration in a single application

2017-04-17 Thread Sean Owen (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-20352.
---
Resolution: Not A Problem
Fix Version/s: (was: 2.1.0)

At the least, it's not supported to stop and start contexts within an app;
there should be no need to do that.
You haven't provided detail on what is slow, and establishing that is up to
you before you create an issue.
Please read http://spark.apache.org/contributing.html
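
As a side note, a small sketch of why restarting is unnecessary, assuming the same app name from the report (spark1/spark2 are demonstration names): as long as the current session has not been stopped, getOrCreate() returns the existing session rather than constructing a new one.

{code}
from pyspark.sql import SparkSession

spark1 = SparkSession.builder.appName("sync_task").getOrCreate()
spark2 = SparkSession.builder.getOrCreate()

# While the first session is still active, getOrCreate() hands back the
# existing session instead of constructing a new one.
assert spark1 is spark2

spark1.stop()  # stop once, at application shutdown
{code}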
