Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Shuporno Choudhury

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Jörn Franke
> be a better approach for this problem? Can someone please help me with this and tell me if I am going wrong anywhere? --Thanks, Shuporno Choudhury

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Thakrar, Jayesh
Can you tell us what version of Spark you are using and if Dynamic Allocation is enabled? Also, how are the files being read? Is it a single read of all files using a file matching pattern?
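
For context, a minimal sketch of what those settings and the two read styles might look like; the configuration values, input format, and paths are assumptions for illustration, not details from the thread:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("memory-release-example")                  # illustrative name
        .config("spark.dynamicAllocation.enabled", "true")  # allows idle executors to be released
        .config("spark.shuffle.service.enabled", "true")    # external shuffle service, generally needed for dynamic allocation on Spark 2.x
        .getOrCreate()
    )

    # A single read of all files via a matching pattern pulls everything into one DataFrame...
    df_all = spark.read.csv("/data/in/dataset_*/", header=True)

    # ...whereas reading each data set separately keeps the jobs independent.
    df_one = spark.read.csv("/data/in/dataset_1/", header=True)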

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Jay

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Shuporno Choudhury
> If it is not possible to clear out memory, what can be a better approach for this problem? Can someone please help me with this and tell me if I am going wrong anywhere? --Thanks, Shuporno Choudhury

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Jörn Franke
> There is a single SparkSession that is doing all the processing. If it is not possible to clear out memory, what can be a better approach for this problem? Can someone please help me with this and tell me if I am going wrong anywhere?

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Shuporno Choudhury
> Can someone please help me with this and tell me if I am going wrong anywhere? --Thanks, Shuporno Choudhury

Re: [PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Jörn Franke
Why don't you modularize your code and write an independent Python program for each process that is submitted via Spark? Not sure though whether Spark local makes sense; if you don't have a cluster, then a normal Python program can be much better.
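
A minimal sketch of that suggestion, assuming one standalone program per data set (the file name, paths, and transformation below are illustrative, not taken from the thread):

    # process_one_dataset.py -- hypothetical standalone job, run once per data set
    import sys
    from pyspark.sql import SparkSession

    def main(input_path, output_path):
        spark = SparkSession.builder.appName("per-dataset-job").getOrCreate()
        df = spark.read.csv(input_path, header=True, inferSchema=True)  # input format assumed
        result = df.dropDuplicates()                                    # placeholder transformation
        result.write.mode("overwrite").parquet(output_path)
        spark.stop()

    if __name__ == "__main__":
        main(sys.argv[1], sys.argv[2])

Each data set would then get its own run, e.g. spark-submit process_one_dataset.py /data/in/ds1 /data/out/ds1, so all memory is released when that process exits.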

[PySpark] Releasing memory after a spark job is finished

2018-06-04 Thread Shuporno Choudhury
Hi everyone, I am trying to run PySpark code on some data sets sequentially [basically: 1. Read data into a dataframe, 2. Perform some join/filter/aggregation, 3. Write the modified data in parquet format to a target location]. Now, while running this PySpark code across multiple independent data sets...
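
A minimal sketch of such a sequential pipeline, with an explicit attempt to free memory between data sets; the paths, input format, and column names are assumptions for illustration, not details from the original post:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sequential-datasets").getOrCreate()

    # Hypothetical list of independent data sets; the real paths are not given in the thread.
    datasets = [
        ("/data/in/ds1", "/data/out/ds1"),
        ("/data/in/ds2", "/data/out/ds2"),
    ]

    for input_path, output_path in datasets:
        df = spark.read.csv(input_path, header=True, inferSchema=True)  # 1. read into a dataframe (CSV assumed)
        agg = (df.filter(df["amount"] > 0)                              # 2. placeholder filter/aggregation
                 .groupBy("category")
                 .sum("amount"))
        agg.write.mode("overwrite").parquet(output_path)                # 3. write parquet to the target location

        # Try to release memory before the next iteration:
        spark.catalog.clearCache()  # drops anything that was cached
        del df, agg                 # drop driver-side references so Python can garbage-collect them

    spark.stop()

Whether this actually frees memory between iterations depends on caching and the JVM; the replies above suggest isolating each data set in its own spark-submit process instead.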