Re: loop of spark jobs leads to increase in memory on worker nodes and eventually failure

2022-03-30 Thread Enrico Minack
> Wrt looping: if I want to process 3 years of data, my modest cluster will never do it in one go, I would expect? I have to break it down into smaller pieces and run that in a loop (1 day is already lots of data).

Well, that is exactly what Spark is made for. It splits the work up and…
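The "break it down into smaller pieces" approach from the quoted question can be sketched in plain Python: generate the list of daily chunks up front and drive the per-day Spark job from it. The `daily_chunks` helper and the date range are illustrative, not from the original thread.

```python
from datetime import date, timedelta

def daily_chunks(start, end):
    """Yield each day in [start, end) so a job can process one day at a time."""
    d = start
    while d < end:
        yield d
        d += timedelta(days=1)

# Three years of data, one day per loop iteration:
days = list(daily_chunks(date(2019, 1, 1), date(2022, 1, 1)))
print(len(days))  # 1096 (2020 is a leap year)
```

Each element of `days` would then be passed to the daily processing job, which is the looping pattern the original poster describes.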

Re: loop of spark jobs leads to increase in memory on worker nodes and eventually failure

2022-03-30 Thread Bjørn Jørgensen
It's quite impossible for anyone to answer your question about what is eating your memory without even knowing what language you are using. If you are using C, then it's always pointers; that's the memory issue. If you are using Python, it can be something like not using a context manager (`with`)…
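The context-manager point above can be illustrated with a minimal sketch: a toy `Resource` class (hypothetical, not from the thread) whose cleanup in `__exit__` runs deterministically when the `with` block exits, even if the body raises.

```python
class Resource:
    """Toy resource that tracks whether it has been released."""
    open_count = 0

    def __enter__(self):
        Resource.open_count += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        Resource.open_count -= 1  # released even if the body raised
        return False              # do not swallow exceptions

# Without `with`, a forgotten release leaks; with it, cleanup is guaranteed:
with Resource() as r:
    assert Resource.open_count == 1
print(Resource.open_count)  # 0 after the block exits
```

Forgetting this pattern in a long-running loop is one common way Python programs accumulate unreleased resources.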

Re: loop of spark jobs leads to increase in memory on worker nodes and eventually failure

2022-03-30 Thread Joris Billen
Thanks for the answer, much appreciated! This forum is very useful :-) I didn't know the SparkContext stays alive; I guess this is eating up memory. The eviction means that Spark knows it should clear some of the old cached data to be able to store new data. In case anyone has good articles about…

Re: loop of spark jobs leads to increase in memory on worker nodes and eventually failure

2022-03-30 Thread Sean Owen
The Spark context does not stop when a job does; it stops when you stop it. There could be many ways memory can leak. Caching, maybe, but it will evict. You should be clearing caches when they are no longer needed. I would guess it is something else your program holds on to in its logic. Also consider not…
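The "clear caches when no longer needed" advice can be sketched without a Spark cluster using the standard library's `functools.lru_cache` as a stand-in for Spark's caching (in PySpark itself the analogous calls would be `DataFrame.unpersist()` or `spark.catalog.clearCache()`). The point is the same: if each loop iteration caches results and never releases them, memory grows across iterations.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive(day):
    return day * day  # stand-in for heavy per-day processing

for day in range(3):
    expensive(day)           # fills the cache for this iteration
    # ... use the cached result while processing this day ...
    expensive.cache_clear()  # release it before the next iteration

print(expensive.cache_info().currsize)  # 0: nothing retained across iterations
```

Dropping the `cache_clear()` call makes `currsize` grow by one per iteration, which is the shape of the leak described in this thread.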

loop of spark jobs leads to increase in memory on worker nodes and eventually failure

2022-03-30 Thread Joris Billen
Hi, I have a PySpark job submitted through spark-submit that does some heavy processing for 1 day of data. It runs with no errors. I have to loop over many days, so I run this Spark job in a loop. I notice that after a couple of executions the memory is increasing on all worker nodes, and eventually this…

RE: [EXTERNAL] Re: spark ETL and spark thrift server running together

2022-03-30 Thread Alex Kosberg
Hi Christophe, Thank you for the explanation! Regards, Alex From: Christophe Préaud Sent: Wednesday, March 30, 2022 3:43 PM To: Alex Kosberg ; user@spark.apache.org Subject: [EXTERNAL] Re: spark ETL and spark thrift server running together Hi Alex, As stated in the Hive documentation

Re: spark ETL and spark thrift server running together

2022-03-30 Thread Christophe Préaud
Hi Alex, As stated in the Hive documentation (https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration): "An embedded metastore database is mainly used for unit tests. Only one process can connect to the metastore database at a time, so it is not really a…"
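Following the Hive documentation quoted above, the usual way to let more than one process (e.g. an ETL job and the Thrift server) share a metastore is to run a standalone metastore service and point both at it via `hive.metastore.uris`. A minimal `hive-site.xml` fragment, with a placeholder hostname that is not from the original thread:

```xml
<!-- hive-site.xml: point Spark at a standalone metastore service.
     "metastore-host" is a placeholder; 9083 is the conventional default port. -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

With a remote metastore, the single-connection limitation of the embedded (Derby) database no longer applies.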

spark ETL and spark thrift server running together

2022-03-30 Thread Alex Kosberg
Hi, Some details:
* Spark SQL (version 3.2.1)
* Driver: Hive JDBC (version 2.3.9)
* ThriftCLIService: Starting ThriftBinaryCLIService on port 1 with 5...500 worker threads
* BI tool is connected via ODBC driver

After activating Spark Thrift Server I'm unable to…

Call for Presentations now open, ApacheCon North America 2022

2022-03-30 Thread Rich Bowen
[You are receiving this because you are subscribed to one or more user or dev mailing lists of an Apache Software Foundation project.] ApacheCon draws participants at all levels to explore “Tomorrow’s Technology Today” across 300+ Apache projects and their diverse communities. ApacheCon showcases…

Unusual bug, please help me, I can do nothing!

2022-03-30 Thread spark User
Hello, I am a Spark user. I use the "spark-shell.cmd" startup command in Windows cmd. The first startup is normal; when I use "ctrl+c" to force the end of the Spark window, it can't start normally again. The error message is as follows: "Failed to initialize Spark…"