It seems we hit the same issue.
There was a bug in 1.5.1 about a memory leak, but I am using 1.6.1. Here is the link about the bug in 1.5.1: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

At 2016-05-12 23:10:43, "Simon Schiff [via Apache Spark User List]" <ml-node+s1001560n2694...@n3.nabble.com> wrote:

I read from a port with Spark Streaming. The incoming data consists of key/value pairs. I then call foreachRDD on each window, create a Dataset from the window, and run some SQL queries on it. On the result I only call show, to inspect the content. This works, but memory usage keeps increasing; once it reaches the maximum, nothing works anymore. With more memory the program runs somewhat longer, but the problem persists. Because I run a program that writes to the port, I can control exactly how much data Spark has to process. The problem is the same whether I write one key/value pair every millisecond or only one per second. When I don't create a Dataset in foreachRDD and only count the elements in the RDD, everything works fine. I also use groupBy and agg functions in the queries.
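For reference, here is a minimal sketch of the setup described above, using the Spark 1.6 Scala APIs. The host, port, column names, and the particular aggregation are illustrative assumptions, not details from the original post:

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.functions.sum
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingLeakRepro {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("StreamingLeakRepro").setMaster("local[2]")
        val ssc = new StreamingContext(conf, Seconds(1))

        // Read "key,value" lines from a socket (host/port are placeholders).
        val lines = ssc.socketTextStream("localhost", 9999)
        val pairs = lines.map { line =>
          val Array(k, v) = line.split(",", 2)
          (k, v.toLong) // assumes well-formed numeric values
        }

        pairs.foreachRDD { rdd =>
          // A new Dataset/DataFrame is created per batch, as in the post.
          val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
          import sqlContext.implicits._
          val df = rdd.toDF("key", "value")
          // groupBy/agg query; show() only prints the result to the console.
          df.groupBy("key").agg(sum("value")).show()
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

Per the post, replacing the foreachRDD body with a plain rdd.count() (never touching the SQLContext) does not show the memory growth.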