yes I can certainly use jstack but it requires 4 to 5 hours for me to
reproduce the error so I can get back as early as possible.

Thanks a lot!

On Mon, Oct 31, 2016 at 12:41 PM, Shixiong(Ryan) Zhu <
shixi...@databricks.com> wrote:

> Then it should not be a Receiver issue. Could you use `jstack` to find out
> the name of leaking threads?
>
> On Mon, Oct 31, 2016 at 12:35 PM, kant kodali <kanth...@gmail.com> wrote:
>
>> Hi Ryan,
>>
>> It happens on the driver side and I am running on a client mode (not the
>> cluster mode).
>>
>> Thanks!
>>
>> On Mon, Oct 31, 2016 at 12:32 PM, Shixiong(Ryan) Zhu <
>> shixi...@databricks.com> wrote:
>>
>>> Sorry, there is a typo in my previous email: this may **not** be the
>>> root cause if the leak threads are in the driver side.
>>>
>>> Does it happen in the driver or executors?
>>>
>>> On Mon, Oct 31, 2016 at 12:20 PM, kant kodali <kanth...@gmail.com>
>>> wrote:
>>>
>>>> Hi Ryan,
>>>>
>>>> Ahh My Receiver.onStop method is currently empty.
>>>>
>>>> 1) I have a hard time seeing why the receiver would crash so many times 
>>>> within a span of 4 to 5 hours but anyways I understand I should still 
>>>> cleanup during OnStop.
>>>>
>>>> 2) How do I clean up those threads? The documentation here 
>>>> https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html doesn't 
>>>> seem to have any method where I can clean up the threads created during 
>>>> OnStart. any ideas?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> On Mon, Oct 31, 2016 at 11:58 AM, Shixiong(Ryan) Zhu <
>>>> shixi...@databricks.com> wrote:
>>>>
>>>>> So in your code, each Receiver will start a new thread. Did you stop
>>>>> the receiver properly in `Receiver.onStop`? Otherwise, you may leak 
>>>>> threads
>>>>> after a receiver crashes and is restarted by Spark. However, this may be
>>>>> the root cause since the leak threads are in the driver side. Could you 
>>>>> use
>>>>> `jstack` to check which types of threads are leaking?
>>>>>
>>>>> On Mon, Oct 31, 2016 at 11:50 AM, kant kodali <kanth...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I am also under the assumption that *onStart *function of the
>>>>>> Receiver is only called only once by Spark. please correct me if I
>>>>>> am wrong.
>>>>>>
>>>>>> On Mon, Oct 31, 2016 at 11:35 AM, kant kodali <kanth...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> My driver program runs a spark streaming job.  And it spawns a
>>>>>>> thread by itself only in the *onStart()* function below Other than
>>>>>>> that it doesn't spawn any other threads. It only calls MapToPair,
>>>>>>> ReduceByKey, forEachRDD, Collect functions.
>>>>>>>
>>>>>>> public class NSQReceiver extends Receiver<String> {
>>>>>>>
>>>>>>>     private String topic="";
>>>>>>>
>>>>>>>     public NSQReceiver(String topic) {
>>>>>>>         super(StorageLevel.MEMORY_AND_DISK_2());
>>>>>>>         this.topic = topic;
>>>>>>>     }
>>>>>>>
>>>>>>>     @Override
>>>>>>>     public void *onStart()* {
>>>>>>>         new Thread()  {
>>>>>>>             @Override public void run() {
>>>>>>>                 receive();
>>>>>>>             }
>>>>>>>         }.start();
>>>>>>>     }
>>>>>>>
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>> Environment info:
>>>>>>>
>>>>>>> Java 8
>>>>>>>
>>>>>>> Scala 2.11.8
>>>>>>>
>>>>>>> Spark 2.0.0
>>>>>>>
>>>>>>> More than happy to share any other info you may need.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 31, 2016 at 11:05 AM, Jakob Odersky <ja...@odersky.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>  > how do I tell my spark driver program to not create so many?
>>>>>>>>
>>>>>>>> This may depend on your driver program. Do you spawn any threads in
>>>>>>>> it? Could you share some more information on the driver program,
>>>>>>>> spark
>>>>>>>> version and your environment? It would greatly help others to help
>>>>>>>> you
>>>>>>>>
>>>>>>>> On Mon, Oct 31, 2016 at 3:47 AM, kant kodali <kanth...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > The source of my problem is actually that I am running into the
>>>>>>>> following
>>>>>>>> > error. This error seems to happen after running my driver program
>>>>>>>> for 4
>>>>>>>> > hours.
>>>>>>>> >
>>>>>>>> > "Exception in thread "ForkJoinPool-50-worker-11" Exception in
>>>>>>>> thread
>>>>>>>> > "dag-scheduler-event-loop" Exception in thread
>>>>>>>> "ForkJoinPool-50-worker-13"
>>>>>>>> > java.lang.OutOfMemoryError: unable to create new native thread"
>>>>>>>> >
>>>>>>>> > and this wonderful book taught me that the error "unable to
>>>>>>>> create new
>>>>>>>> > native thread" can happen because JVM is trying to request the OS
>>>>>>>> for a
>>>>>>>> > thread and it is refusing to do so for the following reasons
>>>>>>>> >
>>>>>>>> > 1. The system has actually run out of virtual memory.
>>>>>>>> > 2. On Unix-style systems, the user has already created (between
>>>>>>>> all programs
>>>>>>>> > user is running) the maximum number of processes configured for
>>>>>>>> that user
>>>>>>>> > login. Individual threads are considered a process in that regard.
>>>>>>>> >
>>>>>>>> > Option #2 is ruled out in my case because my driver programing is
>>>>>>>> running
>>>>>>>> > with a userid of root which has  maximum number of processes set
>>>>>>>> to 120242
>>>>>>>> >
>>>>>>>> > ulimit -a gives me the following
>>>>>>>> >
>>>>>>>> > core file size          (blocks, -c) 0
>>>>>>>> > data seg size           (kbytes, -d) unlimited
>>>>>>>> > scheduling priority             (-e) 0
>>>>>>>> > file size               (blocks, -f) unlimited
>>>>>>>> > pending signals                 (-i) 120242
>>>>>>>> > max locked memory       (kbytes, -l) 64
>>>>>>>> > max memory size         (kbytes, -m) unlimited
>>>>>>>> > open files                      (-n) 1024
>>>>>>>> > pipe size            (512 bytes, -p) 8
>>>>>>>> > POSIX message queues     (bytes, -q) 819200
>>>>>>>> > real-time priority              (-r) 0
>>>>>>>> > stack size              (kbytes, -s) 8192
>>>>>>>> > cpu time               (seconds, -t) unlimited
>>>>>>>> > max user processes              (-u) 120242
>>>>>>>> > virtual memory          (kbytes, -v) unlimited
>>>>>>>> > file locks                      (-x) unlimited
>>>>>>>> >
>>>>>>>> > So at this point I do understand that the I am running out of
>>>>>>>> memory due to
>>>>>>>> > allocation of threads so my biggest question is how do I tell my
>>>>>>>> spark
>>>>>>>> > driver program to not create so many?
>>>>>>>> >
>>>>>>>> > On Mon, Oct 31, 2016 at 3:25 AM, Sean Owen <so...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>> >>
>>>>>>>> >> ps -L [pid] is what shows threads. I am not sure this is
>>>>>>>> counting what you
>>>>>>>> >> think it does. My shell process has about a hundred threads, and
>>>>>>>> I can't
>>>>>>>> >> imagine why one would have thousands unless your app spawned
>>>>>>>> them.
>>>>>>>> >>
>>>>>>>> >> On Mon, Oct 31, 2016 at 10:20 AM kant kodali <kanth...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> >>>
>>>>>>>> >>> when I do
>>>>>>>> >>>
>>>>>>>> >>> ps -elfT | grep "spark-driver-program.jar" | wc -l
>>>>>>>> >>>
>>>>>>>> >>> The result is around 32K. why does it create so many threads
>>>>>>>> how can I
>>>>>>>> >>> limit this?
>>>>>>>> >
>>>>>>>> >
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Reply via email to