Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2016-01-06 Thread Priya Ch
Running 'lsof' will let us know which files are open, but how do we get to the root cause behind opening too many files? Thanks, Padma CH On Wed, Jan 6, 2016 at 8:39 AM, Hamel Kothari wrote: > The "Too Many Files" part of the exception is just indicative of the fact >
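One way to narrow down the cause is to log the JVM's own file-descriptor usage from inside the executors while the job runs; the sketch below assumes a HotSpot JVM on Linux, and the object and method names are only illustrative:

    import java.lang.management.ManagementFactory
    import com.sun.management.UnixOperatingSystemMXBean

    object FdMonitor {
      // Logs how many descriptors this JVM holds versus its per-process limit.
      // Called from inside a task (e.g. mapPartitions), it reports per executor,
      // which helps tell a genuine leak from a limit that is simply too low.
      def logOpenFds(tag: String): Unit =
        ManagementFactory.getOperatingSystemMXBean match {
          case os: UnixOperatingSystemMXBean =>
            println(s"[$tag] open fds: ${os.getOpenFileDescriptorCount} of ${os.getMaxFileDescriptorCount}")
          case _ =>
            println(s"[$tag] fd counts not available on this platform")
        }
    }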

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2016-01-06 Thread Priya Ch
The line of code which I highlighted in the screenshot is within the Spark source code. Spark uses a sort-based shuffle implementation, and the spilled files are merged using merge sort. Here is the link: https://issues.apache.org/jira/secure/attachment/12655884/Sort-basedshuffledesign.pdf
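To see why the merge itself can hold many descriptors: a k-way merge keeps every spill file open until the merge finishes. The snippet below is a minimal, illustrative merge over already-sorted iterators, not Spark's actual shuffle code:

    import scala.collection.mutable

    // Illustrative k-way merge of already-sorted spill files: one iterator per
    // spill file, and all of them stay open until the merge ends, which is why
    // many spills can exhaust the file-descriptor limit.
    def mergeSortedSpills[T](spills: Seq[Iterator[T]])(implicit ord: Ordering[T]): Iterator[T] = {
      // PriorityQueue is a max-heap, so reverse the ordering to pop the smallest head.
      val queue = mutable.PriorityQueue.empty[BufferedIterator[T]](
        Ordering.by((it: BufferedIterator[T]) => it.head)(ord).reverse)
      spills.map(_.buffered).filter(_.hasNext).foreach(queue.enqueue(_))

      new Iterator[T] {
        def hasNext: Boolean = queue.nonEmpty
        def next(): T = {
          val it = queue.dequeue()
          val elem = it.next()
          if (it.hasNext) queue.enqueue(it)
          elem
        }
      }
    }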

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2016-01-05 Thread Annabel Melongo
Vijay, Are you closing the FileInputStream at the end of each loop (in.close())? My guess is those streams aren't closed, and that causes the "too many open files" exception. On Tuesday, January 5, 2016 8:03 AM, Priya Ch wrote: Can someone throw light on this?
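In case it helps, a small loan-pattern sketch that guarantees the stream is closed on every iteration (the helper name and usage are hypothetical):

    import java.io.{FileInputStream, InputStream}

    // Loan pattern: the stream is closed even if the body throws, so a loop
    // over many files cannot leak descriptors.
    def withStream[A](path: String)(body: InputStream => A): A = {
      val in = new FileInputStream(path)
      try body(in) finally in.close()
    }

    // Hypothetical usage inside a loop over many input files:
    // paths.foreach { p => withStream(p) { in => /* read from in */ } }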

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2016-01-05 Thread Priya Ch
Yes, the FileInputStream is closed. Maybe I didn't show it in the screenshot. Since Spark implements sort-based shuffle, there is a parameter called maximum merge factor which decides the number of files that can be merged at once, and this avoids too many open files. I am suspecting that it is
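If such a merge factor k is in effect, each merge pass opens at most k spill files at once and roughly ceil(log_k(n)) passes cover n spills; a tiny illustrative calculation (not Spark code, the parameter is hypothetical here):

    // Illustrative only: a merge factor k bounds the files open per pass,
    // at the cost of extra passes over the data.
    def mergePasses(numSpills: Int, mergeFactor: Int): Int =
      if (numSpills <= 1) 0
      else math.ceil(math.log(numSpills) / math.log(mergeFactor)).toInt

    // e.g. mergePasses(1000, 10) == 3: three passes, about 10 files open at a time.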

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2016-01-05 Thread Priya Ch
Can someone throw light on this? Regards, Padma Ch On Mon, Dec 28, 2015 at 3:59 PM, Priya Ch wrote: > Chris, we are using Spark version 1.3.0. We have not set the > spark.streaming.concurrentJobs > parameter. It takes the default value. > > Vijay, > > From

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-28 Thread Priya Ch
Chris, we are using Spark version 1.3.0. We have not set the spark.streaming.concurrentJobs parameter; it takes the default value. Vijay, from the stack trace it is evident that

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-25 Thread Chris Fregly
And which version of Spark / Spark Streaming are you using? Are you explicitly setting spark.streaming.concurrentJobs to something larger than the default of 1? If so, please try setting it back to 1 and see if the problem still exists. This is a dangerous parameter to modify from the
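For reference, a minimal sketch of pinning that setting back to its default while debugging (the app name and batch interval are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Keep batches strictly sequential while chasing the descriptor leak;
    // values greater than 1 let multiple batches (and their shuffles) run at once.
    val conf = new SparkConf()
      .setAppName("streaming-fd-debug")           // placeholder app name
      .set("spark.streaming.concurrentJobs", "1") // the default
    val ssc = new StreamingContext(conf, Seconds(10))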

java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-23 Thread Vijay Gharge
A few indicators: 1) During execution, check the total number of open files using the lsof command (root permissions are needed; on a cluster I am not sure how much that helps). 2) Which exact line in the code is triggering this error? Can you paste that snippet? On Wednesday 23 December 2015, Priya Ch

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-23 Thread Priya Ch
ulimit -n 65000 and fs.file-max = 65000 (in /etc/sysctl.conf). Thanks, Padma Ch On Tue, Dec 22, 2015 at 6:47 PM, Yash Sharma wrote: > Could you share the ulimit for your setup please? > > - Thanks, via mobile, excuse brevity. > On Dec 22, 2015 6:39 PM, "Priya Ch"

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-22 Thread Priya Ch
Jakob, I increased settings like fs.file-max in /etc/sysctl.conf and also increased the user limit in /etc/security/limits.conf, but I still see the same issue. On Fri, Dec 18, 2015 at 12:54 AM, Jakob Odersky wrote: > It might be a good idea to see how many files are open
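One thing worth double-checking: limits.conf entries only apply to new login sessions, and long-running daemons such as the YARN NodeManager keep the limit they were started with, so the executors may still be running under the old nofile value until those services are restarted. An illustrative limits.conf fragment (the user name and values are examples only):

    # /etc/security/limits.conf -- example entries, adjust user and values
    yarn  soft  nofile  65000
    yarn  hard  nofile  65000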

java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-17 Thread Priya Ch
Hi all, when running the streaming application, I am seeing the below error: java.io.FileNotFoundException: /data1/yarn/nm/usercache/root/appcache/application_1450172646510_0004/blockmgr-a81f42cd-6b52-4704-83f3-2cfc12a11b86/02/temp_shuffle_589ddccf-d436-4d2c-9935-e5f8c137b54b (Too many open

Re: java.io.FileNotFoundException(Too many open files) in Spark streaming

2015-12-17 Thread Jakob Odersky
It might be a good idea to see how many files are open and try increasing the open file limit (this is done at the OS level). In some application use cases it is actually a legitimate need. If that doesn't help, make sure you close any unused files and streams in your code. It will also be easier