[ 
https://issues.apache.org/jira/browse/NUTCH-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306194#comment-14306194
 ] 

stack commented on NUTCH-1935:
------------------------------

The hbase refguide says "It is recommended to raise the ulimit to at least 
10,000..." It also explains how to count the files hbase is keeping open.
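A minimal sketch of that recommendation (the 10240 value and the "hbase" user name below are assumptions; substitute whichever account runs your HBase and Nutch processes):

```shell
# Check the current per-process open-file limit (soft limit)
ulimit -n

# Raise it for the current shell session; this fails if the hard limit
# is lower -- raise the hard limit as root first in that case
ulimit -n 10240 2>/dev/null || echo "hard limit too low; raise it as root"

# For a persistent change, add entries like these to
# /etc/security/limits.conf (the user name "hbase" is an assumption):
#   hbase  -  nofile  10240
```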

Have you watched the open-file count, broken down by type, over time?  Perhaps 
file descriptors are leaking in the nutch<->hbase communication?
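A minimal way to watch that count over time, assuming a Linux /proc filesystem (17849 is the Nutch server pid from the report below; the script path and log file name are assumptions):

```shell
# Count a process's open descriptors by listing /proc/<pid>/fd;
# roughly equivalent to `lsof -p <pid> | wc -l` without header noise.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Append a timestamped sample; run from cron every minute or so and a
# descriptor leak shows up as a steadily climbing number:
#   * * * * * /path/to/fd-snapshot.sh >> /var/log/nutch-fd.log
printf '%s %s\n' "$(date '+%F %T')" "$(fd_count 17849)"
```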

> too many open files
> -------------------
>
>                 Key: NUTCH-1935
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1935
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.2
>            Reporter: yuanyun.cn
>            Priority: Minor
>              Labels: leak
>
> This is to track the "too many open files" issue from 
> http://lucene.472066.n3.nabble.com/Cannot-run-program-quot-chmod-quot-too-many-open-files-td4109753.html
> Recently I also hit this too many open files issue: after running our crawler 
> application for about 2-5 months, it fails with "too many open files" errors.
> We have to restart the Nutch server and hbase.
> Nutch Server opens 4249 files
> lsof -p 17849 | wc -l
> 4249
> There are a lot of pipe, sock, and connections to hbase.
> cat sorted | grep pipe | wc -l
> 1624
> cat sorted | grep 2181 | wc -l 
> 813
> 907 java 17849 user 1465u sock 0,7 0t0 65043894 can't identify protocol
> 1843 java 17849 user 2744u IPv6 76971706 0t0 TCP localhost:35073->localhost:2181 (ESTABLISHED) // port 2181 is ZooKeeper, used by hbase
> cat sorted | grep anon_inode | wc -l
> 812
> Thanks...
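The repeated greps in the report above can be collapsed into one pass over the descriptors, e.g. via /proc on Linux (`$$` stands in for the Nutch server pid, 17849 in the report):

```shell
# Group a process's open descriptors by target kind (pipe, socket,
# anon_inode, regular file paths) in one pass, instead of grepping a
# saved lsof dump once per kind. readlink resolves each /proc fd
# symlink to targets like "pipe:[123]" or "socket:[456]"; the sed
# strips the inode suffix so uniq can count by kind.
for fd in /proc/$$/fd/*; do
    readlink "$fd"
done | sed 's/:.*//' | sort | uniq -c | sort -rn
```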



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
