[jira] [Assigned] (FLUME-3078) Expose Monitoring Metrics For Netcat Source
[ https://issues.apache.org/jira/browse/FLUME-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja reassigned FLUME-3078:
--------------------------------------

    Assignee: Siddharth Ahuja

> Expose Monitoring Metrics For Netcat Source
> -------------------------------------------
>
>                 Key: FLUME-3078
>                 URL: https://issues.apache.org/jira/browse/FLUME-3078
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.7.0
>         Environment: Linux
>            Reporter: Gangaraju
>            Assignee: Siddharth Ahuja
>
> Currently, no metrics are shown for the Netcat source when we query over
> HTTP (-Dflume.monitoring.type=http).
> It would be better if we could get details on the important stats and errors
> for the Netcat source.
> We are looking for the following stats:
> EventReceivedCount
> EventAcceptedCount
> StartTime
> StopTime
> OpenConnectionCount
> ConnectionsFailed

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
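As a rough illustration of the counters requested above, here is a minimal sketch in plain Java. `NetcatMetricsSketch` and all of its method names are hypothetical and exist only for this example; the sketch does not use Flume's instrumentation classes, and how a real source would register these values with the HTTP monitoring endpoint is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for the requested stats; not part of Flume. */
final class NetcatMetricsSketch {
  private final AtomicLong eventReceivedCount = new AtomicLong();
  private final AtomicLong eventAcceptedCount = new AtomicLong();
  private final AtomicLong openConnectionCount = new AtomicLong();
  private final AtomicLong connectionsFailed = new AtomicLong();
  private volatile long startTime;
  private volatile long stopTime;

  void start() { startTime = System.currentTimeMillis(); stopTime = 0L; }
  void stop() { stopTime = System.currentTimeMillis(); }

  // Connection-level events.
  void connectionOpened() { openConnectionCount.incrementAndGet(); }
  void connectionClosed() { openConnectionCount.decrementAndGet(); }
  void connectionFailed() { connectionsFailed.incrementAndGet(); }

  // Event-level events: "received" from the client, "accepted" by the channel.
  void eventReceived() { eventReceivedCount.incrementAndGet(); }
  void eventAccepted() { eventAcceptedCount.incrementAndGet(); }

  /** Point-in-time view keyed by the stat names listed in the issue. */
  Map<String, Long> snapshot() {
    Map<String, Long> stats = new LinkedHashMap<>();
    stats.put("EventReceivedCount", eventReceivedCount.get());
    stats.put("EventAcceptedCount", eventAcceptedCount.get());
    stats.put("StartTime", startTime);
    stats.put("StopTime", stopTime);
    stats.put("OpenConnectionCount", openConnectionCount.get());
    stats.put("ConnectionsFailed", connectionsFailed.get());
    return stats;
  }

  /** Simulates one connection, two received events (one accepted) and one failed connection. */
  static Map<String, Long> demo() {
    NetcatMetricsSketch metrics = new NetcatMetricsSketch();
    metrics.start();
    metrics.connectionOpened();
    metrics.eventReceived();
    metrics.eventAccepted();
    metrics.eventReceived(); // received, but assumed to be rejected downstream
    metrics.connectionClosed();
    metrics.connectionFailed();
    metrics.stop();
    return metrics.snapshot();
  }
}
```

Atomic counters are used here on the assumption that several connection-handler threads would update the same values concurrently.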
Re: Review Request 60357: NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60357/
-----------------------------------------------------------

(Updated June 28, 2017, 4:58 a.m.)


Review request for Flume and Attila Simon.


Changes
-------

Updated code based on comments from Attila Simon.


Repository: flume-git


Description
-------

Review request for https://issues.apache.org/jira/browse/FLUME-2905, which is trying to prevent socket leaks when a netcat port is already bound to an existing process.


Diffs (updated)
-----

  flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java 9513902
  flume-ng-core/src/test/java/org/apache/flume/source/TestNetcatSource.java 99d413a


Diff: https://reviews.apache.org/r/60357/diff/2/

Changes: https://reviews.apache.org/r/60357/diff/1-2/


Testing
-------

I have tested the flume-ng executable generated from my changes, and I can confirm from the lsof output that the sockets do not keep increasing if the port to which the netcat source is trying to bind is already in use. The JUnit tests for the NetcatSource are also passing for me.


Thanks,

Siddharth Ahuja
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064494#comment-16064494 ]

Siddharth Ahuja commented on FLUME-2905:
----------------------------------------

Hey [~sati], thanks for your reply and comments on the review. I have uploaded a new patch to this JIRA based on your comments. I hope this is enough to get the fix out of the way! Thanks in advance for reviewing :)

> NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2905
>                 URL: https://issues.apache.org/jira/browse/FLUME-2905
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.6.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>         Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch, FLUME-2905-2.patch, FLUME-2905-3.patch, FLUME-2905-4.patch, FLUME-2905-5.patch, FLUME-2905-6.patch
>
>
> During the flume agent start-up, the flume configuration containing the
> NetcatSource is parsed and the source's start() is called. If there is an
> issue while binding the channel's socket to a local address and configuring
> it to listen for connections, the following exception is thrown, but the
> socket opened just before is not closed.
> {code}
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
> org.apache.flume.FlumeException: java.net.BindException: Address already in use
>         at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
>         at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.BindException: Address already in use
>         at sun.nio.ch.Net.bind0(Native Method)
>         at sun.nio.ch.Net.bind(Net.java:444)
>         at sun.nio.ch.Net.bind(Net.java:436)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>         at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
>         ... 9 more
> {code}
> The source's start() is then called again, leading to another socket being
> opened but not closed, and so on. This leads to file descriptor (socket) leaks.
> This can be easily reproduced as follows:
> 1. Set Netcat as the source in the flume agent configuration.
> 2. Set the bind port for the netcat source to a port which is already in use,
> e.g. in my case I used 50010, which is the port for the DataNode's XCeiver
> Protocol in use by the HDFS service.
> 3. Start the flume agent and run "lsof -p <pid> | wc -l" against its process
> ID. Notice that the file descriptors keep on growing due to socket leaks,
> with errors like: "can't identify protocol".
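The leak described above comes down to a channel that is opened and then abandoned when bind fails. A minimal sketch of the cleanup idea follows, using only the JDK. `BindOrCloseSketch` and its methods are hypothetical names for this illustration and are not the NetcatSource code or the attached patch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.nio.channels.ServerSocketChannel;

/** Hypothetical helper illustrating the cleanup; not the actual NetcatSource code. */
final class BindOrCloseSketch {

  /** Most recently opened channel, kept only so the demo below can inspect it. */
  static ServerSocketChannel lastOpened;

  /** Opens a channel and binds it; closes the channel again if the bind fails. */
  static ServerSocketChannel openAndBind(InetSocketAddress address) throws IOException {
    ServerSocketChannel channel = ServerSocketChannel.open();
    lastOpened = channel;
    try {
      channel.socket().bind(address);
      return channel;
    } catch (IOException bindFailure) {
      try {
        channel.close(); // release the descriptor before propagating the failure
      } catch (IOException closeFailure) {
        bindFailure.addSuppressed(closeFailure);
      }
      throw bindFailure;
    }
  }

  /** Occupies a port, tries to bind to it again, and reports whether the failed channel was closed. */
  static boolean failedBindLeavesChannelClosed() {
    InetAddress loopback = InetAddress.getLoopbackAddress();
    try (ServerSocket occupied = new ServerSocket(0, 50, loopback)) {
      InetSocketAddress taken = new InetSocketAddress(loopback, occupied.getLocalPort());
      try {
        openAndBind(taken).close(); // unexpected success: clean up and report failure
        return false;
      } catch (IOException expected) {
        return !lastOpened.isOpen();
      }
    } catch (IOException setupFailure) {
      throw new IllegalStateException("could not reserve a port for the demo", setupFailure);
    }
  }
}
```

Without the `channel.close()` call in the catch block, each failed attempt would leave one more open descriptor behind, which matches the growth seen in the lsof output.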
[jira] [Comment Edited] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064494#comment-16064494 ]

Siddharth Ahuja edited comment on FLUME-2905 at 6/27/17 9:00 AM:
-----------------------------------------------------------------

Hey [~sati], thanks for your reply and comments on the review. I have uploaded a new patch (patch #6) to this JIRA based on your comments. I hope this is enough to get the fix out of the way! Thanks in advance for reviewing :)


was (Author: sahuja):
Hey [~sati], thanks for your reply and comments on the review. I have uploaded a new patch to this JIRA based on your comments. I hope this is enough to get the fix out of the way! Thanks in advance for reviewing :)
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated FLUME-2905:
-----------------------------------
    Attachment: FLUME-2905-6.patch
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058799#comment-16058799 ]

Siddharth Ahuja commented on FLUME-2905:
----------------------------------------

Hey [~sati], just added the review request on the Review Board. Not sure if I have done everything as per the process. Would be great if you could check. Thanks in advance!
Review Request 60357: NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60357/
-----------------------------------------------------------

Review request for Flume and Attila Simon.


Repository: flume-git


Description
-------

Review request for https://issues.apache.org/jira/browse/FLUME-2905, which is trying to prevent socket leaks when a netcat port is already bound to an existing process.


Diffs
-----

  flume-ng-core/src/main/java/org/apache/flume/source/NetcatSource.java 9513902
  flume-ng-core/src/test/java/org/apache/flume/source/TestNetcatSource.java 99d413a


Diff: https://reviews.apache.org/r/60357/diff/1/


Testing
-------

I have tested the flume-ng executable generated from my changes, and I can confirm from the lsof output that the sockets do not keep increasing if the port to which the netcat source is trying to bind is already in use. The JUnit tests for the NetcatSource are also passing for me.


Thanks,

Siddharth Ahuja
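The lsof check described under Testing can be roughly approximated from inside a JVM. The sketch below is an assumption-laden illustration, not part of the review: it relies on the JDK-specific `com.sun.management.UnixOperatingSystemMXBean`, which is only available on Unix-like platforms, and `FdCountSketch` with its methods is a hypothetical name. It counts descriptors while a batch of channels is held open and again after they are closed.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.channels.ServerSocketChannel;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-process stand-in for counting lsof output lines; not Flume code. */
final class FdCountSketch {

  /** Open file descriptors of this JVM, or -1 when the platform bean does not expose them. */
  static long openFileDescriptors() {
    OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
    if (os instanceof UnixOperatingSystemMXBean) {
      return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
    }
    return -1L;
  }

  /** Returns {before, whileHeld, afterClose} descriptor counts around opening `count` channels. */
  static long[] sample(int count) {
    List<ServerSocketChannel> held = new ArrayList<>();
    long before = openFileDescriptors();
    try {
      for (int i = 0; i < count; i++) {
        held.add(ServerSocketChannel.open()); // each open channel owns one descriptor
      }
      long whileHeld = openFileDescriptors();
      for (ServerSocketChannel channel : held) {
        channel.close();
      }
      held.clear();
      long afterClose = openFileDescriptors();
      return new long[] {before, whileHeld, afterClose};
    } catch (IOException e) {
      throw new IllegalStateException("could not open or close a channel", e);
    } finally {
      // Only non-empty if an exception interrupted the loop above.
      for (ServerSocketChannel channel : held) {
        try {
          channel.close();
        } catch (IOException ignored) {
          // best effort during cleanup
        }
      }
    }
  }
}
```

The counts are process-wide, so unrelated activity in the JVM can shift them slightly; the numbers are best read as a trend across repeated samples, much like repeated lsof runs.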
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated FLUME-2905:
-----------------------------------
    Attachment: FLUME-2905-5.patch
[jira] [Comment Edited] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058745#comment-16058745 ]

Siddharth Ahuja edited comment on FLUME-2905 at 6/22/17 5:02 AM:
-----------------------------------------------------------------

Hi [~sati], thanks a lot for your review. Please find my answers for your points:
* For point 1. - "calling stop() after writing out the exception", I have moved stop() after logging the exception but just before it gets thrown.
* For point 2. - We should possibly have a dedicated JIRA for removing the "return" statement from the stop() method as this would be a different issue to what I am trying to fix in this JIRA which is to prevent socket leaks if a port is already bound. Also, it would make tracking easier with a new JIRA as otherwise any issues (if any) arising from this removal will be discussed in this JIRA which is a side-track from the original issue that is already potentially resolved. What do you think?
* For point 3 - I believe I have nothing to do here.

I have gone on and created a new patch - FLUME-2905-5.patch for your review. I have tested that to ensure that there are no leaks and the junit also passes for me. I will try and add this to review board (haven't done that yet) soon. Thanks once again.


was (Author: sahuja):
Hi [~sati], thanks a lot for your review. Please find my answers for your points:
* For point 1. - "calling stop() after writing out the exception", I have moved stop() after logging the exception but just before it gets thrown.
* For point 2. - We should possibly have a dedicated JIRA for removing the "return" statement from the stop() method as this would be a different issue to what I am trying to fix in this JIRA which is to prevent socket leaks if a port is already bound. Also, it would make tracking easier with a new JIRA as otherwise any issues (if any) arising from this removal will be discussed in this JIRA which is a side-track from the original issue that is already potentially resolved. What do you think?
* For point 3 - I believe I have nothing to do here.

I have gone on and created a new patch - FLUME-2905-5.patch for your review and I have also tested that to ensure that there are no leaks. I will try and add this to review board (haven't done that yet) soon. Thanks once again.
[jira] [Comment Edited] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058745#comment-16058745 ]

Siddharth Ahuja edited comment on FLUME-2905 at 6/22/17 4:55 AM:
-----------------------------------------------------------------

Hi [~sati], thanks a lot for your review. Please find my answers for your points:
* For point 1. - "calling stop() after writing out the exception", I have moved stop() after logging the exception but just before it gets thrown.
* For point 2. - We should possibly have a dedicated JIRA for removing the "return" statement from the stop() method as this would be a different issue to what I am trying to fix in this JIRA which is to prevent socket leaks if a port is already bound. Also, it would make tracking easier with a new JIRA as otherwise any issues (if any) arising from this removal will be discussed in this JIRA which is a side-track from the original issue that is already potentially resolved. What do you think?
* For point 3 - I believe I have nothing to do here.

I have gone on and created a new patch - FLUME-2905-5.patch for your review and I have also tested that to ensure that there are no leaks. I will try and add this to review board (haven't done that yet) soon. Thanks once again.


was (Author: sahuja):
Hi [~sati], thanks a lot for your review. Please find my answers for your points:
* For point 1. - "calling stop() after writing out the exception", I have moved stop() after logging the exception but just before it gets thrown.
* For point 2. - We should possibly have a dedicated JIRA for removing the "return" statement from the stop() method as this would be a different issue to what I am trying to fix in this JIRA which is to prevent socket leaks if a port is already bound. Also, it would make tracking easier with a new JIRA as otherwise any issues (if any) arising from this removal will be discussed in this JIRA which is a side-track from the original issue that is already potentially resolved. What do you think?
* For point 3 - I believe I have nothing to do here.

I have gone on and created a new patch - FLUME-2905-5.patch for your review and I have also tested that. I will try and add this to review board (haven't done that yet) soon. Thanks once again.
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058745#comment-16058745 ]

Siddharth Ahuja commented on FLUME-2905:

Hi [~sati], thanks a lot for your review. Here are my answers to your points:
* Point 1 ("calling stop() after writing out the exception"): I have moved the stop() call so that it runs after the exception is logged but just before it is thrown.
* Point 2: We should probably have a dedicated JIRA for removing the "return" statement from the stop() method, as that is a different issue from the one I am fixing here, which is to prevent socket leaks when a port is already bound. A new JIRA would also make tracking easier; otherwise any issues arising from the removal would be discussed in this JIRA, which would sidetrack from the original issue that is potentially already resolved. What do you think?
* Point 3: I believe I have nothing to do here.

I have created a new patch, FLUME-2905-5.patch, for your review. I will try to add it to Review Board soon (I haven't done that yet). Thanks once again.

> NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
> ---
>
> Key: FLUME-2905
> URL: https://issues.apache.org/jira/browse/FLUME-2905
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.6.0
> Reporter: Siddharth Ahuja
> Assignee: Siddharth Ahuja
> Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch, FLUME-2905-2.patch, FLUME-2905-3.patch, FLUME-2905-4.patch
>
> During Flume agent start-up, the Flume configuration containing the NetcatSource is parsed and the source's start() is called. If binding the channel's socket to a local address (to configure the socket to listen for connections) fails, the following exception is thrown, but the socket opened just before is not closed.
> {code}
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
> org.apache.flume.FlumeException: java.net.BindException: Address already in use
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
>     at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:444)
>     at sun.nio.ch.Net.bind(Net.java:436)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
>     ... 9 more
> {code}
> The source's start() is then called again, which opens another socket that is also never closed, and so on. This leads to file descriptor (socket) leaks.
> This can be easily reproduced as follows:
> 1. Set Netcat as the source in the Flume agent configuration.
> 2. Set the bind port for the netcat source to a port which is already in use, e.g. in my case I used 50010, which is the port for the DataNode's XCeiver Protocol in use by the HDFS service.
> 3. Start the Flume agent and run "lsof -p <PID> | wc -l". Notice that the number of file descriptors keeps growing due to socket leaks, with lsof entries like "can't identify protocol".

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
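The leak described above, and the ordering discussed in point 1 (log, clean up, then throw), can be illustrated with a small standalone sketch. This is not the code from any FLUME-2905 patch: the class `BindOrCloseSketch` and its methods are hypothetical names, and the sketch closes the channel directly, whereas the comment above describes calling the source's stop(). It only assumes the standard `java.nio` server-channel API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;

// Illustrative sketch only; class and method names are hypothetical, not Flume code.
class BindOrCloseSketch {

  /**
   * Binds the channel's socket to the address. If the bind fails, the channel
   * is closed before the exception is rethrown, so its descriptor is released.
   */
  static void bindOrClose(ServerSocketChannel channel, SocketAddress address) throws IOException {
    try {
      channel.socket().bind(address);
    } catch (IOException bindFailure) {
      // Report first, then clean up, then rethrow.
      System.err.println("Unable to bind to " + address + ": " + bindFailure);
      try {
        channel.close();
      } catch (IOException closeFailure) {
        bindFailure.addSuppressed(closeFailure);
      }
      throw bindFailure;
    }
  }

  /** Occupies a loopback port, attempts a second bind, and reports whether the failed channel ended up closed. */
  static boolean channelClosedAfterFailedBind() {
    try (ServerSocket occupied = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
      ServerSocketChannel channel = ServerSocketChannel.open();
      try {
        bindOrClose(channel, occupied.getLocalSocketAddress());
      } catch (IOException expected) {
        return !channel.isOpen();
      }
      // Unexpected: the second bind succeeded, so release the channel and report failure.
      channel.close();
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("closed after failed bind: " + channelClosedAfterFailedBind());
  }
}
```

Without the close in the catch block, each retry of start() would leave one more open channel behind, which is the growth that the lsof count in the reproduction steps makes visible.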
[jira] [Comment Edited] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16058745#comment-16058745 ] Siddharth Ahuja edited comment on FLUME-2905 at 6/22/17 4:54 AM: - Hi [~sati], thanks a lot for your review. Please find my answers for your points: * For point 1. - "calling stop() after writing out the exception", I have moved stop() after logging the exception but just before it gets thrown. * For point 2. - We should possibly have a dedicated JIRA for removing the "return" statement from the stop() method as this would be a different issue to what I am trying to fix in this JIRA which is to prevent socket leaks if a port is already bound. Also, it would make tracking easier with a new JIRA as otherwise any issues (if any) arising from this removal will be discussed in this JIRA which is a side-track from the original issue that is already potentially resolved. What do you think? * For point 3 - I believe I have nothing to do here. I have gone on and created a new patch - FLUME-2905-5.patch for your review and I have also tested that. I will try and add this to review board (haven't done that yet) soon. Thanks once again.
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2905: --- Attachment: FLUME-2905-4.patch
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050088#comment-16050088 ] Siddharth Ahuja commented on FLUME-2905: Hi [~denes], [~jarcec], I have just updated the JUnit tests again with my latest patch. Would you have time to review this for me, please? Hopefully the JUnit tests work this time around! Thank you in advance!
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16049006#comment-16049006 ] Siddharth Ahuja commented on FLUME-2905: Hi [~denes], I have just created another patch (patch #3) with a minor change to my JUnit tests. The JUnit tests are passing in my local build, so it would be great if you could test them out again. Thanks in advance for your help!
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2905: --- Attachment: FLUME-2905-3.patch
Re: [ANNOUNCE] New Flume committer - Denes Arvay
Congratulations Denes!

On Thu, May 25, 2017 at 7:28 AM, Tristan Stevens wrote:
> Congratulations Denes. Well deserved!
> Tristan
>
> On 22 May 2017 4:54 pm, "Attila Simon" wrote:
>
> Kudos to Denes! Well deserved!
>
> Cheers,
> Attila
>
> On Mon, May 22, 2017 at 7:07 AM, Mike Percy wrote:
>
> > On behalf of the Apache Flume PMC, I am very pleased to welcome Denes Arvay
> > as a committer on the Apache Flume project.
> >
> > Denes has put a lot of effort into improving the stability of Flume, most
> > recently focusing on identifying and fixing serious and hard-to-diagnose
> > issues including several bugs that could cause data loss.
> >
> > Congratulations and welcome, Denes!
> >
> > Best,
> > Mike
Re: [ANNOUNCE] Two new Flume committers
Congratulations Bessenyei and Jeff!! Well done!

Regards,
Siddharth

On Tue, Sep 20, 2016 at 11:11 AM, Johny Rufus John wrote:
> Congrats Bessenyei and Jeff !!
>
> Regards,
> Rufus
>
> On Mon, Sep 19, 2016 at 4:43 PM, Mike Percy wrote:
>
> > Hi Apache Flume community,
> >
> > I am very happy to announce that the Flume PMC has voted to add Bessenyei
> > Balázs Donát and Jeff Holoman as committers in recognition of their
> > contributions to Flume.
> >
> > Over the past few months, Donat has contributed and reviewed many patches,
> > more than any non-committer. He has contributed several bug fixes and
> > improvements and has shepherded important, long-forgotten patches through
> > the review and commit process, with more in-progress. He is also currently
> > working on improvements to the Flume configuration system.
> >
> > Jeff has contributed several important improvements to Flume in recent
> > months, including adding support for secure Kafka to Flume, improving the
> > AvroEventSerializer, and adding additional smarts to the HDFS sink.
> >
> > Please join me in congratulating them on their new committership!
> >
> > Best regards,
> > Mike
[jira] [Commented] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15466170#comment-15466170 ]

Siddharth Ahuja commented on FLUME-2966:

To provide a bit more context as requested: I raised this JIRA after investigating the following NPE stack trace, encountered by a customer running IBM WebSphere MQ with Flume 1.5.0:
{code}
ERROR org.apache.flume.source.jms.JMSSource Unexpected error processing events
java.lang.NullPointerException
    at org.apache.flume.source.jms.DefaultJMSMessageConverter.convert(DefaultJMSMessageConverter.java:101)
    at org.apache.flume.source.jms.JMSMessageConsumer.take(JMSMessageConsumer.java:124)
    at org.apache.flume.source.jms.JMSSource.doProcess(JMSSource.java:261)
    at org.apache.flume.source.AbstractPollableSource.process(AbstractPollableSource.java:58)
    at org.apache.flume.source.PollableSourceRunner$PollingRunner.run(PollableSourceRunner.java:137)
    at java.lang.Thread.run(Thread.java:745)
{code}
My analysis of the above:

a) Based on code inspection at https://github.com/apache/flume/blob/flume-1.5/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L101, event and textMessage cannot be null: a brand new event is created just before at https://github.com/apache/flume/blob/flume-1.5/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L74 and message is used earlier at https://github.com/apache/flume/blob/flume-1.5/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L77. If either of them were null, we would have hit an NPE with a different stack trace. So the only remaining possibility is that textMessage.getText() returns null.

b) To confirm that code using MQ frameworks can send null text messages to Flume, I set up ActiveMQ in Eclipse and created a sample producer and consumer based on http://activemq.apache.org/hello-world.html. The producer creates a text message (similar to this case) with a null text string for the destination queue:
{code}
…
// Create the destination (Topic or Queue)
Destination destination = session.createQueue("TEST.FOO");

// Create a MessageProducer from the Session to the Topic or Queue
MessageProducer producer = session.createProducer(destination);
producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

// Create a null text message
//String text = "Hello world! From: " + Thread.currentThread().getName() + " : " + this.hashCode();
String text = null;
TextMessage message = session.createTextMessage(text);
…
{code}
When I run the ActiveMQ application with this producer, no errors or exceptions occur while creating the text message with a null string. However, my consumer fails with an NPE while trying to get bytes from the null text:
{code}
…
if (message instanceof TextMessage) {
    TextMessage textMessage = (TextMessage) message;
    String text = textMessage.getText();
    text.getBytes(Charset.defaultCharset()); // <-- FAILS here because text is null
    System.out.println("Received: " + text);
} else {
    System.out.println("Received: " + message);
}
…
{code}
The above test confirms that code using MQ frameworks can send a text message with null text, so we need to check for this possibility in Flume so that these messages are ignored. I have also added a new JUnit test, testNullTextMessage(), which throws an NPE if my change to DefaultJMSMessageConverter in the patch is removed, further confirming the fix. I have attached the sample ActiveMQ code used for my testing to the JIRA as well.

Please let me know if this resolves your request [~bessbd] and whether the latest patch (FLUME-2966-1.patch) meets your expectations (I have removed the trailing whitespace and used assertEquals instead of assertTrue, as in the other test cases). Thanks again!

> NULL text in a TextMessage from a JMS source in Flume can lead to NPE
> ---
>
> Key: FLUME-2966
> URL: https://issues.apache.org/jira/browse/FLUME-2966
> Project: Flume
> Issue Type: Bug
> Affects Versions: v1.5.0
> Reporter: Siddharth Ahuja
> Assignee: Siddharth Ahuja
> Attach
[jira] [Updated] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2966: --- Attachment: FLUME-2966-1.patch App.java > NULL text in a TextMessage from a JMS source in Flume can lead to NPE > - > > Key: FLUME-2966 > URL: https://issues.apache.org/jira/browse/FLUME-2966 > Project: Flume > Issue Type: Bug > Reporter: Siddharth Ahuja > Assignee: Siddharth Ahuja > Attachments: App.java, FLUME-2966-0.patch, FLUME-2966-1.patch > > > Code at > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L103 > does not check for a NULL text in a TextMessage from a Flume JMS source. > This can lead to a NullPointerException here: > {code}textMessage.getText().getBytes(charset){code} while trying to > de-reference a null text from the textmessage. > We should probably skip these like the NULL Objects in the ObjectMessage just > below at: > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L107. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
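The null check proposed in the issue description can be sketched as follows. This is a minimal illustration and not the code from FLUME-2966-0.patch or FLUME-2966-1.patch: `NullTextSketch`, `bodyOf`, and the nested `TextMessage` interface are hypothetical stand-ins, chosen so the example compiles without a JMS jar (the real `javax.jms.TextMessage.getText()` also declares `JMSException`, which is left out here).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Illustrative sketch only; not the code from the FLUME-2966 patch.
class NullTextSketch {

  /** Hypothetical stand-in for javax.jms.TextMessage so the sketch needs no JMS jar. */
  interface TextMessage {
    String getText();
  }

  /** Returns the body bytes, or an empty Optional when the message carries null text. */
  static Optional<byte[]> bodyOf(TextMessage message, Charset charset) {
    String text = message.getText();
    if (text == null) {
      // Skip instead of dereferencing: calling text.getBytes(charset) here would throw an NPE.
      return Optional.empty();
    }
    return Optional.of(text.getBytes(charset));
  }

  public static void main(String[] args) {
    TextMessage nullText = () -> null;
    TextMessage hello = () -> "hello";
    System.out.println(bodyOf(nullText, StandardCharsets.UTF_8).isPresent());
    System.out.println(bodyOf(hello, StandardCharsets.UTF_8).map(b -> b.length).orElse(-1));
  }
}
```

Returning an empty result for null text mirrors the "skip these" suggestion in the description; whether the real converter should skip the body or drop the whole event is a design decision for the patch itself.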
[jira] [Commented] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15463788#comment-15463788 ] Siddharth Ahuja commented on FLUME-2966: Sorry, was away on leave [~bessbd]. Just posted a patch for review. Please kindly advise if I need to do something else to get this reviewed. Thanks!
[jira] [Updated] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2966: --- Attachment: FLUME-2966-0.patch
[jira] [Commented] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411150#comment-15411150 ] Siddharth Ahuja commented on FLUME-2966: Ah, I managed to get it done myself! :)
[jira] [Assigned] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned FLUME-2966: -- Assignee: Siddharth Ahuja > NULL text in a TextMessage from a JMS source in Flume can lead to NPE > - > > Key: FLUME-2966 > URL: https://issues.apache.org/jira/browse/FLUME-2966 > Project: Flume > Issue Type: Bug > Reporter: Siddharth Ahuja > Assignee: Siddharth Ahuja > > Code at > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L103 > does not check for a NULL text in a TextMessage from a Flume JMS source. > This can lead to a NullPointerException here: > {code}textMessage.getText().getBytes(charset){code} while trying to > de-reference a null text from the textmessage. > We should probably skip these like the NULL Objects in the ObjectMessage just > below at: > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L107. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
[ https://issues.apache.org/jira/browse/FLUME-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411148#comment-15411148 ] Siddharth Ahuja commented on FLUME-2966: Hi [~bessbd], thanks for checking. Would love to be able to work on the patch if you don't mind. Would it be possible for you to assign this to me? Thank you! > NULL text in a TextMessage from a JMS source in Flume can lead to NPE > - > > Key: FLUME-2966 > URL: https://issues.apache.org/jira/browse/FLUME-2966 > Project: Flume > Issue Type: Bug >Reporter: Siddharth Ahuja > > Code at > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L103 > does not check for a NULL text in a TextMessage from a Flume JMS source. > This can lead to a NullPointerException here: > {code}textMessage.getText().getBytes(charset){code} while trying to > de-reference a null text from the textmessage. > We should probably skip these like the NULL Objects in the ObjectMessage just > below at: > https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L107. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (FLUME-2966) NULL text in a TextMessage from a JMS source in Flume can lead to NPE
Siddharth Ahuja created FLUME-2966:
--
Summary: NULL text in a TextMessage from a JMS source in Flume can lead to NPE
Key: FLUME-2966
URL: https://issues.apache.org/jira/browse/FLUME-2966
Project: Flume
Issue Type: Bug
Reporter: Siddharth Ahuja

The code at https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L103 does not check for null text in a TextMessage received from a Flume JMS source. This can lead to a NullPointerException here: {code}textMessage.getText().getBytes(charset){code} when the null text of the TextMessage is dereferenced. We should probably skip such messages, as is already done for null objects in an ObjectMessage just below (see https://github.com/apache/flume/blob/trunk/flume-ng-sources/flume-jms-source/src/main/java/org/apache/flume/source/jms/DefaultJMSMessageConverter.java#L107).
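The null check proposed above could look roughly like the sketch below. It is only an illustration, not the attached patch: `TextMessageLike` and `NullSafeTextConversion` are hypothetical names, and the interface is a stand-in for `javax.jms.TextMessage` (whose real `getText()` also declares `JMSException`) so that the example compiles without a JMS jar.

```java
import java.nio.charset.Charset;

// Hypothetical stand-in for javax.jms.TextMessage; the real getText()
// additionally declares "throws JMSException".
interface TextMessageLike {
  String getText();
}

final class NullSafeTextConversion {
  private NullSafeTextConversion() {}

  // Returns the encoded text, or null when the message carries no text,
  // so the caller can skip setting an event body instead of hitting an NPE.
  static byte[] bodyOf(TextMessageLike message, Charset charset) {
    String text = message.getText();
    if (text == null) {
      return null; // nothing to encode; mirrors skipping a null object
    }
    return text.getBytes(charset);
  }
}
```

A caller would set the event body only when `bodyOf` returns a non-null array, which is one way to "skip" a message with null text.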
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297911#comment-15297911 ] Siddharth Ahuja commented on FLUME-2905: I have attached a new patch, FLUME-2905-2, for the above. Thanks again for the review. > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch, > FLUME-2905-2.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2905: --- Attachment: FLUME-2905-2.patch > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Siddharth Ahuja > Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch, > FLUME-2905-2.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297905#comment-15297905 ] Siddharth Ahuja commented on FLUME-2905: Thanks [~jarcec], I have found the issue. It has to do with how I set up my source in TestNetcatSource.java. Because I created a new NetcatSource in my test method and did not manually set up the channelProcessor for it, the test failed for you (although, for some reason, it worked fine for me in both my Eclipse and command-line environments). Regardless, I have modified the test so that it re-uses the existing source object, whose channelProcessor is already set up through the setup() method of the test suite. The point of the test is to ensure that when we start a source that tries to bind to a port which is already bound, the source is stopped. Creating a "new" source in the test is not important, so we can re-use the existing one. > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 > Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282411#comment-15282411 ] Siddharth Ahuja commented on FLUME-2905: Thanks [~jarcec], I have attached a new patch; hopefully this should be enough :) > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2905: --- Attachment: FLUME-2905-1.patch > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Siddharth Ahuja > Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch, FLUME-2905-1.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275659#comment-15275659 ]
Siddharth Ahuja commented on FLUME-2905:

Attached a patch that invokes stop() to stop the source and clean up the socket when a BindException is thrown because the port is already in use. It also moves the creation of the thread pool closer to where it is used, so that the pool does not have to be cleaned up during stop(). A JUnit test is also provided.

Tested the patch as follows:
• Started the HDFS (DataNode) service, which binds to port 50010, so the port is in use.
• Started the updated flume-agent with a configuration containing a Netcat source and a File Roll sink, with the bind port set to 50010.
• Inspected the flume logs and found the BindException caused by the port already being in use.
• Ran "ps auxx|grep -i flume" to get the flume process id.
• Ran "lsof -p | wc -l" multiple times to check whether the number of file descriptors was increasing. It was stable.
• Stopped the HDFS service to free up port 50010.
• Noticed from the flume-agent logs that the source finally started successfully, with the socket bound to 50010.
• From a new terminal, ran "nc localhost 50010" and entered some text.
• The text was written to the local filesystem successfully.

> NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources > Affects Versions: v1.6.0 > Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. 
> {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. > org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. 
Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
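The leak described in this issue, and the general shape of a fix, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code in the attached patches (which call the source's stop()): `BindOrClose` and `openAndBind` are hypothetical names, and only standard `java.nio` calls are used. The idea is that a channel opened just before a failed bind is closed before the exception propagates, so repeated start attempts do not accumulate open sockets.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

final class BindOrClose {
  private BindOrClose() {}

  // Opens a server channel and binds it to the given address. If the bind
  // fails (for example with java.net.BindException: Address already in use),
  // the freshly opened channel is closed before the exception is rethrown.
  static ServerSocketChannel openAndBind(InetSocketAddress address) throws IOException {
    ServerSocketChannel channel = ServerSocketChannel.open();
    try {
      channel.socket().bind(address);
      return channel;
    } catch (IOException bindFailure) {
      try {
        channel.close(); // release the descriptor that would otherwise leak
      } catch (IOException closeFailure) {
        bindFailure.addSuppressed(closeFailure);
      }
      throw bindFailure;
    }
  }
}
```

With this shape, a supervisor that retries start() on every failure opens and closes one channel per attempt, so the descriptor count reported by lsof should stay flat while the port remains occupied.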
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja updated FLUME-2905: --- Attachment: FLUME-2905-0.patch > NetcatSource - Socket not closed when an exception is encountered during > start() leading to file descriptor leaks > - > > Key: FLUME-2905 > URL: https://issues.apache.org/jira/browse/FLUME-2905 > Project: Flume > Issue Type: Bug > Components: Sinks+Sources >Affects Versions: v1.6.0 >Reporter: Siddharth Ahuja > Assignee: Siddharth Ahuja > Attachments: FLUME-2905-0.patch > > > During the flume agent start-up, the flume configuration containing the > NetcatSource is parsed and the source's start() is called. If there is an > issue while binding the channel's socket to a local address to configure the > socket to listen for connections following exception is thrown but the socket > open just before is not closed. > {code} > 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: > Unable to start EventDrivenSourceRunner: { > source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - > Exception follows. 
> org.apache.flume.FlumeException: java.net.BindException: Address already in > use > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173) > at > org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44) > at > org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:444) > at sun.nio.ch.Net.bind(Net.java:436) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) > at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167) > ... 9 more > {code} > The source's start() is then called again leading to another socket being > opened but not closed and so on. This leads to file descriptor (socket) leaks. > This can be easily reproduced as follows: > 1. Set Netcat as the source in flume agent configuration. > 2. Set the bind port for the netcat source to a port which is already in use. > e.g. in my case I used 50010 which is the port for DataNode's XCeiver > Protocol in use by the HDFS service. > 3. Start flume agent and perform "lsof -p | wc -l". 
Notice > the file descriptors keep on growing due to socket leaks with errors like: > "can't identify protocol".
[jira] [Updated] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated FLUME-2905:
---
    Description:
During the flume agent start-up, the flume configuration containing the NetcatSource is parsed and the source's start() is called. If there is an issue while binding the channel's socket to a local address to configure the socket to listen for connections, the following exception is thrown, but the socket opened just before is not closed.

{code}
2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
org.apache.flume.FlumeException: java.net.BindException: Address already in use
    at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
    at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
    ... 9 more
{code}

The source's start() is then called again, which opens another socket that is also never closed, and so on. This leads to file descriptor (socket) leaks.

This can be easily reproduced as follows:
1. Set Netcat as the source in the flume agent configuration.
2. Set the bind port for the netcat source to a port which is already in use, e.g. in my case I used 50010, which is the port for the DataNode's XCeiver Protocol in use by the HDFS service.
3. Start the flume agent and perform "lsof -p | wc -l". Notice that the file descriptors keep on growing due to socket leaks, with errors like: "can't identify protocol".

  was:
During the flume agent start-up, the flume configuration containing the NetcatSource is parsed and the source's start() is called. If there is an issue while binding the channel's socket to a local address to configure the socket to listen for connections following exception is thrown but the socket open just before is not closed.

2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
org.apache.flume.FlumeException: java.net.BindException: Address already in use
    at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67
[jira] [Assigned] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja reassigned FLUME-2905:
--
    Assignee: Siddharth Ahuja

> NetcatSource - Socket not closed when an exception is encountered during
> start() leading to file descriptor leaks
> -
>
>                 Key: FLUME-2905
>                 URL: https://issues.apache.org/jira/browse/FLUME-2905
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.6.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>
> During the flume agent start-up, the flume configuration containing the
> NetcatSource is parsed and the source's start() is called. If there is an
> issue while binding the channel's socket to a local address to configure the
> socket to listen for connections, the following exception is thrown, but the
> socket opened just before is not closed.
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor:
> Unable to start EventDrivenSourceRunner: {
> source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } -
> Exception follows.
> org.apache.flume.FlumeException: java.net.BindException: Address already in use
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
>     at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:444)
>     at sun.nio.ch.Net.bind(Net.java:436)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
>     ... 9 more
> The source's start() is then called again, which opens another socket that is
> also never closed, and so on. This leads to file descriptor (socket) leaks.
> This can be easily reproduced as follows:
> 1. Set Netcat as the source in the flume agent configuration.
> 2. Set the bind port for the netcat source to a port which is already in use,
> e.g. in my case I used 50010, which is the port for the DataNode's XCeiver
> Protocol in use by the HDFS service.
> 3. Start the flume agent and perform "lsof -p | wc -l". Notice that the file
> descriptors keep on growing due to socket leaks, with errors like:
> "can't identify protocol".

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
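
The failure mode described above (a channel is opened, bind() throws, and nothing closes the channel) can be illustrated with a minimal sketch. This is not the actual NetcatSource code or the patch under review; the class name SafeBindSketch and the helpers bindOrClose and closesChannelWhenPortIsBusy are hypothetical names chosen for illustration, and only standard java.nio calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

// Hypothetical illustration only; not taken from the Flume code base.
class SafeBindSketch {

    /**
     * Binds the channel to the address. If binding fails, the channel is
     * closed before the exception propagates, so a later retry cannot leak it.
     */
    static void bindOrClose(ServerSocketChannel channel, InetSocketAddress address)
            throws IOException {
        try {
            channel.socket().bind(address);
        } catch (IOException | RuntimeException e) {
            try {
                channel.close();
            } catch (IOException closeFailure) {
                e.addSuppressed(closeFailure);
            }
            throw e;
        }
    }

    /** Occupies a loopback port, then checks that a failed bind closes the second channel. */
    static boolean closesChannelWhenPortIsBusy() {
        try (ServerSocketChannel occupied = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port; it stays busy while 'occupied' is open.
            occupied.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int busyPort = occupied.socket().getLocalPort();

            ServerSocketChannel second = ServerSocketChannel.open();
            try {
                bindOrClose(second,
                        new InetSocketAddress(InetAddress.getLoopbackAddress(), busyPort));
                second.close();
                return false; // bind unexpectedly succeeded
            } catch (IOException expected) {
                return !second.isOpen();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("channel closed after failed bind: " + closesChannelWhenPortIsBusy());
    }
}
```

Closing inside the catch block keeps the success path unchanged: the caller still owns the bound channel, and only the failure path releases the descriptor.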
Re: JIRA Assignment - FLUME-2905
Thanks Roshan! Just did the assignment to myself :)

On Tue, May 3, 2016 at 9:34 AM, Roshan Naik wrote:
> Added you to contributors list. Can you try assigning to self now?
>
> On 5/2/16, 4:32 PM, "Siddharth Ahuja" wrote:
>
> >Hi there,
> >
> >I created a Flume JIRA yesterday:
> >https://issues.apache.org/jira/browse/FLUME-2905.
> >
> >Would it be possible for someone to assign this to me?
> >
> >Thanks in advance!
> >
> >Regards,
> >
> >Siddharth
JIRA Assignment - FLUME-2905
Hi there,

I created a Flume JIRA yesterday:
https://issues.apache.org/jira/browse/FLUME-2905.

Would it be possible for someone to assign this to me?

Thanks in advance!

Regards,

Siddharth
[jira] [Commented] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
[ https://issues.apache.org/jira/browse/FLUME-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266049#comment-15266049 ]

Siddharth Ahuja commented on FLUME-2905:

Trying to get it assigned to myself...

> NetcatSource - Socket not closed when an exception is encountered during
> start() leading to file descriptor leaks
> -
>
>                 Key: FLUME-2905
>                 URL: https://issues.apache.org/jira/browse/FLUME-2905
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.6.0
>            Reporter: Siddharth Ahuja
>
> During the flume agent start-up, the flume configuration containing the
> NetcatSource is parsed and the source's start() is called. If there is an
> issue while binding the channel's socket to a local address to configure the
> socket to listen for connections, the following exception is thrown, but the
> socket opened just before is not closed.
> 2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor:
> Unable to start EventDrivenSourceRunner: {
> source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } -
> Exception follows.
> org.apache.flume.FlumeException: java.net.BindException: Address already in use
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
>     at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
>     at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.BindException: Address already in use
>     at sun.nio.ch.Net.bind0(Native Method)
>     at sun.nio.ch.Net.bind(Net.java:444)
>     at sun.nio.ch.Net.bind(Net.java:436)
>     at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>     at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>     at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
>     ... 9 more
> The source's start() is then called again, which opens another socket that is
> also never closed, and so on. This leads to file descriptor (socket) leaks.
> This can be easily reproduced as follows:
> 1. Set Netcat as the source in the flume agent configuration.
> 2. Set the bind port for the netcat source to a port which is already in use,
> e.g. in my case I used 50010, which is the port for the DataNode's XCeiver
> Protocol in use by the HDFS service.
> 3. Start the flume agent and perform "lsof -p | wc -l". Notice that the file
> descriptors keep on growing due to socket leaks, with errors like:
> "can't identify protocol".
[jira] [Created] (FLUME-2905) NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
Siddharth Ahuja created FLUME-2905:
--
             Summary: NetcatSource - Socket not closed when an exception is encountered during start() leading to file descriptor leaks
                 Key: FLUME-2905
                 URL: https://issues.apache.org/jira/browse/FLUME-2905
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.6.0
            Reporter: Siddharth Ahuja

During the flume agent start-up, the flume configuration containing the NetcatSource is parsed and the source's start() is called. If there is an issue while binding the channel's socket to a local address to configure the socket to listen for connections, the following exception is thrown, but the socket opened just before is not closed.

2016-05-01 03:04:37,273 ERROR org.apache.flume.lifecycle.LifecycleSupervisor: Unable to start EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:src-1,state:IDLE} } - Exception follows.
org.apache.flume.FlumeException: java.net.BindException: Address already in use
    at org.apache.flume.source.NetcatSource.start(NetcatSource.java:173)
    at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:444)
    at sun.nio.ch.Net.bind(Net.java:436)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:214)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
    at org.apache.flume.source.NetcatSource.start(NetcatSource.java:167)
    ... 9 more

The source's start() is then called again, which opens another socket that is also never closed, and so on. This leads to file descriptor (socket) leaks.

This can be easily reproduced as follows:
1. Set Netcat as the source in the flume agent configuration.
2. Set the bind port for the netcat source to a port which is already in use, e.g. in my case I used 50010, which is the port for the DataNode's XCeiver Protocol in use by the HDFS service.
3. Start the flume agent and perform "lsof -p | wc -l". Notice that the file descriptors keep on growing due to socket leaks, with errors like: "can't identify protocol".
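
The reproduction steps above rely on start() being retried while the port stays busy. As a rough, self-contained simulation of that retry loop (not Flume code; the class RetryLeakSketch, the method openChannelsAfterRetries, and the closeOnFailure flag are hypothetical), the sketch below counts how many channels remain open after a number of failed bind attempts, with and without cleanup on failure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of repeated start() attempts; not taken from Flume.
class RetryLeakSketch {

    /** Returns how many channels are still open after the given number of failed binds. */
    static int openChannelsAfterRetries(int attempts, boolean closeOnFailure) {
        List<ServerSocketChannel> opened = new ArrayList<>();
        try (ServerSocketChannel occupied = ServerSocketChannel.open()) {
            // Hold a loopback port so that every later bind to it fails.
            occupied.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress busy = new InetSocketAddress(
                    InetAddress.getLoopbackAddress(), occupied.socket().getLocalPort());

            for (int i = 0; i < attempts; i++) {
                ServerSocketChannel channel = ServerSocketChannel.open();
                opened.add(channel);
                try {
                    channel.socket().bind(busy);
                } catch (IOException bindFailure) {
                    if (closeOnFailure) {
                        channel.close(); // release the descriptor before the next retry
                    }
                }
            }

            int stillOpen = 0;
            for (ServerSocketChannel channel : opened) {
                if (channel.isOpen()) {
                    stillOpen++;
                }
            }
            return stillOpen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Clean up whatever the simulation left open.
            for (ServerSocketChannel channel : opened) {
                try {
                    channel.close();
                } catch (IOException ignored) {
                    // best-effort cleanup
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + openChannelsAfterRetries(5, false));
        System.out.println("with cleanup:    " + openChannelsAfterRetries(5, true));
    }
}
```

Each channel left open holds one file descriptor, which is why the lsof count in step 3 keeps growing for as long as the retries continue.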