[jira] [Issue Comment Edited] (FLUME-1175) RollingFileSink complains of Bad File Descriptor upon a reconfig event

Will McQueen (JIRA) Thu, 03 May 2012 03:49:14 -0700

    [ https://issues.apache.org/jira/browse/FLUME-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267338#comment-13267338 ]


Will McQueen edited comment on FLUME-1175 at 5/3/12 10:48 AM:
--------------------------------------------------------------

I'm not sure, but here is what I think may be happening:
1) A reconfig event occurs.
2) RollingFileSink finishes executing its process() method, or the sink 
runner thread that's calling RollingFileSink.process() gets interrupted 
partway through the call. Either way, I believe process() can return while 
the outputStream field still holds a non-null reference to a 
BufferedOutputStream object.
3) As part of the reconfig steps, the sink's configure() is called next. 
configure() sets the 'directory' field, but the outputStream field still 
holds the old value.
4) Next, stop() is called (again as part of the reconfig steps). The 
outputStream field's object is flushed and closed, but the field is not 
nulled out afterwards.
5) Next, start() is called. The reconfig steps are done.
6) Next, the sink runner calls process(), which checks whether 
outputStream != null. That check is true, since outputStream still points to 
the old, closed BufferedOutputStream, so process() presumably skips opening a 
new file and writes to the closed stream, which would explain the Bad File 
Descriptor error.
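
To illustrate the last step in isolation, here is a minimal sketch using only java.io. It is not Flume code: the class name StaleStreamDemo and its method are hypothetical, and it only assumes that the sink's stream is a BufferedOutputStream wrapping a FileOutputStream, as the stack trace in the issue suggests. It closes the stream, keeps the reference, and then writes and flushes through it:

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone demo, not part of Flume.
class StaleStreamDemo {

  /** Returns true if writing through an already-closed stream fails with an IOException. */
  static boolean writeAfterCloseFails() {
    OutputStream out;
    try {
      File file = File.createTempFile("stale-stream-demo", ".log");
      file.deleteOnExit();
      out = new BufferedOutputStream(new FileOutputStream(file));
      out.write("event 1\n".getBytes(StandardCharsets.UTF_8));
      out.flush();
      // Mirrors what stop() does: the stream is closed,
      // but the reference is kept instead of being set to null.
      out.close();
    } catch (IOException e) {
      throw new UncheckedIOException("setup failed", e);
    }

    try {
      // This write only fills the in-memory buffer, so it does not fail yet.
      out.write("event 2\n".getBytes(StandardCharsets.UTF_8));
      // flush() pushes the buffer to the closed FileOutputStream and fails there.
      out.flush();
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("write after close failed: " + writeAfterCloseFails());
  }
}
```

The failure surfaces in flush() rather than write(), which matches the BufferedOutputStream.flush frame in the reported stack trace. The exact exception message depends on the JDK and platform, so the sketch checks only for an IOException.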

So one possible fix could be to null out the outputStream field in stop() 
(and also null out the serializer field while we're at it), something like 
this:
{code}
if (serializer != null) {
  try {
    serializer.flush();
    serializer.beforeClose();
  } catch (IOException e) {
    logger.error("Unable to cleanup serializer. Exception follows.", e);
  } finally {
    serializer = null;
  }
}

if (outputStream != null) {
  logger.debug("Closing file {}", pathController.getCurrentFile());
  try {
    outputStream.flush();
    outputStream.close();
  } catch (IOException e) {
    logger.error("Unable to close outputStream. Exception follows.", e);
  } finally {
    outputStream = null;
  }
}
{code}
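
As a rough, self-contained sketch of why nulling the field should help, the toy class below imitates the lifecycle described above. MiniFileSink, its file naming, and runReconfigCycle() are hypothetical and are not the real RollingFileSink; the only assumption carried over is that process() opens a new file when outputStream is null.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical toy sink, not the real RollingFileSink.
class MiniFileSink {
  private final File directory;
  private OutputStream outputStream;
  private int filesOpened = 0;

  MiniFileSink(File directory) {
    this.directory = directory;
  }

  void process(String event) throws IOException {
    // Open a new file only when no stream is currently held.
    if (outputStream == null) {
      File file = new File(directory, "events-" + filesOpened + ".log");
      outputStream = new BufferedOutputStream(new FileOutputStream(file));
      filesOpened++;
    }
    outputStream.write(event.getBytes(StandardCharsets.UTF_8));
    outputStream.flush();
  }

  void stop() {
    if (outputStream != null) {
      try {
        outputStream.flush();
        outputStream.close();
      } catch (IOException e) {
        System.err.println("Unable to close outputStream: " + e);
      } finally {
        // The proposed fix: drop the closed stream so process() reopens a file.
        outputStream = null;
      }
    }
  }

  /** Simulates process, stop (reconfig), process; returns how many files were opened. */
  static int runReconfigCycle() {
    try {
      File dir = Files.createTempDirectory("mini-file-sink").toFile();
      MiniFileSink sink = new MiniFileSink(dir);
      sink.process("before reconfig\n");
      sink.stop();
      sink.process("after reconfig\n"); // succeeds because the field was nulled
      sink.stop();
      return sink.filesOpened;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("files opened: " + runReconfigCycle());
  }
}
```

Running main prints "files opened: 2": the second process() call sees a null field and opens a fresh file. Without the finally block in stop(), the second call would instead reuse the closed stream and fail in flush().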
                
                  
> RollingFileSink complains of Bad File Descriptor upon a reconfig event
> ----------------------------------------------------------------------
>
>                 Key: FLUME-1175
>                 URL: https://issues.apache.org/jira/browse/FLUME-1175
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.2.0
>         Environment: CentOS 6.2 64-bit
>            Reporter: Will McQueen
>             Fix For: v1.2.0
>
>
> Steps:
> 1) Create a config file that looks something like this:
> agent.channels = c1
> agent.sources = r1
> agent.sinks = k1
> #
> agent.channels.c1.type = MEMORY
> #
> agent.sources.r1.channels = c1
> agent.sources.r1.type = SEQ
> #
> agent.sinks.k1.channel = c1
> agent.sinks.k1.type = FILE_ROLL
> agent.sinks.k1.sink.directory = /var/log/flume-ng
> agent.sinks.k1.sink.rollInterval = 0
> 2) Start the Flume NG agent
> 3) touch the config file so that a reconfig event is triggered within 30 secs
> 4) tail the output file to observe the sequence generator events:
> tail -f /var/log/flume-ng/XXXXXXXXXXXX
> 5) Notice that the flow suddenly stops at the reconfig event (within 30 secs 
> after touching the config file). Flow doesn't continue. The flume log shows a 
> Bad File Descriptor error for the RollingFileSink:
> 2012-05-03 01:34:34,806 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:160)] Unable to deliver event. Exception follows.
> org.apache.flume.EventDeliveryException: Failed to process event: [Event headers = {timestamp=1336034074797, nanos=3762297996593382, pri=INFO, host=<mysupersecrethost>, FlumeOG=yes, execcmd=java.nio.HeapByteBuffer[pos=0 lim=24 cap=24], procsource=java.nio.HeapByteBuffer[pos=0 lim=6 cap=6], service=java.nio.HeapByteBuffer[pos=0 lim=4 cap=4]}, body.length = 26 ]
>         at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:201)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Bad file descriptor
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:282)
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>         at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:193)
>         ... 3 more

