It seems the problem only occurs when using a load-balancing sink processor with a group of sinks that use LZO compression.
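For context, the setup being described looks roughly like the sketch below. The agent, sink, and channel names are illustrative, not taken from the thread: two HDFS sinks grouped under a load-balancing processor, both writing LZO-compressed files into the same directory with the same file prefix, so both sinks can open a .tmp file under the same name.

    # Illustrative sketch only -- agent/sink/channel names are hypothetical;
    # source and channel definitions are omitted for brevity.
    # Assumes the hadoop-lzo codec jar is on the Flume classpath.
    agent.sinks = sink1 sink2
    agent.channels = ch1

    # Two sinks draining the same channel, balanced across the group.
    agent.sinkgroups = g1
    agent.sinkgroups.g1.sinks = sink1 sink2
    agent.sinkgroups.g1.processor.type = load_balance

    agent.sinks.sink1.type = hdfs
    agent.sinks.sink1.channel = ch1
    agent.sinks.sink1.hdfs.path = /user/log
    agent.sinks.sink1.hdfs.filePrefix = data
    agent.sinks.sink1.hdfs.fileType = CompressedStream
    agent.sinks.sink1.hdfs.codeC = lzop

    agent.sinks.sink2.type = hdfs
    agent.sinks.sink2.channel = ch1
    agent.sinks.sink2.hdfs.path = /user/log
    agent.sinks.sink2.hdfs.filePrefix = data
    agent.sinks.sink2.hdfs.fileType = CompressedStream
    agent.sinks.sink2.hdfs.codeC = lzop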
Looking into the details.

2015-09-08 10:19 GMT+08:00 Shady Xu <[email protected]>:

> Using different prefixes does not fix the problem. Any other ideas?
>
> 2015-09-07 10:19 GMT+08:00 Shady Xu <[email protected]>:
>
>> Yes, I have several sinks that all write to subdirectories of
>> /user/data. Among them, two sinks, grouped under a load-balancing sink
>> processor, write to the same directory. I will try setting different
>> prefixes for the load-balanced sinks.
>>
>> If you don't see this as a bug, please make it clear in the
>> documentation.
>>
>> 2015-09-07 0:24 GMT+08:00 Hari Shreedharan <[email protected]>:
>>
>>> Do you have multiple sinks writing to the same directory? If yes, that
>>> could cause issues like this. Can you use different prefixes for each
>>> sink if you want them to write to the same directory?
>>>
>>>
>>> On Sunday, September 6, 2015, Shady Xu <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Has anyone experienced the exception below? I am using LZO
>>>> compression, and when local data that hasn't been uploaded to HDFS
>>>> accumulates to gigabytes in size, this error happens and Flume cannot
>>>> recover from it.
>>>>
>>>> 29 Aug 2015 18:50:37,031 WARN
>>>> [SinkRunner-PollingRunner-LoadBalancingSinkProcessor]
>>>> (org.apache.flume.sink.hdfs.BucketWriter.append:555) - Caught
>>>> IOException writing to HDFSWriter (write beyond end of stream).
>>>> Closing file (/user/log/data.1440845433925.lzo.tmp) and rethrowing
>>>> exception.
>>>> 29 Aug 2015 18:50:37,039 INFO
>>>> [SinkRunner-PollingRunner-LoadBalancingSinkProcessor]
>>>> (org.apache.flume.sink.hdfs.BucketWriter.close:363) - Closing
>>>> /user/log/data.1440845433925.lzo.tmp
>>>> 29 Aug 2015 18:50:37,064 INFO [hdfs-sink2-call-runner-3]
>>>> (org.apache.flume.sink.hdfs.BucketWriter$8.call:629) - Renaming
>>>> /user/log/data.1440845433925.lzo.tmp to /user/log/data.1440845433925.lzo
>>>> 29 Aug 2015 18:50:37,069 INFO
>>>> [SinkRunner-PollingRunner-LoadBalancingSinkProcessor]
>>>> (org.apache.flume.sink.hdfs.HDFSEventSink$1.run:394) - Writer callback
>>>> called.
>>>> 29 Aug 2015 18:50:37,086 WARN
>>>> [SinkRunner-PollingRunner-LoadBalancingSinkProcessor]
>>>> (org.apache.flume.sink.hdfs.HDFSEventSink.process:455) - HDFS IO error
>>>> java.io.IOException: write beyond end of stream
>>>>     at com.hadoop.compression.lzo.LzopOutputStream.write(LzopOutputStream.java:134)
>>>>     at java.io.OutputStream.write(OutputStream.java:75)
>>>>     at org.apache.flume.serialization.BodyTextEventSerializer.write(BodyTextEventSerializer.java:71)
>>>>     at org.apache.flume.sink.hdfs.HDFSCompressedDataStream.append(HDFSCompressedDataStream.java:126)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter$7.call(BucketWriter.java:550)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter$7.call(BucketWriter.java:547)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter$9$1.run(BucketWriter.java:679)
>>>>     at org.apache.flume.auth.SimpleAuthenticator.execute(SimpleAuthenticator.java:50)
>>>>     at org.apache.flume.sink.hdfs.BucketWriter$9.call(BucketWriter.java:676)
>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:724)
>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Hari
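For reference, Hari's suggestion amounts to giving each sink in the group its own file prefix, so the in-progress .tmp files in the shared directory can never collide. Using the illustrative names from the sketch above, only two lines change (though, as reported earlier in the thread, this did not resolve the problem in this case):

    # Distinct prefixes per sink -- each sink now names its .tmp files
    # uniquely, even though both still write into /user/log.
    agent.sinks.sink1.hdfs.filePrefix = data.sink1
    agent.sinks.sink2.hdfs.filePrefix = data.sink2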
