Hi Sohi,

The original failure was an interrupt in DFSOutputStream$DataStreamer.run.
The resulting exception was thrown in the timer callback that processes
files in CustomBucketingSink.
The task reported the failure to the JM, and the JM then triggered the job
cancellation.
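
For context, here is a minimal sketch of the general JDK mechanism behind
ClosedByInterruptException. It is not the Flink or HDFS code path from the
logs: the class name InterruptedWriteDemo and the temp file are made up for
illustration, and it uses a FileChannel where the logs show a socket channel.
A write on an interruptible NIO channel, made by a thread whose interrupt
flag is set, closes the channel and fails with this exception.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Flink or Hadoop.
public class InterruptedWriteDemo {

    /**
     * Writes to an interruptible channel while the calling thread's
     * interrupt flag is set. Returns true if the write failed with
     * ClosedByInterruptException and the channel was closed as a result.
     */
    public static boolean writeWhileInterrupted() throws IOException {
        Path tmp = Files.createTempFile("interrupt-demo", ".bin");
        try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            // Simulate an interrupt arriving before the I/O call,
            // e.g. from a task being cancelled.
            Thread.currentThread().interrupt();
            try {
                channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
                return false; // the write unexpectedly succeeded
            } catch (ClosedByInterruptException e) {
                // The JDK closes the channel before throwing.
                return !channel.isOpen();
            }
        } finally {
            // Clear the interrupt flag so the caller is not affected,
            // then remove the temp file.
            Thread.interrupted();
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("ClosedByInterruptException raised: "
                + writeWhileInterrupted());
    }
}
```

So the exception itself only says that the writing thread was interrupted;
what interrupted it still has to be found in the logs.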

I do not see this CustomBucketingSink in Flink code. Is it one of your
application classes?
Does it override BucketingSink.onProcessingTime?

2019-01-10 18:22:45,295 WARN  org.apache.hadoop.hdfs.DFSClient
                - Slow ReadProcessor read fields took 128378ms
(threshold=30000ms); ack: seqno: 10 reply: SUCCESS reply: SUCCESS
downstreamAckTimeNanos: 457753 flag: 0 flag: 0, targets:
[DatanodeInfoWithStorage[192.168.3.180:50010,DS-92b67356-e83f-410e-aeb4-e1f58b6cc69a,DISK],
DatanodeInfoWithStorage[192.168.3.185:50010
,DS-0dcac37b-4832-4b4e-b167-70762a3c6f34,DISK]]
2019-01-10 18:22:45,300 DEBUG
org.apache.flink.streaming.connectors.fs.CustomBucketingSink  - Moving
in-progress bucket
hdfs:/new_data_pipeline/prod/phase1/aggregated-data/item_agg/20190110/18/20/_part-1547124600014-9-0.in-progress
to pending file
hdfs:/new_data_pipeline/prod/phase1/aggregated-data/item_agg/20190110/18/20/_part-1547124600014-9-0.pending
2019-01-10 18:22:45,309 WARN  org.apache.hadoop.hdfs.DFSClient
                - DataStreamer Exception
java.nio.channels.ClosedByInterruptException
at
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:478)
at
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:618)
2019-01-10 18:22:45,319 INFO  org.apache.flink.runtime.taskmanager.Task
                 - Attempting to fail task externally item_agg-avro ->
Sink: item_agg (2/20) (3b85714e145ca9f6760757c6fb2203bb).
2019-01-10 18:22:45,319 INFO  org.apache.flink.runtime.taskmanager.Task
                 - item_agg-avro -> Sink: item_agg (2/20)
(3b85714e145ca9f6760757c6fb2203bb) switched from RUNNING to FAILED.
TimerException{java.nio.channels.ClosedByInterruptException}
at
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:288)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedByInterruptException
at
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:478)
at
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:618)
2019-01-10 18:22:45,483 INFO  org.apache.flink.runtime.taskmanager.Task
                 - Triggering cancellation of task code item_agg-avro ->
Sink: item_agg (2/20) (3b85714e145ca9f6760757c6fb2203bb).

I will also cc Kostas and Aljoscha; maybe they can help.

Best,
Andrey

On Wed, Jan 16, 2019 at 1:37 PM sohimankotia <sohimanko...@gmail.com> wrote:

> Hi Andrey,
>
> Please find the logs. I am attaching Dropbox links as the logs are large.
>
>
> Job Manager: https://www.dropbox.com/s/q0rd60coydupl6w/full.log.gz?dl=0
> Application:
> https://www.dropbox.com/s/cn3yrd273wd99f2/jm-sohan.log.gz?dl=0
>
>
> Thanks
> Sohi
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
