We reverted to BucketingSink and that works as expected. In conclusion, RollingFileSink needs Hadoop 2.7 or newer for its recoverable writer on the Hadoop file system.
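For comparison, a minimal BucketingSink setup along these lines might look like the sketch below; the HDFS path, bucket format, and element type are illustrative assumptions, not details from this thread.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.fs.StringWriter;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class BucketingSinkSketch {

    // BucketingSink recovers via truncate when available and otherwise falls
    // back to writing a ".valid-length" marker, so it also works on Hadoop 2.6.
    public static void attach(DataStream<String> events) {
        BucketingSink<String> sink =
                new BucketingSink<>("hdfs://namenode:8020/tmp/kafka-to-hdfs"); // illustrative path
        sink.setBucketer(new DateTimeBucketer<>("yyyy-MM-dd--HH")); // hourly buckets
        sink.setWriter(new StringWriter<>());                       // newline-delimited text
        sink.setBatchSize(128L * 1024 * 1024);                      // roll part files at ~128 MB
        events.addSink(sink);
    }
}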
On Sat, Feb 23, 2019 at 2:30 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:

> Any one ? I am sure there are hadoop 2.6 integrations with 1.7.1 OR I am
> overlooking something...
>
> On Fri, Feb 15, 2019 at 2:44 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>
>> Not sure, but it seems this
>> https://issues.apache.org/jira/browse/FLINK-10203 may be a connected issue.
>>
>> On Fri, Feb 15, 2019 at 11:57 AM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>
>>> That log does not appear. It looks like we have a chicken-and-egg issue.
>>>
>>> 2019-02-15 16:49:15,045 DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 10.246.221.10:50010
>>> 2019-02-15 16:49:15,045 DEBUG org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient - SASL client skipping handshake in unsecured configuration for addr = /10.246.221.10, datanodeId = DatanodeInfoWithStorage[10.246.221.10:50010,DS-c57a7667-f697-4f03-9fb1-532c5b82a9e8,DISK]
>>> 2019-02-15 16:49:15,072 DEBUG org.apache.flink.runtime.fs.hdfs.HadoopFsFactory - Instantiating for file system scheme hdfs Hadoop File System org.apache.hadoop.hdfs.DistributedFileSystem
>>> 2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal - dfs.client.use.legacy.blockreader.local = false
>>> 2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal - dfs.client.read.shortcircuit = false
>>> 2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal - dfs.client.domain.socket.data.traffic = false
>>> 2019-02-15 16:49:15,072 DEBUG org.apache.hadoop.hdfs.BlockReaderLocal - dfs.domain.socket.path =
>>> 2019-02-15 16:49:15,076 DEBUG org.apache.hadoop.io.retry.RetryUtils - multipleLinearRandomRetry = null
>>> 2019-02-15 16:49:15,076 DEBUG org.apache.hadoop.ipc.Client - getting client out of cache: org.apache.hadoop.ipc.Client@31920ade
>>> 2019-02-15 16:49:15,076 DEBUG org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil - DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
>>> 2019-02-15 16:49:15,080 INFO org.apache.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 3 initializing its state (max part counter=58).
>>> 2019-02-15 16:49:15,081 DEBUG org.apache.flink.streaming.api.functions.sink.filesystem.Buckets - Subtask 3 restoring: BucketState for bucketId=ls_kraken_events/dt=2019-02-14/evt=ad_fill and bucketPath=hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-02-14/evt=ad_fill, has open part file created @ 1550247946437
>>> 2019-02-15 16:49:15,085 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1270836494) connection to nn-crunchy.bf2.tumblr.net/10.246.199.154:8020 from root sending #56
>>> 2019-02-15 16:49:15,188 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1270836494) connection to nn-crunchy.bf2.tumblr.net/10.246.199.154:8020 from root got value #56
>>> 2019-02-15 16:49:15,196 INFO org.apache.flink.runtime.taskmanager.Task - Source: Custom Source -> (Sink: Unnamed, Process -> Timestamps/Watermarks) (4/4) (f73403ac4763c99e6a244cba3797f7e9) switched from RUNNING to FAILED.
>>>
>>> java.io.IOException: Missing data in tmp file: hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-02-14/evt=ad_fill/.part-3-32.inprogress.da2a75d1-0c83-47bc-9c83-950360c55c86
>>>     at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.<init>(HadoopRecoverableFsDataOutputStream.java:93)
>>>
>>> I do see
>>>
>>> 2019-02-15 16:47:33,582 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Current Hadoop/Kerberos user: root
>>> 2019-02-15 16:47:33,582 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.181-b13
>>> 2019-02-15 16:47:33,582 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Maximum heap size: 1204 MiBytes
>>> 2019-02-15 16:47:33,582 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - JAVA_HOME: /docker-java-home
>>> 2019-02-15 16:47:33,585 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner - Hadoop version: 2.7.5
>>>
>>> which is to be expected, given that we are running the Hadoop 2.7 build of Flink 1.7.1.
>>>
>>> Does it make sense to go with a Hadoop-free build and inject the required jar files? Has that been done by anyone?
>>>
>>> On Fri, Feb 15, 2019 at 2:33 AM Yun Tang <myas...@live.com> wrote:
>>>
>>>> Hi
>>>>
>>>> When 'RollingSink' tries to initialize state, it first checks whether the current file system supports the truncate method. If the file system does not support it, it uses a work-around, which means you should not hit this problem. Otherwise 'RollingSink' found the 'truncate' method via reflection even though the file system does not actually support it. You could enable DEBUG level logging to see whether the log below is printed:
>>>> Truncate not found. Will write a file with suffix '.valid-length' and prefix '_' to specify how many bytes in a bucket are valid.
>>>>
>>>> However, from your second email, the more serious problem is using 'Buckets' with Hadoop 2.6. As far as I know, the `RecoverableWriter` within 'Buckets' only supports Hadoop 2.7+; I'm not sure whether a work-around exists.
>>>>
>>>> Best
>>>> Yun Tang
>>>> ------------------------------
>>>> *From:* Vishal Santoshi <vishal.santo...@gmail.com>
>>>> *Sent:* Friday, February 15, 2019 3:43
>>>> *To:* user
>>>> *Subject:* Re: StandAlone job on k8s fails with "Unknown method truncate" on restore
>>>>
>>>> And yes, we cannot work with RollingFileSink on the Hadoop 2.6 release of 1.7.1 because of this:
>>>>
>>>> java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS and for Hadoop version 2.7 or newer
>>>>     at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.<init>(HadoopRecoverableWriter.java:57)
>>>>     at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.createRecoverableWriter(HadoopFileSystem.java:202)
>>>>     at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.createRecoverableWriter(SafetyNetWrapperFileSystem.java:69)
>>>>     at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.<init>(Buckets.java:112)
>>>>
>>>> Any work around?
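For context on the trace above: the sink the thread calls "RollingFileSink" is the 1.7 StreamingFileSink, whose Buckets class requests a RecoverableWriter from the Hadoop file system when it initializes. A rough sketch of how such a sink is typically built follows; the output path, bucket format, and element type are illustrative assumptions, not taken from the thread.

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;

public class StreamingFileSinkSketch {

    // The row-format builder creates Buckets, which asks the Hadoop FileSystem
    // for a RecoverableWriter -- the call that throws the
    // UnsupportedOperationException above on Hadoop versions older than 2.7.
    public static void attach(DataStream<String> events) {
        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("hdfs://namenode:8020/tmp/kafka-to-hdfs"), // illustrative path
                              new SimpleStringEncoder<String>("UTF-8"))
                .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd--HH")) // hourly buckets
                .build();
        events.addSink(sink);
    }
}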
>>>>
>>>> On Thu, Feb 14, 2019 at 1:42 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>
>>>> The job uses a RollingFileSink to push data to hdfs. Run an HA standalone cluster on k8s,
>>>>
>>>> * get the job running
>>>> * kill the pod.
>>>>
>>>> The k8s deployment relaunches the pod but fails with
>>>>
>>>> java.io.IOException: Missing data in tmp file: hdfs://nn-crunchy:8020/tmp/kafka-to-hdfs/ls_kraken_events/dt=2019-02-14/evt=ad_fill/.part-2-16.inprogress.449e8668-e886-4f89-b5f6-45ac68e25987
>>>>
>>>> Unknown method truncate called on org.apache.hadoop.hdfs.protocol.ClientProtocol protocol.
>>>>
>>>> The file does exist. We work with hadoop 2.6, which does not have truncate. The previous version would see that "truncate" was not supported, drop a .valid-length file for the .inprogress file, and rename it to a valid part file.
>>>>
>>>> Is this a known issue?
>>>>
>>>> Regards.
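As a simplified illustration of the capability check Yun Tang describes, the legacy sink's behavior amounts to probing for FileSystem#truncate via reflection and falling back to the '.valid-length' marker when the method is missing. This is a sketch, not Flink's actual code. Note that the probe only sees the client jars: with Hadoop 2.7.5 client libraries on the classpath but a 2.6 cluster, the method is found, yet the NameNode still rejects the call with "Unknown method truncate", which matches the behavior reported above.

import java.lang.reflect.Method;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncateProbe {

    // Reflectively look up FileSystem#truncate(Path, long), which exists only
    // in Hadoop 2.7+. A null result is the signal to fall back to writing a
    // ".valid-length" file instead of truncating the in-progress part file.
    public static Method findTruncate() {
        try {
            return FileSystem.class.getMethod("truncate", Path.class, long.class);
        } catch (NoSuchMethodException e) {
            return null; // Hadoop < 2.7 client: no truncate available
        }
    }
}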