[ 
https://issues.apache.org/jira/browse/FLINK-17444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105372#comment-17105372
 ] 

Kostas Kloudas edited comment on FLINK-17444 at 5/12/20, 12:25 PM:
-------------------------------------------------------------------

What [~trohrmann] says is correct, and this seems to have been the case since 
AzureFS support was introduced in Flink. So AzureFS was never supported by the 
{{StreamingFileSink}}.

For the {{StreamingFileSink}} to work with the {{HadoopRecoverableWriter}}, the 
writer should be in the classpath and the check 
[here|https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableWriter.java#L60]
 should be changed to also accept {{wasb}}. 
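
To illustrate the kind of change meant here, the following is a minimal stand-alone sketch, not the actual Flink source. It assumes the existing check accepts only the {{hdfs}} scheme; the class and method names ({{RecoverableWriterSchemeCheck}}, {{isSupportedScheme}}, {{checkSupported}}) and the example URIs are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of a scheme check; NOT the code in HadoopRecoverableWriter.
final class RecoverableWriterSchemeCheck {

    // Assumption: "hdfs" is the only scheme accepted today; "wasb" is the proposed addition.
    private static final Set<String> SUPPORTED_SCHEMES =
            new HashSet<>(Arrays.asList("hdfs", "wasb"));

    // Returns true if the scheme is one the sketch treats as supported (case-insensitive).
    static boolean isSupportedScheme(String scheme) {
        return scheme != null
                && SUPPORTED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // Fails fast for any file system whose scheme is not in the supported set.
    static void checkSupported(URI fsUri) {
        if (!isSupportedScheme(fsUri.getScheme())) {
            throw new UnsupportedOperationException(
                    "Recoverable writers are not supported for scheme: " + fsUri.getScheme());
        }
    }

    public static void main(String[] args) {
        // Both of these pass the check in this sketch.
        checkSupported(URI.create("hdfs://namenode:8020/data"));
        checkSupported(URI.create("wasb://container@account.blob.core.windows.net/data"));

        // Any other scheme is still rejected.
        try {
            checkSupported(URI.create("s3://bucket/data"));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Relaxing the check alone would not resolve the {{NoClassDefFoundError}}; the writer class would still have to be available on the classpath.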

That said, I am not sure about the guarantees (e.g. consistency) that the 
underlying {{org.apache.hadoop.fs.azure.NativeAzureFileSystem}} offers, so I 
cannot comment on the correctness of the result or the performance (e.g. the 
cost of {{renaming}} a file).


> StreamingFileSink Azure HadoopRecoverableWriter class missing.
> --------------------------------------------------------------
>
>                 Key: FLINK-17444
>                 URL: https://issues.apache.org/jira/browse/FLINK-17444
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystems
>    Affects Versions: 1.10.0
>            Reporter: Marie May
>            Priority: Critical
>             Fix For: 1.11.0
>
>
> Hello, I was recently attempting to use the Streaming File Sink to store data 
> to Azure and got an exception saying that the HadoopRecoverableWriter class 
> is missing. When I searched to see whether anyone else had hit the issue, I 
> came across this post on [Stack Overflow|https://stackoverflow.com/questions/61246683/flink-streamingfilesink-on-azure-blob-storage].
> Seeing that no one had responded, I asked about it on the mailing list and 
> was told to submit the issue here.
> Below is the exception message they posted; the Stack Overflow post goes 
> into more detail about where they believe the issue comes from. 
> {code:java}
> java.lang.NoClassDefFoundError: org/apache/flink/fs/azure/common/hadoop/HadoopRecoverableWriter
>     at org.apache.flink.fs.azure.common.hadoop.HadoopFileSystem.createRecoverableWriter(HadoopFileSystem.java:202)
>     at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.createRecoverableWriter(PluginFileSystemFactory.java:129)
>     at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.createRecoverableWriter(SafetyNetWrapperFileSystem.java:69)
>     at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.<init>(Buckets.java:117)
>     at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink$RowFormatBuilder.createBuckets(StreamingFileSink.java:288)
>     at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.initializeState(StreamingFileSink.java:402)
>     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
>     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
>     at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:284)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1006)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
>     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
>     at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
