[ 
https://issues.apache.org/jira/browse/FLINK-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-10989:
----------------------------------
    Description: 
The {{OrcRowInputFormat}} seems to use two different {{FileSystem}} 
implementations: Flink's {{FileSystem}} for listing the files and generating 
the {{InputSplits}}, and then Hadoop's {{FileSystem}} for actually reading the 
input splits. This can be problematic if one only configures Flink's S3 
{{FileSystem}} but does not provide an S3 implementation for Hadoop's 
{{FileSystem}}.

I think this behaviour is not intuitive and can lead to hard-to-debug 
problems for a user.

  was:The {{OrcRowInputFormat}} seems to use two different {{FileSystem}} 
implementations: Flink's {{FileSystem}} for listing the files and generating 
the {{InputSplits}}, and then Hadoop's {{FileSystem}} for actually reading the 
input splits. This can be problematic if one only configures Flink's S3 
{{FileSystem}} but does not provide an S3 implementation for Hadoop's 
{{FileSystem}}.


> OrcRowInputFormat uses two different file systems
> -------------------------------------------------
>
>                 Key: FLINK-10989
>                 URL: https://issues.apache.org/jira/browse/FLINK-10989
>             Project: Flink
>          Issue Type: Bug
>          Components: Batch Connectors and Input/Output Formats
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Priority: Major
>
> The {{OrcRowInputFormat}} seems to use two different {{FileSystem}} 
> implementations: Flink's {{FileSystem}} for listing the files and generating 
> the {{InputSplits}}, and then Hadoop's {{FileSystem}} for actually reading 
> the input splits. This can be problematic if one only configures Flink's S3 
> {{FileSystem}} but does not provide an S3 implementation for Hadoop's 
> {{FileSystem}}.
> I think this behaviour is not intuitive and can lead to hard-to-debug 
> problems for a user.
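
The failure mode described above can be illustrated with a small, 
self-contained sketch. It does not use Flink's or Hadoop's actual classes: 
{{Registry}}, {{listThenRead}} and the scheme sets are hypothetical stand-ins, 
and the assumption is only that each side resolves a URI scheme against its 
own, independently configured set of file systems.

```java
import java.util.HashSet;
import java.util.Set;

public class TwoFileSystemsSketch {

    /** Hypothetical scheme registry; NOT a Flink or Hadoop class. */
    static final class Registry {
        private final String name;
        private final Set<String> schemes = new HashSet<>();

        Registry(String name, String... supportedSchemes) {
            this.name = name;
            for (String s : supportedSchemes) {
                schemes.add(s);
            }
        }

        /** Resolves the URI's scheme, or throws if this registry does not know it. */
        String resolve(String uri) {
            String scheme = uri.substring(0, uri.indexOf("://"));
            if (!schemes.contains(scheme)) {
                throw new IllegalStateException(
                        name + ": no file system for scheme '" + scheme + "'");
            }
            return name + ":" + scheme;
        }
    }

    /** Lists splits via one registry, then tries to read them via the other. */
    static String listThenRead(Registry listing, Registry reading, String uri) {
        String lister = listing.resolve(uri); // split generation succeeds here
        try {
            String reader = reading.resolve(uri);
            return "listed via " + lister + ", read via " + reader;
        } catch (IllegalStateException e) {
            // The error only surfaces at read time, after splits were created.
            return "listed via " + lister + ", read FAILED (" + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        // Assumed setup: S3 is configured on the listing side only.
        Registry flink = new Registry("flink", "file", "s3");
        Registry hadoop = new Registry("hadoop", "file");

        System.out.println(listThenRead(flink, hadoop, "file:///tmp/data.orc"));
        System.out.println(listThenRead(flink, hadoop, "s3://bucket/data.orc"));
    }
}
```

In this sketch the {{file}} path works on both sides, while the {{s3}} path is 
listed successfully and fails only when it is read, which is what makes the 
problem hard to trace back to a missing file system configuration.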



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
