[jira] [Updated] (BEAM-2429) Conflicting filesystems with used of HadoopFileSystem

2017-06-09 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

François Wagner updated BEAM-2429:
--
Description: 
I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks 
like HadoopFileSystem is registering itself under the `file` scheme 
(https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
hence the following exception is thrown when trying to register 
HadoopFileSystem.

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
    at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

What is the correct way to handle `hdfs` URLs out of the box with TextIO and 
AvroIO?
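{code:java}
// Imports this snippet relies on (Beam Java SDK and its Hadoop file system module).
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Pass the Hadoop configuration as a JSON list through --hdfsConfiguration.
String[] args = new String[]{
    "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"};
// Parse and validate the arguments as HadoopFileSystemOptions.
HadoopFileSystemOptions options = PipelineOptionsFactory
    .fromArgs(args)
    .withValidation()
    .as(HadoopFileSystemOptions.class);
// Create the pipeline from the parsed options.
Pipeline pipeline = Pipeline.create(options);
{code}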
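
To make the failure mode concrete, here is a minimal, self-contained sketch of the kind of uniqueness check that could produce the exception above. It is not Beam's actual `FileSystems` code: the class `SchemeRegistrySketch`, the `Registration` type and the simplified `verifySchemesAreUnique` signature are assumptions made purely for illustration. It only shows that two file systems claiming the same scheme (`file` here) are rejected, while distinct schemes (`file` and `hdfs`) pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a scheme-to-filesystem registry; not Beam's implementation. */
class SchemeRegistrySketch {

  /** A registered file system, reduced to the URI scheme it claims and its class name. */
  static final class Registration {
    final String scheme;
    final String fileSystemName;

    Registration(String scheme, String fileSystemName) {
      this.scheme = scheme;
      this.fileSystemName = fileSystemName;
    }
  }

  /**
   * Groups registrations by scheme and throws if any scheme is claimed by more
   * than one file system. Returns the scheme-to-filesystem map otherwise.
   */
  static Map<String, String> verifySchemesAreUnique(List<Registration> registrations) {
    Map<String, List<String>> byScheme = new TreeMap<>();
    for (Registration r : registrations) {
      byScheme.computeIfAbsent(r.scheme, k -> new ArrayList<>()).add(r.fileSystemName);
    }
    Map<String, String> unique = new TreeMap<>();
    for (Map.Entry<String, List<String>> e : byScheme.entrySet()) {
      if (e.getValue().size() > 1) {
        throw new IllegalStateException(String.format(
            "Scheme: [%s] has conflicting filesystems: %s", e.getKey(), e.getValue()));
      }
      unique.put(e.getKey(), e.getValue().get(0));
    }
    return unique;
  }

  public static void main(String[] args) {
    // Distinct schemes: each file system owns its own scheme, so the check passes.
    List<Registration> distinct = Arrays.asList(
        new Registration("file", "org.apache.beam.sdk.io.LocalFileSystem"),
        new Registration("hdfs", "org.apache.beam.sdk.io.hdfs.HadoopFileSystem"));
    System.out.println(verifySchemesAreUnique(distinct));

    // Same scheme claimed twice: mirrors the situation described in this issue.
    List<Registration> conflicting = Arrays.asList(
        new Registration("file", "org.apache.beam.sdk.io.LocalFileSystem"),
        new Registration("file", "org.apache.beam.sdk.io.hdfs.HadoopFileSystem"));
    try {
      verifySchemesAreUnique(conflicting);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Running `main` prints the resolved scheme map for the first list and the conflict message for the second. Whether the real registration resolves `hdfs` or `file` depends on how HadoopFileSystem derives its scheme, which this sketch does not model.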


  was:
I'm facing issue when trying to use HadoopFileSystem in my pipeline. It looks 
like HadoopFileSystem is registring itself under the `file` schema 
(https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
 hence the following Exception is thrown when trying to register 
HadoopFileSystem.

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: 
[org.apache.beam.sdk.io.LocalFileSystem, 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
at 
org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

What is the correct way to handle `hdfs` url out of the box with TextIO & 
AvroIO ?
{code:java}
String[] args = new String[]{
"--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": 
\"true\"}]"};
HadoopFileSystemOptions options = PipelineOptionsFactory
.fromArgs(args)
.withValidation()
.as(HadoopFileSystemOptions.class);
Pipeline pipeline = Pipeline.create(options);
configuration.add(config);
options.setHdfsConfiguration(configuration);
Pipeline pipeline = Pipeline.create(options); 
{code}



> Conflicting filesystems with used of HadoopFileSystem
> -
>
> Key: BEAM-2429
> URL: https://issues.apache.org/jira/browse/BEAM-2429
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Affects Versions: 2.0.0
>Reporter: François Wagner
>Assignee: Davor Bonaci



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2429) Conflicting filesystems with used of HadoopFileSystem

2017-06-09 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

François Wagner updated BEAM-2429:
--
Description: 
I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks 
like HadoopFileSystem is registering itself under the `file` scheme 
(https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
hence the following exception is thrown when trying to register 
HadoopFileSystem.

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
    at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

What is the correct way to handle `hdfs` URLs out of the box with TextIO and 
AvroIO?
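{code:java}
// Imports this snippet relies on (Beam Java SDK and its Hadoop file system module).
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

// Pass the Hadoop configuration as a JSON list through --hdfsConfiguration.
String[] args = new String[]{
    "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"};
// Parse and validate the arguments as HadoopFileSystemOptions.
HadoopFileSystemOptions options = PipelineOptionsFactory
    .fromArgs(args)
    .withValidation()
    .as(HadoopFileSystemOptions.class);
// Create the pipeline once from the parsed options.
Pipeline pipeline = Pipeline.create(options);
{code}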


  was:
I'm facing issue when trying to use HadoopFileSystem in my pipeline. It looks 
like HadoopFileSystem is registring itself under the `file` schema 
(https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79),
 hence the following Exception is thrown when trying to register 
HadoopFileSystem.

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: 
[org.apache.beam.sdk.io.LocalFileSystem, 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
at 
org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

What is the correct way to handle `hdfs` url out of the box with TextIO & 
AvroIO ?

String[] args = new String[]{
"--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": 
\"true\"}]"};
HadoopFileSystemOptions options = PipelineOptionsFactory
.fromArgs(args)
.withValidation()
.as(HadoopFileSystemOptions.class);
Pipeline pipeline = Pipeline.create(options);
configuration.add(config);
options.setHdfsConfiguration(configuration);
Pipeline pipeline = Pipeline.create(options); 




> Conflicting filesystems with used of HadoopFileSystem
> -
>
> Key: BEAM-2429
> URL: https://issues.apache.org/jira/browse/BEAM-2429
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Affects Versions: 2.0.0
>Reporter: François Wagner
>Assignee: Davor Bonaci


