François Wagner created BEAM-2429:
-------------------------------------

             Summary: Conflicting filesystems with used of HadoopFileSystem
                 Key: BEAM-2429
                 URL: https://issues.apache.org/jira/browse/BEAM-2429
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-extensions
    Affects Versions: 2.0.0
            Reporter: François Wagner
            Assignee: Davor Bonaci

I'm facing an issue when trying to use HadoopFileSystem in my pipeline. It looks like HadoopFileSystem is registering itself under the `file` scheme (https://github.com/apache/beam/pull/2777/files#diff-330bd0854dcab6037ef0e52c05d68eb2L79), hence the following exception is thrown when trying to register HadoopFileSystem:

java.lang.IllegalStateException: Scheme: [file] has conflicting filesystems: [org.apache.beam.sdk.io.LocalFileSystem, org.apache.beam.sdk.io.hdfs.HadoopFileSystem]
	at org.apache.beam.sdk.io.FileSystems.verifySchemesAreUnique(FileSystems.java:498)

What is the correct way to handle `hdfs` URLs out of the box with TextIO and AvroIO?

    String[] args = new String[]{
        "--hdfsConfiguration=[{\"dfs.client.use.datanode.hostname\": \"true\"}]"};
    HadoopFileSystemOptions options = PipelineOptionsFactory
        .fromArgs(args)
        .withValidation()
        .as(HadoopFileSystemOptions.class);
    Pipeline pipeline = Pipeline.create(options);

    configuration.add(config);
    options.setHdfsConfiguration(configuration);
    Pipeline pipeline = Pipeline.create(options);

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
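
The IllegalStateException quoted in the report comes from a check that each URI scheme is claimed by only one filesystem. The following is a minimal, dependency-free sketch of that kind of check, not Beam's actual implementation: the class `SchemeConflictSketch` and the helper `groupByScheme` are hypothetical, the method name `verifySchemesAreUnique` is borrowed from the stack trace only, and the second case (HadoopFileSystem reporting `hdfs` instead of `file`) is an assumption for illustration rather than something confirmed in this report.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SchemeConflictSketch {

    /** Groups filesystem class names by the scheme each one claims. */
    static Map<String, List<String>> groupByScheme(Map<String, String> schemeByFileSystem) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> e : schemeByFileSystem.entrySet()) {
            grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return grouped;
    }

    /** Throws if any scheme is claimed by more than one filesystem. */
    static void verifySchemesAreUnique(Map<String, String> schemeByFileSystem) {
        for (Map.Entry<String, List<String>> e : groupByScheme(schemeByFileSystem).entrySet()) {
            if (e.getValue().size() > 1) {
                throw new IllegalStateException(String.format(
                    "Scheme: [%s] has conflicting filesystems: %s", e.getKey(), e.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        // Case 1: both filesystems claim "file", as in the reported exception.
        Map<String, String> registered = new LinkedHashMap<>();
        registered.put("org.apache.beam.sdk.io.LocalFileSystem", "file");
        registered.put("org.apache.beam.sdk.io.hdfs.HadoopFileSystem", "file");
        try {
            verifySchemesAreUnique(registered);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }

        // Case 2 (assumed): HadoopFileSystem claims "hdfs", so the schemes are distinct.
        registered.put("org.apache.beam.sdk.io.hdfs.HadoopFileSystem", "hdfs");
        verifySchemesAreUnique(registered);
        System.out.println("no conflict");
    }
}
```

In this sketch the conflict disappears only once the two filesystems report different schemes, which is why the scheme under which HadoopFileSystem registers matters for the question above.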