[ https://issues.apache.org/jira/browse/BEAM-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udi Meiri updated BEAM-3965:
----------------------------
    Description: 
When running a command like:

python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount --output gs://.../py-wordcount-output --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner --project ... --temp_location gs://.../temp-hdfs-int --staging_location gs://.../staging-hdfs-int --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input hdfs://kinglear.txt

I get:

{{Traceback (most recent call last):}}
{{  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main}}
{{    "__main__", fname, loader, pkg_name)}}
{{  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code}}
{{    exec code in run_globals}}
{{  File ".../beam/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>}}
{{    run()}}
{{  File ".../beam/sdks/python/apache_beam/examples/wordcount.py", line 90, in run}}
{{    lines = p | 'read' >> ReadFromText(known_args.input)}}
{{  File "apache_beam/io/textio.py", line 522, in __init__}}
{{    skip_header_lines=skip_header_lines)}}
{{  File "apache_beam/io/textio.py", line 117, in __init__}}
{{    validate=validate)}}
{{  File "apache_beam/io/filebasedsource.py", line 119, in __init__}}
{{    self._validate()}}
{{  File "apache_beam/options/value_provider.py", line 124, in _f}}
{{    return fnc(self, *args, **kwargs)}}
{{  File "apache_beam/io/filebasedsource.py", line 176, in _validate}}
{{    match_result = FileSystems.match([pattern], limits=[1])[0]}}
{{  File "apache_beam/io/filesystems.py", line 159, in match}}
{{    return filesystem.match(patterns, limits)}}
{{  File "apache_beam/io/hadoopfilesystem.py", line 221, in match}}
{{    raise BeamIOError('Match operation failed', exceptions)}}
{{apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions \{'hdfs://kinglear.txt': KeyError('name',)}}}

  was:
When running a command like:

{{python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount --output gs://.../py-wordcount-output --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner --project ... --temp_location gs://.../temp-hdfs-int --staging_location gs://.../staging-hdfs-int --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input hdfs://kinglear.txt}}

I get the same traceback as above.

> HDFS read broken
> ----------------
>
>                 Key: BEAM-3965
>                 URL: https://issues.apache.org/jira/browse/BEAM-3965
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Udi Meiri
>            Assignee: Udi Meiri
>            Priority: Major