[ https://issues.apache.org/jira/browse/BEAM-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Udi Meiri resolved BEAM-3965.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.5.0

> HDFS read broken in python
> --------------------------
>
>                 Key: BEAM-3965
>                 URL: https://issues.apache.org/jira/browse/BEAM-3965
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Udi Meiri
>            Assignee: Udi Meiri
>            Priority: Major
>             Fix For: 2.5.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>
>     run()
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py", line 90, in run
>     lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
>     skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
>     validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
>     self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
>     return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
>     match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
>     return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
>     raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)