[ https://issues.apache.org/jira/browse/BEAM-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udi Meiri resolved BEAM-3965.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.5.0

> HDFS read broken in python
> --------------------------
>
>                 Key: BEAM-3965
>                 URL: https://issues.apache.org/jira/browse/BEAM-3965
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Udi Meiri
>            Assignee: Udi Meiri
>            Priority: Major
>             Fix For: 2.5.0
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>
>     run()
>   File "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py", line 90, in run
>     lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
>     skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
>     validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
>     self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
>     return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
>     match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
>     return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
>     raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}
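
The final line of the traceback has a recognizable shape: a per-pattern dictionary of exceptions wrapped in a single error. The sketch below is not Beam's code. It is a minimal, hypothetical illustration of how a missing 'name' key in some mapping could surface as that kind of aggregated error. The names MatchError, lookup_name, match, and run_example, and the assumption that the lookup happens on a file-status record, are all invented for illustration; the actual cause inside hadoopfilesystem.py is not shown in this message.

```python
class MatchError(Exception):
    """Hypothetical stand-in for an aggregated I/O error (not Beam's BeamIOError)."""

    def __init__(self, msg, exception_details):
        super().__init__('%s with exceptions %s' % (msg, exception_details))
        self.exception_details = exception_details


def lookup_name(status):
    # Assumed failure mode: the status record carries no 'name' key,
    # so this plain dict lookup raises KeyError('name').
    return status['name']


def match(patterns, status_for):
    """Try every pattern, record failures per pattern, raise once at the end."""
    results, exceptions = [], {}
    for pattern in patterns:
        try:
            results.append(lookup_name(status_for(pattern)))
        except Exception as e:  # keep going so every failing pattern is reported
            exceptions[pattern] = e
    if exceptions:
        raise MatchError('Match operation failed', exceptions)
    return results


def run_example():
    """Return the aggregated error raised for a status record without 'name'."""
    try:
        match(['hdfs://kinglear.txt'], lambda pattern: {'type': 'FILE'})
    except MatchError as err:
        return err
    return None
```

Calling run_example() returns an error whose exception_details maps 'hdfs://kinglear.txt' to a KeyError with args ('name',). The exact repr of the KeyError differs between Python versions, so the printed message may not match the Python 2.7 output above character for character.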



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
