[ https://issues.apache.org/jira/browse/BEAM-6068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ruoyun Huang resolved BEAM-6068.
--------------------------------
    Resolution: Not A Problem

Changed the resolution to Not A Problem. See the fixing steps in the previous comment.

> Wordcount example fails to read from gcs shakespare text file
> -------------------------------------------------------------
>
>                 Key: BEAM-6068
>                 URL: https://issues.apache.org/jira/browse/BEAM-6068
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Ruoyun Huang
>            Assignee: Mark Liu
>            Priority: Major
>             Fix For: 2.9.0
>
>
> Symptom:
> In a synced-to-head repo, the following command fails:
>
> python -m apache_beam.examples.wordcount \
>   --input gs://dataflow-samples/shakespeare/kinglear.txt \
>   --output gs://$USER-test/tmp \
>   --runner DataflowRunner \
>   --project google.com:clouddfe \
>   --temp_location gs://$USER-test/temp-it \
>   --experiment beam_fn_api \
>   --sdk_location dist/apache-beam-2.9.0.dev0.tar.gz
>
> The error message is:
>
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py", line 136, in <module>
>     run()
>   File "/usr/local/google/home/ruoyun/projects/beam2/sdks/python/apache_beam/examples/wordcount.py", line 90, in run
>     lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 524, in __init__
>     skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 119, in __init__
>     validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 121, in __init__
>     self._validate()
>   File "apache_beam/options/value_provider.py", line 137, in _f
>     return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 178, in _validate
>     match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 187, in match
>     return filesystem.match(patterns, limits)
>   File "apache_beam/io/filesystem.py", line 705, in match
>     raise BeamIOError("Match operation failed", exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://dataflow-samples/shakespeare/kinglear.txt': TypeError("__init__() got an unexpected keyword argument 'response_encoding'",)}
>
>
> However, I can run a similar command after reverting to the 2.8 release and
> rebuilding everything. This command succeeds:
>
> python -m apache_beam.examples.wordcount \
>   --input=gs://dataflow-samples/shakespeare/kinglear.txt \
>   --output=gs://test-$USER/portable/ \
>   --runner DataflowRunner \
>   --project $GCP_PROJECT \
>   --staging_location gs://test-$USER/staging_wc \
>   --temp_location gs://test-$USER/tmp \
>   --sdk_location=./dist/apache-beam-2.8.0.dev0.tar.gz



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
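
The TypeError in the traceback above says that some constructor was handed a `response_encoding` keyword it does not declare. A common cause of this class of error is calling code that is newer than an installed dependency. That is only an assumption here, since the actual fixing steps are in an earlier comment that this message does not reproduce. The following is a minimal Python 3 sketch of how such a mismatch could be checked before making the call. `OldClient` and `NewClient` are hypothetical stand-ins for two releases of a client class, not Beam or GCS classes, and `accepts_keyword` and `describe_failure` are illustrative helpers.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with keyword argument `name`."""
    for param in inspect.signature(callable_obj).parameters.values():
        # A **kwargs parameter accepts any keyword.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        if param.name == name and param.kind in (
                inspect.Parameter.POSITIONAL_OR_KEYWORD,
                inspect.Parameter.KEYWORD_ONLY):
            return True
    return False


# Hypothetical stand-in for an older release of a client class.
class OldClient(object):
    def __init__(self, url, credentials=None):
        self.url = url


# Hypothetical stand-in for a newer release that added the keyword.
class NewClient(object):
    def __init__(self, url, credentials=None, response_encoding=None):
        self.url = url
        self.response_encoding = response_encoding


def describe_failure(cls, **kwargs):
    """Try to build cls(**kwargs); return the TypeError message, or None."""
    try:
        cls(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None
```

Calling `accepts_keyword(OldClient, "response_encoding")` reports the mismatch without triggering it, while `describe_failure(OldClient, url="gs://bucket", response_encoding="utf8")` returns a message of the same shape as the one in the traceback. If a check like this fails against a real dependency, comparing the installed package version with the one the SDK expects is a reasonable next step.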