[ 
https://issues.apache.org/jira/browse/BEAM-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199572#comment-17199572
 ] 

Valentyn Tymofieiev commented on BEAM-9154:
-------------------------------------------

Thank you, [~kamilwu]!

> To be honest, I have no idea. I tried tensorflow 1.15.3 (the latest 1.x 
> version), but that version depends on tfx-bsl, which depends on an older 
> version of Beam, which leads us to a circular dependency. I suppose 
> tensorflow 2.x won't work either, since the code was written with tensorflow 
> 1.x in mind

If the test scenario downgrades the version of Beam, then it does not make 
sense to run this against Beam HEAD. Depending on the purpose of the test, we 
could pick the latest released version of the TFX libraries, or use some TFX 
library version that does set an upper bound on Beam.

[~altay] Do you know who are the stakeholders for this test and what was the 
motivations to add it to Beam? It sounds this test is misconfigured and may not 
add much value.

> Move Chicago Taxi Example to Python 3
> -------------------------------------
>
>                 Key: BEAM-9154
>                 URL: https://issues.apache.org/jira/browse/BEAM-9154
>             Project: Beam
>          Issue Type: Improvement
>          Components: testing
>            Reporter: Kamil Wasilewski
>            Assignee: Kamil Wasilewski
>            Priority: P1
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Chicago Taxi Example[1] should be moved to the latest version of Python 
> supported by Beam (currently it's Python 3.7).
> At the moment, the following error occurs when running the benchmark on 
> Python 3.7 (requires futher investigation):
> {code:java}
> Traceback (most recent call last):
>   File "preprocess.py", line 259, in <module>
>     main()
>   File "preprocess.py", line 254, in main
>     project=known_args.metric_reporting_project
>   File "preprocess.py", line 155, in transform_data
>     ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn)))
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 987, in __ror__
>     return self.transform.__ror__(pvalueish, self.label)
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 547, in __ror__
>     result = p.apply(self, pvalueish, label)
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 
> 532, in apply
>     return self.apply(transform, pvalueish)
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 
> 573, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", 
> line 193, in apply
>     return m(transform, input, options)
>   File 
> "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", 
> line 223, in apply_PTransform
>     return transform.expand(input)
>   File 
> "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py",
>  line 825, in expand
>     input_metadata))
>   File 
> "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py",
>  line 716, in expand
>     output_signature = self._preprocessing_fn(copied_inputs)
>   File "preprocess.py", line 102, in preprocessing_fn
>     _fill_in_missing(inputs[key]),
> KeyError: 'company'
> {code}
> [1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to