[ https://issues.apache.org/jira/browse/BEAM-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17199572#comment-17199572 ]
Valentyn Tymofieiev commented on BEAM-9154:
-------------------------------------------

Thank you, [~kamilwu]!

> To be honest, I have no idea. I tried tensorflow 1.15.3 (the latest 1.x
> version), but that version depends on tfx-bsl, which depends on an older
> version of Beam, which leads us to a circular dependency. I suppose
> tensorflow 2.x won't work either, since the code was written with tensorflow
> 1.x in mind

If the test scenario downgrades the version of Beam, then it does not make sense to run this against Beam HEAD. Depending on the purpose of the test, we could pick the latest released version of the TFX libraries, or use some TFX library version that does set an upper bound on Beam.

[~altay] Do you know who the stakeholders for this test are and what the motivation was for adding it to Beam? It sounds like this test is misconfigured and may not add much value.


> Move Chicago Taxi Example to Python 3
> -------------------------------------
>
>                 Key: BEAM-9154
>                 URL: https://issues.apache.org/jira/browse/BEAM-9154
>             Project: Beam
>          Issue Type: Improvement
>          Components: testing
>            Reporter: Kamil Wasilewski
>            Assignee: Kamil Wasilewski
>            Priority: P1
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Chicago Taxi Example[1] should be moved to the latest version of Python
> supported by Beam (currently it's Python 3.7).
> At the moment, the following error occurs when running the benchmark on
> Python 3.7 (requires further investigation):
> {code:java}
> Traceback (most recent call last):
>   File "preprocess.py", line 259, in <module>
>     main()
>   File "preprocess.py", line 254, in main
>     project=known_args.metric_reporting_project
>   File "preprocess.py", line 155, in transform_data
>     ('Analyze' >> tft_beam.AnalyzeDataset(preprocessing_fn)))
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 987, in __ror__
>     return self.transform.__ror__(pvalueish, self.label)
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/transforms/ptransform.py", line 547, in __ror__
>     result = p.apply(self, pvalueish, label)
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 532, in apply
>     return self.apply(transform, pvalueish)
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/pipeline.py", line 573, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 193, in apply
>     return m(transform, input, options)
>   File "/Users/kamilwasilewski/proj/beam/sdks/python/apache_beam/runners/runner.py", line 223, in apply_PTransform
>     return transform.expand(input)
>   File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 825, in expand
>     input_metadata))
>   File "/Users/kamilwasilewski/proj/beam/build/gradleenv/2022703441/lib/python3.7/site-packages/tensorflow_transform/beam/impl.py", line 716, in expand
>     output_signature = self._preprocessing_fn(copied_inputs)
>   File "preprocess.py", line 102, in preprocessing_fn
>     _fill_in_missing(inputs[key]),
> KeyError: 'company'
> {code}
> [1] sdks/python/apache_beam/testing/benchmarks/chicago_taxi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
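
For anyone investigating the KeyError in the traceback above: the last frame shows `inputs[key]` failing for 'company', so the dict handed to `preprocessing_fn` apparently has no such entry. Below is a minimal sketch, in plain Python with no TensorFlow dependency, of a check that lists which expected features are absent before they are indexed. The helper name `missing_feature_keys`, the `EXPECTED_KEYS` list, and the 'trip_miles' feature are hypothetical and are not taken from preprocess.py; only 'company' comes from the traceback. It does not explain why the key is missing, it only makes the failure easier to read.

```python
def missing_feature_keys(inputs, expected_keys):
    """Return the expected keys that are absent from `inputs`, in order."""
    return [key for key in expected_keys if key not in inputs]


# Hypothetical feature names; only 'company' appears in the traceback.
EXPECTED_KEYS = ["company", "trip_miles"]

# Stand-in for the dict passed to preprocessing_fn, with 'company' absent.
inputs = {"trip_miles": [1.2, 3.4]}

missing = missing_feature_keys(inputs, EXPECTED_KEYS)
if missing:
    print("features missing from inputs:", missing)
```

Running a check like this at the top of the preprocessing function would show whether 'company' is the only absent feature or one of several, which may help narrow down whether the input schema or the library versions changed.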