gemini-code-assist[bot] commented on code in PR #35945:
URL: https://github.com/apache/beam/pull/35945#discussion_r2294881434
##########
sdks/python/apache_beam/yaml/examples/testing/input_data.py:
##########
@@ -65,20 +65,26 @@ def word_count_jinja_parameter_data():
return json.dumps(params)
-def word_count_jinja_template_data():
- return \
-[('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/readFromTextTransform.yaml'),
- ('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/mapToFieldsSplitConfig.yaml'),
- ('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/explodeTransform.yaml'),
- ('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/combineTransform.yaml'),
- ('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/mapToFieldsCountConfig.yaml'),
- ('apache_beam/yaml/examples/transforms/jinja/'
- 'include/submodules/writeToTextTransform.yaml')]
+def word_count_jinja_template_data(test_name: str) -> list[str]:
+ if test_name == 'test_wordCountInclude_yaml':
+ return \
+ [('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/readFromTextTransform.yaml'),
+ ('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/mapToFieldsSplitConfig.yaml'),
+ ('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/explodeTransform.yaml'),
+ ('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/combineTransform.yaml'),
+ ('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/mapToFieldsCountConfig.yaml'),
+ ('apache_beam/yaml/examples/transforms/jinja/'
+ 'include/submodules/writeToTextTransform.yaml')]
+ elif test_name == 'test_wordCountImport_yaml':
+ return \
+ [('apache_beam/yaml/examples/transforms/jinja/'
+ 'import/macros/wordCountMacros.yaml')]
+ return []
Review Comment:

The `word_count_jinja_template_data` function can be made more readable. The
backslash continuations are unnecessary if the opening bracket is placed on
the `return` line, because Python joins lines implicitly inside brackets. The
parentheses around each string literal are also redundant: adjacent literals
are concatenated with or without them.
```python
def word_count_jinja_template_data(test_name: str) -> list[str]:
  if test_name == 'test_wordCountInclude_yaml':
    return [
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/readFromTextTransform.yaml',
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/mapToFieldsSplitConfig.yaml',
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/explodeTransform.yaml',
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/combineTransform.yaml',
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/mapToFieldsCountConfig.yaml',
        'apache_beam/yaml/examples/transforms/jinja/'
        'include/submodules/writeToTextTransform.yaml'
    ]
  elif test_name == 'test_wordCountImport_yaml':
    return [
        'apache_beam/yaml/examples/transforms/jinja/'
        'import/macros/wordCountMacros.yaml'
    ]
  return []
```
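As a quick sanity check that the rewrite preserves behavior, here is a minimal sketch. The shortened paths and both function names are illustrative and not taken from the PR. It compares the original style (backslash plus parenthesized literals) with the plain bracketed list:

```python
# Illustrative paths only; the real entries live in input_data.py.
def with_parens_and_backslash() -> list[str]:
  # Original style: backslash continuation and parentheses around each entry.
  return \
    [('jinja/'
      'include/a.yaml'),
     ('jinja/'
      'include/b.yaml')]


def plain_list() -> list[str]:
  # Suggested style: the bracket opens on the return line, no parentheses.
  return [
      'jinja/'
      'include/a.yaml',
      'jinja/'
      'include/b.yaml',
  ]


# Both forms build the same two-element list.
assert with_parens_and_backslash() == plain_list()
```

One caveat with implicit concatenation: a missing comma between two entries silently merges them into a single path, so a trailing comma after every entry is a reasonable safeguard.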
##########
sdks/python/apache_beam/yaml/examples/transforms/jinja/import/README.md:
##########
@@ -0,0 +1,63 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+## Jinja % import Pipeline
+
+This example leverages the `% import` Jinja directive by having one main
+pipeline and then one macros file containing all the transforms and configs
+used..
+
+General setup:
+```sh
+export PIPELINE_FILE=apache_beam/yaml/examples/transforms/jinja/import/wordCountImport.yaml
+export KINGLEAR="gs://dataflow-samples/shakespeare/kinglear.txt"
+export TEMP_LOCATION="gs://MY-BUCKET/wordCounts/"
+
+cd <PATH_TO_BEAM_REPO>/beam/sdks/python
+```
+
+Multiline Run Example:
+```sh
+python -m apache_beam.yaml.main \
+ --yaml_pipeline_file="${PIPELINE_FILE}" \
+ --jinja_variables='{
+ "readFromTextTransform": {"path": "'"${KINGLEAR}"'"},
+ "mapToFieldsSplitConfig": {
+ "language": "python",
+ "fields": {
+ "value": "1"
+ }
+ },
+ "explodeTransform": {"fields": "word"},
+ "combineTransform": {
+ "group_by": "word",
+ "combine": {"value": "sum"}
+ },
+ "mapToFieldsCountConfig": {
+ "language": "python",
+ "fields": {"output": "word + \" - \" + str(value)"}
+ },
+ "writeToTextTransform": {"path": "'"${TEMP_LOCATION}"'"}
+ }'
+```
+
+Single Line Run Example:
+```sh
+python -m apache_beam.yaml.main --yaml_pipeline_file="${PIPELINE_FILE}" --jinja_variables='{"readFromTextTransform": {"path": "gs://dataflow-samples/shakespeare/kinglear.txt"}, "mapToFieldsSplitConfig": {"language": "python", "fields":{"value":"1"}}, "explodeTransform":{"fields":"word"}, "combineTransform":{"group_by":"word", "combine":{"value":"sum"}}, "mapToFieldsCountConfig":{"language": "python", "fields":{"output":"word + \" - \" + str(value)"}}, "writeToTextTransform":{"path":"${TEMP_LOCATION}"}}'
Review Comment:

In the "Single Line Run Example", `${TEMP_LOCATION}` sits inside the
single-quoted `--jinja_variables` argument. The shell does not expand
variables inside single quotes, so the literal text `${TEMP_LOCATION}` is
passed to the pipeline as the output path, likely causing an error. To fix
this, use the same technique as the multiline example: close the single quote,
insert the variable in double quotes, then reopen the single quote
(`'"${TEMP_LOCATION}"'`).
```suggestion
python -m apache_beam.yaml.main --yaml_pipeline_file="${PIPELINE_FILE}" --jinja_variables='{"readFromTextTransform": {"path": "gs://dataflow-samples/shakespeare/kinglear.txt"}, "mapToFieldsSplitConfig": {"language": "python", "fields":{"value":"1"}}, "explodeTransform":{"fields":"word"}, "combineTransform":{"group_by":"word", "combine":{"value":"sum"}}, "mapToFieldsCountConfig":{"language": "python", "fields":{"output":"word + \" - \" + str(value)"}}, "writeToTextTransform":{"path":"'"${TEMP_LOCATION}"'"}}'
```
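The quoting difference can be checked without running a pipeline. The sketch below assumes a POSIX `sh` is available on the PATH; `shell_word` is a hypothetical helper, not part of Beam. It asks the shell to print the single argument it builds from each quoting style and parses the result as JSON:

```python
import json
import os
import subprocess


def shell_word(fragment: str, temp_location: str) -> str:
  """Return the single argument a POSIX sh builds from a quoted fragment."""
  env = {**os.environ, 'TEMP_LOCATION': temp_location}
  # printf receives exactly one word, so stdout is that word after the
  # shell has processed quotes and variable expansions.
  script = "printf '%s' " + fragment
  result = subprocess.run(
      ['sh', '-c', script],
      env=env, capture_output=True, text=True, check=True)
  return result.stdout


bucket = 'gs://MY-BUCKET/wordCounts/'

# Variable inside single quotes: passed through unexpanded.
single_quoted = shell_word("""'{"path":"${TEMP_LOCATION}"}'""", bucket)

# Close the single quote, expand inside double quotes, reopen the single quote.
spliced = shell_word("""'{"path":"'"${TEMP_LOCATION}"'"}'""", bucket)

assert json.loads(single_quoted)['path'] == '${TEMP_LOCATION}'
assert json.loads(spliced)['path'] == bucket
```

Both results are valid JSON, which is why the unexpanded form fails only later, when the pipeline tries to use `${TEMP_LOCATION}` as a path.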