gemini-code-assist[bot] commented on code in PR #35733: URL: https://github.com/apache/beam/pull/35733#discussion_r2241357426
########## sdks/python/scripts/run_pytest.sh: ########## @@ -42,14 +42,14 @@ user_marker="" # Isolate the user-provided -m argument (matching only). -if [[ $pytest_args =~ "-m" ]]; then +if [[ $posargs =~ "-m" ]]; then # Extract the marker value using the defined regex. - user_marker=$(echo "$pytest_args" | sed -nE "s/.*$marker_regex.*/\1/p") + user_marker=$(echo "$posargs" | sed -nE "s/.*$marker_regex.*/\1/p") fi -# Remove the -m argument from pytest_args (substitution only). +# Remove the -m argument from posargs (substitution only). if [[ -n $user_marker ]]; then - pytest_args=$(echo "$pytest_args" | sed -E "s/$marker_regex//") + posargs=$(echo "$posargs" | sed -E "s/$marker_regex//") fi Review Comment:  This logic for extracting and removing the `-m` argument can be simplified and made more efficient. Additionally, the current implementation has a subtle bug when multiple `-m` arguments are provided. 1. **Efficiency and Clarity**: The current code uses subshells to call `sed`, which is less efficient than using bash's built-in regex matching (`=~`). The logic is also split across two `if` statements, which can be consolidated. 2. **Bug**: The extraction logic (`sed -nE "s/.*$marker_regex.*/\1/p"`) gets the *last* `-m` argument due to the greedy `.*`, but the removal logic (`sed -E "s/$marker_regex//"`) removes the *first* one. This is inconsistent. Using `BASH_REMATCH` handles the *first* match consistently for both extraction and removal, and avoids spawning external processes. This would be a behavior change (processing the first `-m` instead of a mix), but it would be more correct and predictable. ```suggestion if [[ $posargs =~ $marker_regex ]]; then user_marker="${BASH_REMATCH[1]}" posargs="${posargs/${BASH_REMATCH[0]}/}" fi ``` ########## sdks/python/scripts/run_pytest.sh: ########## @@ -63,7 +63,7 @@ if [[ -n $user_marker ]]; then fi # Run tests in parallel. -echo "Running parallel tests with: pytest -m \"$marker_for_parallel_tests\" $pytest_args" +echo "Running parallel tests with: pytest -m \"$marker_for_parallel_tests\" $posargs" pytest -v -rs -o junit_suite_name=${envname} \ --junitxml=pytest_${envname}.xml -m "$marker_for_parallel_tests" -n 6 --import-mode=importlib ${pytest_args} --pyargs ${posargs} Review Comment:  There appears to be a significant issue with how `$posargs` is used in the `pytest` command on line 68. The entire `$posargs` variable is passed to the `--pyargs` option, which expects a list of Python package names to search for tests. However, according to the script's own documentation (line 26), `$posargs` can contain other arguments like `'-k keyword'`. If `$posargs` contains something like `path/to/tests -k "foo"`, the `pytest` command will become `pytest ... --pyargs path/to/tests -k "foo"`, which is incorrect. `pytest` will treat `-k` and `"foo"` as package names, which will likely cause test discovery to fail. To fix this, you should separate the test paths from other options within `$posargs` and pass only the paths to `--pyargs`, with other options passed as regular arguments to `pytest`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
