bito-code-review[bot] commented on code in PR #35481:
URL: https://github.com/apache/superset/pull/35481#discussion_r2401828025
##########
tests/unit_tests/utils/csv_tests.py:
##########
@@ -263,3 +278,39 @@ def
test_get_chart_dataframe_with_hierarchical_columns(monkeypatch: pytest.Monke
| ('idx',) | 2 |
"""
assert markdown_str.strip() == expected_markdown_str.strip()
+
+
+def test_get_chart_dataframe_removes_na_string_values(monkeypatch:
pytest.MonkeyPatch):
+ """
+ Test that get_chart_dataframe currently removes rows containing "NA"
string values.
+ This test verifies the existing behavior before implementing custom NA
handling.
+ """
Review Comment:
<div>
<div id="suggestion">
<div id="issue"><b>Test documentation mismatch</b></div>
<div id="fix">
The test name `test_get_chart_dataframe_removes_na_string_values` and its
docstring claim to verify that `get_chart_dataframe` removes rows containing
"NA" string values, but the actual assertions verify the opposite behavior -
that both rows are preserved and the "NA" string value remains intact. This
creates a critical contradiction that will mislead developers about the actual
system behavior. The test should be renamed to
`test_get_chart_dataframe_preserves_na_string_values` and the docstring updated
to accurately reflect that the current implementation preserves rows with "NA"
string values, which is the actual behavior verified by the assertions.
</div>
<details>
<summary>
<b>Code suggestion</b>
</summary>
<blockquote>Check the AI-generated fix before applying</blockquote>
<div id="code">
```suggestion
def test_get_chart_dataframe_preserves_na_string_values(monkeypatch:
pytest.MonkeyPatch):
"""
Test that get_chart_dataframe currently preserves rows containing "NA"
string values.
This test verifies the existing behavior before implementing custom NA
handling.
"""
```
</div>
</details>
</div>
<small><i>Code Review Run <a
href=https://github.com/apache/superset/pull/35481#issuecomment-3365574513>#82cc79</a></i></small>
</div>
---
Should Bito avoid suggestions like this for future reviews? (<a
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
- [ ] Yes, avoid them
##########
superset/config.py:
##########
@@ -1354,6 +1354,15 @@ def allowed_schemas_for_csv_upload( # pylint:
disable=unused-argument
# Values that should be treated as nulls for the csv uploads.
CSV_DEFAULT_NA_NAMES = list(STR_NA_VALUES)
+# Values that should be treated as nulls for scheduled reports CSV processing.
+# By default, uses the same values as CSV uploads. Set to None to disable
+# automatic NA conversion (preserving original string values like "NA"),
+# or provide a custom list of values that should be treated as null.
+# Examples:
+# REPORTS_CSV_NA_NAMES = None # Disable all automatic NA conversion
+# REPORTS_CSV_NA_NAMES = ["", "NULL", "null"] # Only treat these as NA
+REPORTS_CSV_NA_NAMES = CSV_DEFAULT_NA_NAMES
Review Comment:
<div>
<div id="suggestion">
<div id="issue"><b>Shared mutable reference</b></div>
<div id="fix">
The configuration `REPORTS_CSV_NA_NAMES = CSV_DEFAULT_NA_NAMES` creates a
shared reference issue. Both variables point to the same list object from
`STR_NA_VALUES`, so any runtime modifications to `REPORTS_CSV_NA_NAMES` (like
appending values) will also modify `CSV_DEFAULT_NA_NAMES` and potentially
affect CSV upload processing. This breaks the intended isolation between CSV
upload NA handling and scheduled reports CSV processing. Use
`list(CSV_DEFAULT_NA_NAMES)` to create a separate copy.
</div>
<details>
<summary>
<b>Code suggestion</b>
</summary>
<blockquote>Check the AI-generated fix before applying</blockquote>
<div id="code">
```suggestion
REPORTS_CSV_NA_NAMES = list(CSV_DEFAULT_NA_NAMES)
```
</div>
</details>
</div>
<small><i>Code Review Run <a
href=https://github.com/apache/superset/pull/35481#issuecomment-3365574513>#82cc79</a></i></small>
</div>
---
Should Bito avoid suggestions like this for future reviews? (<a
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
- [ ] Yes, avoid them
##########
superset/charts/client_processing.py:
##########
@@ -340,7 +341,11 @@ def apply_client_processing( # noqa: C901
if query["result_format"] == ChartDataResultFormat.JSON:
df = pd.DataFrame.from_dict(data)
elif query["result_format"] == ChartDataResultFormat.CSV:
- df = pd.read_csv(StringIO(data))
+ # Use custom NA values configuration for
+ # reports to avoid unwanted conversions
+ # This allows users to control which values should be treated as
null/NA
+ na_values = current_app.config.get("REPORTS_CSV_NA_NAMES", None)
+ df = pd.read_csv(StringIO(data), keep_default_na=False,
na_values=na_values)
Review Comment:
<div>
<div id="suggestion">
<div id="issue"><b>Backward compatibility break</b></div>
<div id="fix">
The current implementation always sets `keep_default_na=False` which will
break backward compatibility for existing CSV reports. When
`REPORTS_CSV_NA_NAMES` is not configured (None), this disables ALL automatic NA
detection including standard values like empty strings, "NA", "N/A", etc. This
will cause existing reports to show literal "NA" strings instead of null
values. The fix preserves default pandas behavior when the config is not
explicitly set, only applying custom NA handling when `REPORTS_CSV_NA_NAMES` is
provided.
</div>
<details>
<summary>
<b>Code suggestion</b>
</summary>
<blockquote>Check the AI-generated fix before applying</blockquote>
<div id="code">
```suggestion
na_values = current_app.config.get("REPORTS_CSV_NA_NAMES", None)
if na_values is not None:
df = pd.read_csv(StringIO(data), keep_default_na=False,
na_values=na_values)
else:
df = pd.read_csv(StringIO(data))
```
</div>
</details>
</div>
<small><i>Code Review Run <a
href=https://github.com/apache/superset/pull/35481#issuecomment-3365574513>#82cc79</a></i></small>
</div>
---
Should Bito avoid suggestions like this for future reviews? (<a
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
- [ ] Yes, avoid them
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]