codeant-ai-for-open-source[bot] commented on code in PR #39885:
URL: https://github.com/apache/superset/pull/39885#discussion_r3192351648


##########
superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts:
##########
@@ -317,6 +320,20 @@ export const useColDefs = ({
         ...(isPercentMetric && {
           filterValueGetter,
         }),
+        ...(dataType === GenericDataType.String && {
+          // HTML cells (e.g. anchor markup) are rendered by TextCellRenderer
+          // via dangerouslySetInnerHTML; without these the filter and sort
+          // operate on raw HTML so the URL inside the markup dictates order
+          // and the "Contains" filter matches against the raw HTML string.
+          //
+          // Scope: client-side only. When `serverPagination` is enabled, a
+          // later spread overrides `comparator` with `() => 0` so sorting is
+          // delegated to the server; the database does not know to extract
+          // visible text from HTML, so server-paginated tables with HTML
+          // columns are out of scope for this fix.
+          filterValueGetter: htmlTextFilterValueGetter,
+          comparator: htmlTextComparator,

Review Comment:
   **🟠 Architect Review — HIGH**
   
   When `serverPagination` is enabled, string columns still use
   `htmlTextFilterValueGetter` on the client, while server-side filtering
   continues to operate on raw HTML values. Client-side filter evaluation is
   therefore based on visible text, but the backend query is based on the
   unstripped HTML. The same filter can then return inconsistent or empty
   results, e.g. for HTML with nested tags where the visible text is not a
   contiguous substring of the raw HTML.
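
To make the nested-tag case concrete, here is a minimal sketch of the mismatch. The cell value, the filter term, and `naiveVisibleText` are all hypothetical; the regex stripper is only a stand-in so the example runs outside a browser, and it is not how the PR's `stripHtmlToText` works (that uses `sanitizeHtml` and `DOMParser`).

```typescript
// Stand-in for the PR's stripHtmlToText. A naive tag-stripping regex is used
// only so this sketch runs without a DOM; it is not a safe or complete
// HTML-to-text conversion.
const naiveVisibleText = (html: string): string =>
  html.replace(/<[^>]*>/g, '').trim();

// Hypothetical cell value with nested tags, and a hypothetical filter term.
const rawCell = '<a href="/users/42"><b>Hello</b> World</a>';
const filterTerm = 'Hello World';

// What a client-side "Contains" filter evaluates once the filter value getter
// has reduced the cell to its visible text.
const matchesVisibleText = naiveVisibleText(rawCell).includes(filterTerm);

// What a substring predicate over the stored, unstripped value evaluates.
const matchesRawHtml = rawCell.includes(filterTerm);
```

Under these assumptions the visible-text check matches and the raw-HTML check does not, because `</b>` sits between the two words in the stored value.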
   
   **Suggestion:** Apply `htmlTextFilterValueGetter` to string columns only in
   client-side mode (that is, when `serverPagination` is false), or introduce
   equivalent HTML-to-text normalization in the server-side filter conversion
   so that client and server evaluate filters against the same text value.
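
The first option could look roughly like the sketch below. It is not the PR's code: `htmlStringOverrides`, `StringColOverrides`, and the simplified `FilterValueGetter` and `Comparator` types are hypothetical names introduced for illustration, and the real AG Grid callback signatures carry more parameters.

```typescript
// Hypothetical, simplified stand-ins for the AG Grid callback types.
type FilterValueGetter = (params: { data: Record<string, unknown> }) => unknown;
type Comparator = (a: unknown, b: unknown) => number;

interface StringColOverrides {
  filterValueGetter?: FilterValueGetter;
  comparator?: Comparator;
}

// Returns the HTML-aware overrides only for string columns in client-side
// mode. With server pagination the result is empty, so nothing on the client
// disagrees with what the backend filters and sorts on.
const htmlStringOverrides = (
  isString: boolean,
  serverPagination: boolean,
  getter: FilterValueGetter,
  comparator: Comparator,
): StringColOverrides =>
  isString && !serverPagination
    ? { filterValueGetter: getter, comparator }
    : {};
```

In `useColDefs` the result would presumably be spread into the column definition in place of the unconditional block, passing `dataType === GenericDataType.String`, `serverPagination`, `htmlTextFilterValueGetter`, and `htmlTextComparator`.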
   
   



##########
superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/htmlTextFilterValueGetter.ts:
##########
@@ -0,0 +1,60 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+import { isProbablyHTML, sanitizeHtml } from '@superset-ui/core';
+import { ValueGetterParams } from '@superset-ui/core/components/ThemedAgGridReact';
+
+const stripHtmlToText = (html: string): string => {
+  const doc = new DOMParser().parseFromString(sanitizeHtml(html), 'text/html');
+  return (doc.body.textContent || '').trim();
+};

Review Comment:
   **Suggestion:** The comparator sanitizes and parses HTML on every
   comparison call, and sorting invokes the comparator `O(n log n)` times,
   which can cause major UI slowdowns on larger tables. Precompute or cache
   the extracted text once per row value (for example via a value getter or
   memoization) and compare the cached text instead of reparsing inside the
   comparator hot path. [performance]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ⚠️ Sorting large HTML string columns can lag noticeably.
   - ⚠️ Comparator does DOMParser work on every comparison call.
   ```
   </details>
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Create a Table V2 (AG Grid) chart via `TableChart` in
   `superset-frontend/plugins/plugin-chart-ag-grid-table/src/AgGridTableChart.tsx:55-89`,
   with a string dimension whose values contain HTML (e.g., anchor tags rendered by
   `TextCellRenderer`), and a large row count (thousands of records).
   
   2. `TableChart` builds column definitions using `useColDefs`
   (`superset-frontend/plugins/plugin-chart-ag-grid-table/src/utils/useColDefs.ts:215-311`);
   for string columns (`dataType === GenericDataType.String`) it sets
   `filterValueGetter: htmlTextFilterValueGetter` and `comparator: htmlTextComparator`
   on the colDef at lines 44-57.
   
   3. When the user clicks that column's header to sort, AG Grid (via `ThemedAgGridReact`,
   imported in `AgGridTableChart.tsx:29-32`) repeatedly invokes `htmlTextComparator` from
   `htmlTextFilterValueGetter.ts:47-57` as part of its `O(n log n)` sort, passing the raw
   cell values for many pairwise comparisons.
   
   4. For each comparison where the value looks like HTML, `htmlTextComparator` calls
   `stripHtmlToText` (`htmlTextFilterValueGetter.ts:22-25`), which runs `sanitizeHtml(html)`
   and `new DOMParser().parseFromString(..., 'text/html')`. With thousands of HTML cells,
   this DOM parsing and sanitization happens thousands of times in a hot sort loop, which can
   realistically cause noticeable UI sluggishness while sorting large HTML-heavy tables.
   Precomputing or caching the stripped text per row would avoid this repeated work.
   ```
   </details>
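
One way to apply the memoization idea is sketched below. `memoizeText` and `makeTextComparator` are hypothetical helpers, not part of the PR or of AG Grid; the extractor is passed in as a parameter so that the PR's `stripHtmlToText` (or any other string-to-string function) could be plugged in. Handling of non-string values via `String()` is an assumption and may differ from the PR's comparator.

```typescript
// Wraps a string -> string function with a Map-backed cache so each distinct
// input is processed once, however often the sort compares it.
const memoizeText = (extract: (raw: string) => string) => {
  const cache = new Map<string, string>();
  return (raw: string): string => {
    const hit = cache.get(raw);
    // Compare against undefined so a cached empty string still counts as a hit.
    if (hit !== undefined) return hit;
    const text = extract(raw);
    cache.set(raw, text);
    return text;
  };
};

// Builds a comparator that orders values by extracted text rather than by raw
// markup. Non-string values are converted with String() (an assumption).
const makeTextComparator = (extract: (raw: string) => string) => {
  const cached = memoizeText(extract);
  const toText = (v: unknown): string =>
    typeof v === 'string' ? cached(v) : String(v ?? '');
  return (a: unknown, b: unknown): number =>
    toText(a).localeCompare(toText(b));
};
```

The cache in this sketch is unbounded and keyed by the raw cell string, so a real implementation would likely need to scope it to the current dataset or clear it when the data changes.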
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

