Re: [PR] feat(chart): add toggle for percentage metric calculation mode in Table chart [superset]

via GitHub Sun, 01 Jun 2025 13:50:06 -0700


korbit-ai[bot] commented on code in PR #33656:
URL: https://github.com/apache/superset/pull/33656#discussion_r2119579742



##########
superset/utils/pandas_postprocessing/contribution_with_totals.py:
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from decimal import Decimal
+from typing import Any
+
+from flask_babel import gettext as _
+from pandas import DataFrame, MultiIndex
+
+from superset.exceptions import InvalidPostProcessingError
+from superset.utils.core import PostProcessingContributionOrientation
+from superset.utils.pandas_postprocessing.utils import validate_column_args
+
+
+@validate_column_args("columns")
+def contribution_with_totals(  # noqa: C901
+    df: DataFrame,
+    orientation: PostProcessingContributionOrientation = 
PostProcessingContributionOrientation.COLUMN,  # noqa: E501
+    columns: list[str] | None = None,
+    time_shifts: list[str] | None = None,
+    rename_columns: list[str] | None = None,
+    totals: dict[str, float] | None = None,

Review Comment:
   ### Inconsistent Numeric Type Support <sub>![category 
Functionality](https://img.shields.io/badge/Functionality-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The totals parameter type annotation only allows float values, but the input 
DataFrame supports both float and Decimal types
   
   
   ###### Why this matters
   This type mismatch could cause issues when working with Decimal values, 
potentially leading to precision loss or type errors.
   
   ###### Suggested change ∙ *Feature Preview*
   Update the type annotation to support both float and Decimal:
   ```python
   from typing import Union
   totals: dict[str, Union[float, Decimal]] | None = None
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/d6b14e62-830a-4630-91f1-647fab49a1de/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/d6b14e62-830a-4630-91f1-647fab49a1de?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/d6b14e62-830a-4630-91f1-647fab49a1de?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/d6b14e62-830a-4630-91f1-647fab49a1de?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/d6b14e62-830a-4630-91f1-647fab49a1de)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:5a2c1351-ca58-4e53-bfd3-d39a1c6c73af -->
   
   
   [](5a2c1351-ca58-4e53-bfd3-d39a1c6c73af)



##########
superset/utils/pandas_postprocessing/contribution_with_totals.py:
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from decimal import Decimal
+from typing import Any
+
+from flask_babel import gettext as _
+from pandas import DataFrame, MultiIndex
+
+from superset.exceptions import InvalidPostProcessingError
+from superset.utils.core import PostProcessingContributionOrientation
+from superset.utils.pandas_postprocessing.utils import validate_column_args
+
+
+@validate_column_args("columns")
+def contribution_with_totals(  # noqa: C901
+    df: DataFrame,
+    orientation: PostProcessingContributionOrientation = 
PostProcessingContributionOrientation.COLUMN,  # noqa: E501
+    columns: list[str] | None = None,
+    time_shifts: list[str] | None = None,
+    rename_columns: list[str] | None = None,
+    totals: dict[str, float] | None = None,
+) -> DataFrame:
+    """
+    Contribution with totals from full dataset.
+
+    Divides values in `columns` by totals provided externally (from auxiliary 
query),
+    instead of computing totals from the current DataFrame.
+
+    :param df: The input DataFrame
+    :param orientation: Whether to calculate contribution per row or per column
+    :param columns: Columns to calculate percentages on
+    :param time_shifts: Optional time shift columns
+    :param rename_columns: New column names
+    :param totals: External totals per column (from full dataset)
+    """
+    if totals is None:
+        raise InvalidPostProcessingError(
+            "Missing `totals` for contribution_with_totals"
+        )
+
+    contribution_df = df.copy()
+    numeric_df = contribution_df.select_dtypes(include=["number", Decimal])
+    numeric_df.fillna(0, inplace=True)
+
+    if columns:
+        numeric_columns = numeric_df.columns.tolist()
+        for col in columns:
+            if col not in numeric_columns:
+                raise InvalidPostProcessingError(
+                    _(
+                        'Column "%(column)s" does not exist in the query 
results.',  # noqa: F823
+                        column=col,
+                    )
+                )
+    actual_columns = columns or numeric_df.columns.tolist()
+    rename_columns = rename_columns or [f"%{col}" for col in actual_columns]
+
+    if len(rename_columns) != len(actual_columns):
+        raise InvalidPostProcessingError(
+            _("`rename_columns` must match length of `columns`.")
+        )
+
+    numeric_df_view = numeric_df[actual_columns]
+
+    if orientation == PostProcessingContributionOrientation.COLUMN:
+        for i, col in enumerate(actual_columns):
+            total = totals.get(col)
+            rename_col = rename_columns[i]
+            if total is None:
+                raise InvalidPostProcessingError(_(f"Missing total for column 
`{col}`"))
+            if total == 0:
+                contribution_df[rename_col] = 0
+            else:
+                contribution_df[rename_col] = numeric_df_view[col] / total
+        return contribution_df
+
+    result = get_column_groups(numeric_df_view, time_shifts, rename_columns)
+    calculate_row_contribution(
+        contribution_df, result["non_time_shift"][0], 
result["non_time_shift"][1]
+    )
+    for __, (cols, renames) in result["time_shifts"].items():
+        calculate_row_contribution(contribution_df, cols, renames)
+
+    return contribution_df
+
+
+def get_column_groups(
+    df: DataFrame, time_shifts: list[str] | None, rename_columns: list[str]
+) -> dict[str, Any]:
+    result: dict[str, Any] = {
+        "non_time_shift": ([], []),
+        "time_shifts": {},
+    }
+    for i, col in enumerate(df.columns):
+        col_0 = col[0] if isinstance(df.columns, MultiIndex) else col
+        time_shift = None
+        if time_shifts and isinstance(col_0, str):
+            for ts in time_shifts:
+                if col_0.endswith(ts):
+                    time_shift = ts
+                    break

Review Comment:
   ### Inefficient Time Shift Lookup <sub>![category 
Performance](https://img.shields.io/badge/Performance-4f46e5)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   Linear search through time_shifts list for each column when checking 
endswith condition.
   
   
   ###### Why this matters
   O(n*m) complexity where n is number of columns and m is number of time 
shifts. For large datasets with many time shift options, this becomes 
inefficient.
   
   ###### Suggested change ∙ *Feature Preview*
   Convert time_shifts to a set for O(1) lookup and use any() with generator 
expression:
   ```python
   time_shifts_set = set(time_shifts) if time_shifts else set()
   time_shift = next((ts for ts in time_shifts_set if col_0.endswith(ts)), None)
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/e9428c1a-0d0a-426c-9a33-7bf07aabec12/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/e9428c1a-0d0a-426c-9a33-7bf07aabec12?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/e9428c1a-0d0a-426c-9a33-7bf07aabec12?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/e9428c1a-0d0a-426c-9a33-7bf07aabec12?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/e9428c1a-0d0a-426c-9a33-7bf07aabec12)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:42b12479-6747-46ce-915d-a8ef0a82180d -->
   
   
   [](42b12479-6747-46ce-915d-a8ef0a82180d)



##########
superset/common/query_context_processor.py:
##########
@@ -741,24 +741,73 @@ def get_data(
 
         return df.to_dict(orient="records")
 
-    def get_payload(
+    def get_payload(  # noqa: C901
         self,
         cache_query_context: bool | None = False,
         force_cached: bool = False,
     ) -> dict[str, Any]:
         """Returns the query results with both metadata and data"""
+        totals_data = None
+        main_query = self._query_context.queries[0]
+        totals_query_index = None
+        totals_result = None
+        for i, query_obj in enumerate(self._query_context.queries):
+            if (
+                query_obj.columns == []
+                and not query_obj.post_processing
+                and query_obj.metrics == main_query.metrics
+            ):
+                totals_query_index = i
+                totals_result = get_query_results(

Review Comment:
   ### Inefficient Totals Query Search <sub>![category 
Performance](https://img.shields.io/badge/Performance-4f46e5)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   Inefficient search for totals query by iterating through all queries and 
checking multiple conditions repeatedly.
   
   
   ###### Why this matters
   Linear search through queries with multiple condition checks creates 
unnecessary overhead, especially when the totals query is either at the end of 
the list or doesn't exist.
   
   ###### Suggested change ∙ *Feature Preview*
   Pre-filter queries using a list comprehension or filter to identify the 
totals query in a single pass:
   ```python
   totals_queries = [
       (i, q) for i, q in enumerate(self._query_context.queries)
       if not q.columns and not q.post_processing and q.metrics == 
main_query.metrics
   ]
   totals_query_index, totals_query = totals_queries[0] if totals_queries else 
(None, None)
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35c1cecb-3b02-4fc4-8c99-0096259460aa/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35c1cecb-3b02-4fc4-8c99-0096259460aa?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35c1cecb-3b02-4fc4-8c99-0096259460aa?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35c1cecb-3b02-4fc4-8c99-0096259460aa?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35c1cecb-3b02-4fc4-8c99-0096259460aa)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:01f2ff50-04ac-45ff-83eb-953f7d53e688 -->
   
   
   [](01f2ff50-04ac-45ff-83eb-953f7d53e688)



##########
superset/utils/pandas_postprocessing/contribution_with_totals.py:
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from decimal import Decimal
+from typing import Any
+
+from flask_babel import gettext as _
+from pandas import DataFrame, MultiIndex
+
+from superset.exceptions import InvalidPostProcessingError
+from superset.utils.core import PostProcessingContributionOrientation
+from superset.utils.pandas_postprocessing.utils import validate_column_args
+
+
+@validate_column_args("columns")
+def contribution_with_totals(  # noqa: C901
+    df: DataFrame,
+    orientation: PostProcessingContributionOrientation = 
PostProcessingContributionOrientation.COLUMN,  # noqa: E501
+    columns: list[str] | None = None,
+    time_shifts: list[str] | None = None,
+    rename_columns: list[str] | None = None,
+    totals: dict[str, float] | None = None,
+) -> DataFrame:
+    """
+    Contribution with totals from full dataset.
+
+    Divides values in `columns` by totals provided externally (from auxiliary 
query),
+    instead of computing totals from the current DataFrame.
+
+    :param df: The input DataFrame
+    :param orientation: Whether to calculate contribution per row or per column
+    :param columns: Columns to calculate percentages on
+    :param time_shifts: Optional time shift columns
+    :param rename_columns: New column names
+    :param totals: External totals per column (from full dataset)
+    """
+    if totals is None:
+        raise InvalidPostProcessingError(
+            "Missing `totals` for contribution_with_totals"
+        )
+
+    contribution_df = df.copy()
+    numeric_df = contribution_df.select_dtypes(include=["number", Decimal])
+    numeric_df.fillna(0, inplace=True)

Review Comment:
   ### Forced NA to Zero Conversion <sub>![category 
Functionality](https://img.shields.io/badge/Functionality-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The function fills NA values with 0 before calculation, which might not be 
the desired behavior for all use cases
   
   
   ###### Why this matters
   Missing values could represent meaningful absence of data, and automatically 
converting them to 0 affects the contribution calculations without giving users 
control over NA handling.
   
   ###### Suggested change ∙ *Feature Preview*
   Add an optional parameter for NA handling:
   ```python
   def contribution_with_totals(
       df: DataFrame,
       orientation: PostProcessingContributionOrientation = 
PostProcessingContributionOrientation.COLUMN,
       columns: list[str] | None = None,
       time_shifts: list[str] | None = None,
       rename_columns: list[str] | None = None,
       totals: dict[str, float] | None = None,
       fill_na: float | None = 0,
   ) -> DataFrame:
       # ... existing code ...
       if fill_na is not None:
           numeric_df.fillna(fill_na, inplace=True)
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35b02e69-1be0-422e-af0e-d371a3b155ee/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35b02e69-1be0-422e-af0e-d371a3b155ee?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35b02e69-1be0-422e-af0e-d371a3b155ee?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35b02e69-1be0-422e-af0e-d371a3b155ee?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/35b02e69-1be0-422e-af0e-d371a3b155ee)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:58f772e4-ad19-475b-898d-13fbecd6fbe8 -->
   
   
   [](58f772e4-ad19-475b-898d-13fbecd6fbe8)



##########
superset-frontend/plugins/plugin-chart-table/src/buildQuery.ts:
##########
@@ -139,22 +141,35 @@ const buildQuery: BuildQuery<TableChartFormData> = (
               timeOffsets,
             )
           : percentMetrics.map(getMetricLabel);
+
         const percentMetricLabels = removeDuplicates(
           percentMetricsLabelsWithTimeComparison,
         );
+
         metrics = removeDuplicates(
           metrics.concat(percentMetrics),
           getMetricLabel,
         );
-        postProcessing = [
-          {
+
+        if (calculationMode === 'all_records') {
+          postProcessing.push({
+            operation: 'contribution_with_totals',
+            options: {
+              columns: percentMetricLabels,
+              rename_columns: percentMetricLabels.map(m => `%${m}`),
+              totals: (formData.extra_form_data as Record<string, any>)
+                ?.contribution_totals?.totals,
+            },
+          } as unknown as PostProcessingRule);

Review Comment:
   ### Unsafe Double Type Assertion <sub>![category 
Readability](https://img.shields.io/badge/Readability-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   Double type assertion (as unknown as PostProcessingRule) indicates a 
potential type safety issue and makes the code harder to understand.
   
   
   ###### Why this matters
   Using double type assertions bypasses TypeScript's type checking, making it 
unclear why the assertion is needed and potentially hiding type 
incompatibilities.
   
   ###### Suggested change ∙ *Feature Preview*
   ```typescript
   // Define proper interface for the contribution operation
   interface ContributionWithTotalsRule extends PostProcessingRule {
     operation: 'contribution_with_totals';
     options: {
       columns: string[];
       rename_columns: string[];
       totals: any; // Replace 'any' with proper type
     };
   }
   // Then use it directly
   } as ContributionWithTotalsRule);
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/c7504401-5931-414d-8607-e7508aa8af66/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/c7504401-5931-414d-8607-e7508aa8af66?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/c7504401-5931-414d-8607-e7508aa8af66?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/c7504401-5931-414d-8607-e7508aa8af66?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/c7504401-5931-414d-8607-e7508aa8af66)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:8e44da7d-1ce6-4499-9a63-83b81a48646f -->
   
   
   [](8e44da7d-1ce6-4499-9a63-83b81a48646f)



##########
superset-frontend/plugins/plugin-chart-table/src/controlPanel.tsx:
##########
@@ -161,12 +161,30 @@ const generateComparisonColumns = (colname: string) => [
   `△ ${colname}`,
   `% ${colname}`,
 ];
+
 /**
  * Generate column types for the comparison columns.
  */
 const generateComparisonColumnTypes = (count: number) =>
   Array(count).fill(GenericDataType.Numeric);
 
+const percentMetricCalculationControl: ControlConfig<'SelectControl'> = {
+  type: 'SelectControl',
+  label: t('Percentage metric calculation'),
+  description: t(
+    'Row Limit: percentages are calculated based on the subset of data 
retrieved, respecting the row limit. ' +
+      'All Records: Percentages are calculated based on the total dataset, 
ignoring the row limit.',
+  ),
+  default: 'row_limit',
+  clearable: false,
+  choices: [
+    ['row_limit', t('Row limit')],
+    ['all_records', t('All records')],
+  ],
+  visibility: isAggMode,
+  renderTrigger: false,

Review Comment:
   ### Missing render trigger for percentage calculation mode <sub>![category 
Functionality](https://img.shields.io/badge/Functionality-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   The 'percent_metric_calculation' control is not set to trigger a re-render 
when its value changes, which means the table won't update immediately when 
users switch between 'Row limit' and 'All records' modes.
   
   
   ###### Why this matters
   Users won't see immediate visual feedback when they change the percentage 
calculation mode, leading to confusion about whether their selection was 
applied.
   
   ###### Suggested change ∙ *Feature Preview*
   Set renderTrigger to true in the percentMetricCalculationControl config:
   ```typescript
   const percentMetricCalculationControl: ControlConfig<'SelectControl'> = {
     // ... other config
     renderTrigger: true,
   }
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/61bb0402-25fa-4d4a-b1c1-bc5b578e1357/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/61bb0402-25fa-4d4a-b1c1-bc5b578e1357?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/61bb0402-25fa-4d4a-b1c1-bc5b578e1357?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/61bb0402-25fa-4d4a-b1c1-bc5b578e1357?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/61bb0402-25fa-4d4a-b1c1-bc5b578e1357)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:61308594-60b9-4e6a-9419-72eca5707d1e -->
   
   
   [](61308594-60b9-4e6a-9419-72eca5707d1e)



##########
superset/utils/pandas_postprocessing/contribution_with_totals.py:
##########
@@ -0,0 +1,132 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+from __future__ import annotations
+
+from decimal import Decimal
+from typing import Any
+
+from flask_babel import gettext as _
+from pandas import DataFrame, MultiIndex
+
+from superset.exceptions import InvalidPostProcessingError
+from superset.utils.core import PostProcessingContributionOrientation
+from superset.utils.pandas_postprocessing.utils import validate_column_args
+
+
+@validate_column_args("columns")
+def contribution_with_totals(  # noqa: C901
+    df: DataFrame,
+    orientation: PostProcessingContributionOrientation = 
PostProcessingContributionOrientation.COLUMN,  # noqa: E501
+    columns: list[str] | None = None,
+    time_shifts: list[str] | None = None,
+    rename_columns: list[str] | None = None,
+    totals: dict[str, float] | None = None,
+) -> DataFrame:
+    """
+    Contribution with totals from full dataset.
+
+    Divides values in `columns` by totals provided externally (from auxiliary 
query),
+    instead of computing totals from the current DataFrame.
+
+    :param df: The input DataFrame
+    :param orientation: Whether to calculate contribution per row or per column
+    :param columns: Columns to calculate percentages on
+    :param time_shifts: Optional time shift columns
+    :param rename_columns: New column names
+    :param totals: External totals per column (from full dataset)
+    """
+    if totals is None:
+        raise InvalidPostProcessingError(
+            "Missing `totals` for contribution_with_totals"
+        )
+
+    contribution_df = df.copy()
+    numeric_df = contribution_df.select_dtypes(include=["number", Decimal])
+    numeric_df.fillna(0, inplace=True)
+
+    if columns:
+        numeric_columns = numeric_df.columns.tolist()
+        for col in columns:
+            if col not in numeric_columns:
+                raise InvalidPostProcessingError(
+                    _(
+                        'Column "%(column)s" does not exist in the query 
results.',  # noqa: F823
+                        column=col,
+                    )
+                )
+    actual_columns = columns or numeric_df.columns.tolist()
+    rename_columns = rename_columns or [f"%{col}" for col in actual_columns]
+
+    if len(rename_columns) != len(actual_columns):
+        raise InvalidPostProcessingError(
+            _("`rename_columns` must match length of `columns`.")
+        )
+
+    numeric_df_view = numeric_df[actual_columns]
+
+    if orientation == PostProcessingContributionOrientation.COLUMN:
+        for i, col in enumerate(actual_columns):
+            total = totals.get(col)
+            rename_col = rename_columns[i]
+            if total is None:
+                raise InvalidPostProcessingError(_(f"Missing total for column 
`{col}`"))
+            if total == 0:
+                contribution_df[rename_col] = 0
+            else:
+                contribution_df[rename_col] = numeric_df_view[col] / total
+        return contribution_df
+
+    result = get_column_groups(numeric_df_view, time_shifts, rename_columns)
+    calculate_row_contribution(
+        contribution_df, result["non_time_shift"][0], 
result["non_time_shift"][1]
+    )
+    for __, (cols, renames) in result["time_shifts"].items():
+        calculate_row_contribution(contribution_df, cols, renames)
+
+    return contribution_df
+
+
+def get_column_groups(
+    df: DataFrame, time_shifts: list[str] | None, rename_columns: list[str]
+) -> dict[str, Any]:
+    result: dict[str, Any] = {
+        "non_time_shift": ([], []),
+        "time_shifts": {},
+    }
+    for i, col in enumerate(df.columns):
+        col_0 = col[0] if isinstance(df.columns, MultiIndex) else col
+        time_shift = None
+        if time_shifts and isinstance(col_0, str):
+            for ts in time_shifts:
+                if col_0.endswith(ts):
+                    time_shift = ts
+                    break
+        if time_shift:
+            result["time_shifts"].setdefault(time_shift, ([], []))
+            result["time_shifts"][time_shift][0].append(col)
+            result["time_shifts"][time_shift][1].append(rename_columns[i])
+        else:
+            result["non_time_shift"][0].append(col)
+            result["non_time_shift"][1].append(rename_columns[i])
+    return result
+
+
+def calculate_row_contribution(
+    df: DataFrame, columns: list[str], rename_columns: list[str]
+) -> None:
+    row_sum = df.loc[:, columns].sum(axis=1).replace(0, None)
+    df.loc[:, rename_columns] = df.loc[:, columns].div(row_sum, 
axis=0).fillna(0)

Review Comment:
   ### Suboptimal DataFrame Operations <sub>![category 
Performance](https://img.shields.io/badge/Performance-4f46e5)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   Multiple DataFrame operations with repeated column access and intermediate 
object creation.
   
   
   ###### Why this matters
   Creates unnecessary temporary DataFrames and performs multiple passes over 
the data.
   
   ###### Suggested change ∙ *Feature Preview*
   Combine operations to minimize intermediates:
   ```python
   df.loc[:, rename_columns] = (df.loc[:, columns]
                               .div(df.loc[:, columns].sum(axis=1)
                                    .replace(0, float('inf')), axis=0)
                               .fillna(0))
   ```
   
   
   ###### Provide feedback to improve future suggestions
   [![Nice 
Catch](https://img.shields.io/badge/👍%20Nice%20Catch-71BC78)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/604fa2f0-c2ab-438a-8d1c-1d4fa007c85d/upvote)
 
[![Incorrect](https://img.shields.io/badge/👎%20Incorrect-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/604fa2f0-c2ab-438a-8d1c-1d4fa007c85d?what_not_true=true)
  [![Not in 
Scope](https://img.shields.io/badge/👎%20Out%20of%20PR%20scope-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/604fa2f0-c2ab-438a-8d1c-1d4fa007c85d?what_out_of_scope=true)
 [![Not in coding 
standard](https://img.shields.io/badge/👎%20Not%20in%20our%20standards-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/604fa2f0-c2ab-438a-8d1c-1d4fa007c85d?what_not_in_standard=true)
 
[![Other](https://img.shields.io/badge/👎%20Other-white)](https://app.korbit.ai/feedback/aa91ff46-6083-4491-9416-b83dd1994b51/604fa2f0-c2ab-438a-8d1c-1d4fa007c85d)
   </details>
   
   <sub>
   
   💬 Looking for more details? Reply to this comment to chat with Korbit.
   </sub>
   
   <!--- korbi internal id:8a3e8a46-6c10-4706-b81c-2baef441cf7e -->
   
   
   [](8a3e8a46-6c10-4706-b81c-2baef441cf7e)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org
For additional commands, e-mail: notifications-h...@superset.apache.org

Re: [PR] feat(chart): add toggle for percentage metric calculation mode in Table chart [superset]

Reply via email to