codeant-ai-for-open-source[bot] commented on code in PR #36734:
URL: https://github.com/apache/superset/pull/36734#discussion_r2630620061
########## superset-frontend/src/components/DatabaseSchemaEditor/JoinsList.tsx: ########## @@ -0,0 +1,341 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +import { useState } from 'react'; +import { t, SupersetClient } from '@superset-ui/core'; +import { styled } from '@apache-superset/core/ui'; +import { + Table, + Button, + Space, + Typography, + Popconfirm, + Tag, + message, +} from '@superset-ui/core/components'; +import { EditOutlined, DeleteOutlined, PlusOutlined } from '@ant-design/icons'; +import JoinEditorModal, { + Join, + JoinType, + Cardinality, + Table as TableType, +} from './JoinEditorModal'; + +interface JoinsListProps { + databaseReportId: number; + joins: Join[]; + tables: TableType[]; + onJoinsUpdate?: (joins: Join[]) => void; + editable?: boolean; +} + +const StyledTableContainer = styled.div` + ${({ theme }) => ` + padding: ${theme.sizeUnit * 4}px; + background-color: ${theme.colorBgContainer}; + border-radius: ${theme.borderRadiusSM}px; + `} +`; + +const HeaderContainer = styled.div` + ${({ theme }) => ` + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: ${theme.sizeUnit * 4}px; + `} +`; + +const JoinTypeTag = styled(Tag)<{ joinType: JoinType }>` + ${({ theme, joinType }) => { + const colorMap = { + [JoinType.INNER]: theme.colorSuccess, + [JoinType.LEFT]: theme.colorInfo, + [JoinType.RIGHT]: theme.colorWarning, + [JoinType.FULL]: theme.colorError, + [JoinType.CROSS]: theme.colorTextSecondary, + }; + return ` + background-color: ${colorMap[joinType]}20; + color: ${colorMap[joinType]}; + border-color: ${colorMap[joinType]}; + `; + }} +`; + +const CardinalityTag = styled(Tag)` + ${({ theme }) => ` + background-color: ${theme.colorPrimaryBg}; + color: ${theme.colorPrimary}; + border-color: ${theme.colorPrimary}; + `} +`; + +const JoinsList = ({ + databaseReportId, + joins: initialJoins, + tables, + onJoinsUpdate, + editable = true, +}: JoinsListProps) => { + const [joins, setJoins] = useState<Join[]>(initialJoins); + const [modalVisible, setModalVisible] = useState(false); + const [editingJoin, setEditingJoin] = useState<Join | null>(null); + const [loading, setLoading] = useState(false); + + const handleAddJoin = () => { + setEditingJoin(null); + setModalVisible(true); + }; + + const handleEditJoin = (join: Join) => { + setEditingJoin(join); + setModalVisible(true); + }; + + const handleDeleteJoin = async (joinId: number) => { + setLoading(true); + try { + const response = await SupersetClient.delete({ + endpoint: `/api/v1/datasource_analyzer/report/${databaseReportId}/join/${joinId}`, + }); + + if (response.ok) { + const updatedJoins = joins.filter(j => j.id !== joinId); + setJoins(updatedJoins); + onJoinsUpdate?.(updatedJoins); + message.success(t('Join deleted 
successfully')); + } else { + throw new Error('Failed to delete join'); + } Review Comment: **Suggestion:** The delete handler checks `response.ok` on the value returned by `SupersetClient.delete`, but `SupersetClient.delete` returns a SupersetClientResponse object (not a raw fetch Response), so `ok` is always undefined and the success branch is never taken, causing deletion to be treated as a failure even when the backend succeeds; instead, you should rely on the promise rejecting on error and treat any resolved call as success. [logic error] **Severity Level:** Minor ⚠️ ```suggestion await SupersetClient.delete({ endpoint: `/api/v1/datasource_analyzer/report/${databaseReportId}/join/${joinId}`, }); const updatedJoins = joins.filter(j => j.id !== joinId); setJoins(updatedJoins); onJoinsUpdate?.(updatedJoins); message.success(t('Join deleted successfully')); ``` <details> <summary><b>Why it matters? ⭐ </b></summary> The suggestion is correct. SupersetClient.delete delegates to SupersetClient.request which returns the parsed result from callApiAndParseWithTimeout and rejects on HTTP errors; it does not return a raw Fetch Response with an `ok` boolean that user code should check. In real code across the repo callers consume the resolved value's `json` or rely on promise rejection for errors. Checking `response.ok` here is therefore unreliable and can cause successful deletes to be treated as failures. The proposed change (await the call and treat resolution as success, letting rejection drive the catch path) fixes the logic bug. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset-frontend/src/components/DatabaseSchemaEditor/JoinsList.tsx **Line:** 114:125 **Comment:** *Logic Error: The delete handler checks `response.ok` on the value returned by `SupersetClient.delete`, but `SupersetClient.delete` returns a SupersetClientResponse object (not a raw fetch Response), so `ok` is always undefined and the success branch is never taken, causing deletion to be treated as a failure even when the backend succeeds; instead, you should rely on the promise rejecting on error and treat any resolved call as success. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset-frontend/src/components/DatabaseSchemaEditor/JoinEditorModal.tsx: ########## @@ -0,0 +1,356 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +import { useState, useEffect, useMemo } from 'react'; +import { t } from '@superset-ui/core'; +import { styled } from '@apache-superset/core/ui'; +import { + Modal, + Select, + Input, + Form, + Space, Review Comment: **Suggestion:** The `Space` component is imported but never used in this file, which is dead code and can lead to unnecessary bundle size and potential confusion about intended layout behavior. [code quality] **Severity Level:** Minor ⚠️ ```suggestion ``` <details> <summary><b>Why it matters? ⭐ </b></summary> The import list in the PR includes Space (lines 22-30 in the new hunk) but the component is not used anywhere in the current file state. Removing the unused import is a harmless code-quality cleanup that reduces minor bundle noise and avoids confusion about intended layout. It's a straightforward improvement. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset-frontend/src/components/DatabaseSchemaEditor/JoinEditorModal.tsx **Line:** 27:27 **Comment:** *Code Quality: The `Space` component is imported but never used in this file, which is dead code and can lead to unnecessary bundle size and potential confusion about intended layout behavior. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/databases/analyzer_api.py: ########## @@ -304,3 +304,546 @@ def get_report(self, report_id: int) -> Response: except Exception as e: logger.exception("Error retrieving report") return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/table/<int:table_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_table", + log_to_statsd=True, + ) + def update_table(self, report_id: int, table_id: int) -> Response: + """Update table description. 
+ --- + put: + summary: Update table description + description: Update the AI-generated or user-edited description for a table + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: table_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the table + responses: + 200: + description: Table updated successfully + 404: + description: Table not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedTable + + data = request.json or {} + + table = ( + db.session.query(AnalyzedTable) + .filter_by(id=table_id, report_id=report_id) + .first() + ) + + if not table: + return self.response_404(message="Table not found") + + table.ai_description = data.get("description") + db.session.commit() + + return self.response( + 200, + id=table.id, + name=table.table_name, + description=table.ai_description, + ) + + except Exception as e: + logger.exception("Error updating table") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/column/<int:column_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_column", + log_to_statsd=True, + ) + def update_column(self, report_id: int, column_id: int) -> Response: + """Update column description. + --- + put: + summary: Update column description + description: Update the AI-generated or user-edited description for a column + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: column_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the column + responses: + 200: + description: Column updated successfully + 404: + description: Column not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedColumn, AnalyzedTable + + data = request.json or {} + + # Verify column belongs to a table in this report + column = ( + db.session.query(AnalyzedColumn) + .join(AnalyzedTable) + .filter( + AnalyzedColumn.id == column_id, + AnalyzedTable.report_id == report_id, + ) + .first() + ) + + if not column: + return self.response_404(message="Column not found") + + column.ai_description = data.get("description") + db.session.commit() + + return self.response( + 200, + id=column.id, + name=column.column_name, + description=column.ai_description, + ) + + except Exception as e: + logger.exception("Error updating column") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/join", methods=("POST",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.create_join", + log_to_statsd=True, + ) + def create_join(self, report_id: int) -> Response: + """Create a new join relationship. 
+ --- + post: + summary: Create join relationship + description: Create a new join relationship between tables + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + required: [source_table_id, target_table_id, source_columns, target_columns, join_type, cardinality] + properties: + source_table_id: + type: integer + target_table_id: + type: integer + source_columns: + type: array + items: + type: string + target_columns: + type: array + items: + type: string + join_type: + type: string + enum: [inner, left, right, full, cross] + cardinality: + type: string + enum: ["1:1", "1:N", "N:1", "N:M"] + semantic_context: + type: string + responses: + 201: + description: Join created successfully + 400: + description: Bad request + 404: + description: Report or table not found + 500: + description: Internal server error + """ + try: + import json + from superset.models.database_analyzer import ( + AnalyzedTable, + InferredJoin, + JoinType, + Cardinality, + ) + + data = request.json or {} + Review Comment: **Suggestion:** In `create_join`, missing required fields or a non-object JSON body will cause KeyError/TypeError when indexing `data[...]`, which is caught and returned as a 500 internal server error instead of a 400 bad request, contradicting the endpoint's documented behavior for invalid input. [logic error] **Severity Level:** Minor ⚠️ ```suggestion if not isinstance(data, dict): return self.response_400(message="Request body must be a JSON object") required_keys = [ "source_table_id", "target_table_id", "source_columns", "target_columns", "join_type", "cardinality", ] missing_keys = [key for key in required_keys if key not in data] if missing_keys: return self.response_400( message=f"Missing required fields: {', '.join(missing_keys)}" ) ``` <details> <summary><b>Why it matters? ⭐ </b></summary> Accurate. The handler indexes into data with required keys without validating shape or presence, so a missing key or non-dict body will raise KeyError/TypeError and lead to a 500. Explicitly checking that request.json is a dict and that required fields exist (and ideally validating types) fixes a real input-validation bug and matches the documented 400 responses. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/databases/analyzer_api.py **Line:** 528:528 **Comment:** *Logic Error: In `create_join`, missing required fields or a non-object JSON body will cause KeyError/TypeError when indexing `data[...]`, which is caught and returned as a 500 internal server error instead of a 400 bad request, contradicting the endpoint's documented behavior for invalid input. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. 
``` </details> ########## superset/databases/analyzer_api.py: ########## @@ -304,3 +304,546 @@ def get_report(self, report_id: int) -> Response: except Exception as e: logger.exception("Error retrieving report") return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/table/<int:table_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_table", + log_to_statsd=True, + ) + def update_table(self, report_id: int, table_id: int) -> Response: + """Update table description. + --- + put: + summary: Update table description + description: Update the AI-generated or user-edited description for a table + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: table_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the table + responses: + 200: + description: Table updated successfully + 404: + description: Table not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedTable + + data = request.json or {} + + table = ( + db.session.query(AnalyzedTable) + .filter_by(id=table_id, report_id=report_id) + .first() + ) + + if not table: + return self.response_404(message="Table not found") + + table.ai_description = data.get("description") + db.session.commit() + + return self.response( + 200, + id=table.id, + name=table.table_name, + description=table.ai_description, + ) + + except Exception as e: + logger.exception("Error updating table") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/column/<int:column_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_column", + log_to_statsd=True, + ) + def update_column(self, report_id: int, column_id: int) -> Response: + """Update column description. 
+ --- + put: + summary: Update column description + description: Update the AI-generated or user-edited description for a column + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: column_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the column + responses: + 200: + description: Column updated successfully + 404: + description: Column not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedColumn, AnalyzedTable + + data = request.json or {} + + # Verify column belongs to a table in this report + column = ( + db.session.query(AnalyzedColumn) + .join(AnalyzedTable) + .filter( + AnalyzedColumn.id == column_id, + AnalyzedTable.report_id == report_id, + ) + .first() + ) + + if not column: + return self.response_404(message="Column not found") + + column.ai_description = data.get("description") + db.session.commit() + + return self.response( + 200, + id=column.id, + name=column.column_name, + description=column.ai_description, + ) + + except Exception as e: + logger.exception("Error updating column") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/join", methods=("POST",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.create_join", + log_to_statsd=True, + ) + def create_join(self, report_id: int) -> Response: + """Create a new join relationship. + --- + post: + summary: Create join relationship + description: Create a new join relationship between tables + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + required: [source_table_id, target_table_id, source_columns, target_columns, join_type, cardinality] + properties: + source_table_id: + type: integer + target_table_id: + type: integer + source_columns: + type: array + items: + type: string + target_columns: + type: array + items: + type: string + join_type: + type: string + enum: [inner, left, right, full, cross] + cardinality: + type: string + enum: ["1:1", "1:N", "N:1", "N:M"] + semantic_context: + type: string + responses: + 201: + description: Join created successfully + 400: + description: Bad request + 404: + description: Report or table not found + 500: + description: Internal server error + """ + try: + import json + from superset.models.database_analyzer import ( + AnalyzedTable, + InferredJoin, + JoinType, + Cardinality, + ) + + data = request.json or {} + + # Verify report exists + report = db.session.query(DatabaseSchemaReport).get(report_id) + if not report: + return self.response_404(message="Report not found") + + # Verify tables belong to this report + source_table = ( + db.session.query(AnalyzedTable) + .filter_by(id=data["source_table_id"], report_id=report_id) + .first() + ) + target_table = ( + db.session.query(AnalyzedTable) + .filter_by(id=data["target_table_id"], report_id=report_id) + .first() + ) + + if not source_table or not target_table: + return self.response_404(message="Table not found") + + # Create join + join = InferredJoin( + report_id=report_id, + source_table_id=data["source_table_id"], + target_table_id=data["target_table_id"], 
+ source_columns=json.dumps(data["source_columns"]), + target_columns=json.dumps(data["target_columns"]), + join_type=JoinType(data["join_type"]), + cardinality=Cardinality(data["cardinality"]), + semantic_context=data.get("semantic_context"), + ) + + db.session.add(join) + db.session.commit() + + return self.response( + 201, + id=join.id, + source_table=source_table.table_name, + source_columns=data["source_columns"], + target_table=target_table.table_name, + target_columns=data["target_columns"], + join_type=join.join_type.value, + cardinality=join.cardinality.value, + semantic_context=join.semantic_context, + ) + + except Exception as e: + logger.exception("Error creating join") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/join/<int:join_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_join", + log_to_statsd=True, + ) + def update_join(self, report_id: int, join_id: int) -> Response: + """Update a join relationship. + --- + put: + summary: Update join relationship + description: Update an existing join relationship + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: join_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + source_table_id: + type: integer + target_table_id: + type: integer + source_columns: + type: array + items: + type: string + target_columns: + type: array + items: + type: string + join_type: + type: string + enum: [inner, left, right, full, cross] + cardinality: + type: string + enum: ["1:1", "1:N", "N:1", "N:M"] + semantic_context: + type: string + responses: + 200: + description: Join updated successfully + 404: + description: Join not found + 500: + description: Internal server error + """ + try: + import json + from superset.models.database_analyzer import ( + AnalyzedTable, + InferredJoin, + JoinType, + Cardinality, + ) + + data = request.json or {} + + join = ( + db.session.query(InferredJoin) + .filter_by(id=join_id, report_id=report_id) + .first() + ) + + if not join: + return self.response_404(message="Join not found") + + # Update fields if provided + if "source_table_id" in data: + source_table = ( + db.session.query(AnalyzedTable) + .filter_by(id=data["source_table_id"], report_id=report_id) + .first() + ) + if not source_table: + return self.response_404(message="Source table not found") + join.source_table_id = data["source_table_id"] + + if "target_table_id" in data: + target_table = ( + db.session.query(AnalyzedTable) + .filter_by(id=data["target_table_id"], report_id=report_id) + .first() + ) + if not target_table: + return self.response_404(message="Target table not found") + join.target_table_id = data["target_table_id"] + + if "source_columns" in data: + join.source_columns = json.dumps(data["source_columns"]) + if "target_columns" in data: + join.target_columns = json.dumps(data["target_columns"]) + if "join_type" in data: + join.join_type = JoinType(data["join_type"]) + if "cardinality" in data: + join.cardinality = Cardinality(data["cardinality"]) + if "semantic_context" in data: Review Comment: **Suggestion:** In `update_join`, if the client supplies an invalid `join_type` or `cardinality` value, constructing the Enum (`JoinType(...)` or `Cardinality(...)`) will raise a ValueError that is 
caught and returned as a 500 error rather than a 400 indicating invalid input. [logic error] **Severity Level:** Minor ⚠️ ```suggestion try: join.join_type = JoinType(data["join_type"]) except ValueError: return self.response_400(message="Invalid join_type value") if "cardinality" in data: try: join.cardinality = Cardinality(data["cardinality"]) except ValueError: return self.response_400(message="Invalid cardinality value") ``` <details> <summary><b>Why it matters? ⭐ </b></summary> Correct: constructing the Enum with an invalid value will raise ValueError which is currently caught by the broad except and turned into a 500. Catching ValueError and returning 400 gives a clearer client-facing error and preserves server-side logging for unexpected exceptions. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/databases/analyzer_api.py **Line:** 688:691 **Comment:** *Logic Error: In `update_join`, if the client supplies an invalid `join_type` or `cardinality` value, constructing the Enum (`JoinType(...)` or `Cardinality(...)`) will raise a ValueError that is caught and returned as a 500 error rather than a 400 indicating invalid input. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/databases/analyzer_api.py: ########## @@ -304,3 +304,546 @@ def get_report(self, report_id: int) -> Response: except Exception as e: logger.exception("Error retrieving report") return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/table/<int:table_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_table", + log_to_statsd=True, + ) + def update_table(self, report_id: int, table_id: int) -> Response: + """Update table description. 
+ --- + put: + summary: Update table description + description: Update the AI-generated or user-edited description for a table + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: table_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the table + responses: + 200: + description: Table updated successfully + 404: + description: Table not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedTable + + data = request.json or {} + + table = ( + db.session.query(AnalyzedTable) + .filter_by(id=table_id, report_id=report_id) + .first() + ) + + if not table: + return self.response_404(message="Table not found") + + table.ai_description = data.get("description") + db.session.commit() + + return self.response( + 200, + id=table.id, + name=table.table_name, + description=table.ai_description, + ) + + except Exception as e: + logger.exception("Error updating table") + db.session.rollback() + return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/column/<int:column_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_column", + log_to_statsd=True, + ) + def update_column(self, report_id: int, column_id: int) -> Response: + """Update column description. + --- + put: + summary: Update column description + description: Update the AI-generated or user-edited description for a column + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: column_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the column + responses: + 200: + description: Column updated successfully + 404: + description: Column not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedColumn, AnalyzedTable + + data = request.json or {} + Review Comment: **Suggestion:** In `update_column`, a JSON body that is not an object (e.g., a list) will be assigned to `data`, and `data.get("description")` will raise an AttributeError, resulting in a 500 response instead of clearly returning a 400 for an invalid payload shape. [logic error] **Severity Level:** Minor ⚠️ ```suggestion if not isinstance(data, dict): return self.response_400(message="Request body must be a JSON object") ``` <details> <summary><b>Why it matters? ⭐ </b></summary> Same issue as the table update handler: data may not be a dict and later using data.get will raise AttributeError and return a 500. Defensively validating the request body shape and returning 400 is appropriate and fixes a real validation bug. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/databases/analyzer_api.py **Line:** 429:429 **Comment:** *Logic Error: In `update_column`, a JSON body that is not an object (e.g., a list) will be assigned to `data`, and `data.get("description")` will raise an AttributeError, resulting in a 500 response instead of clearly returning a 400 for an invalid payload shape. 
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset/databases/analyzer_api.py: ########## @@ -304,3 +304,546 @@ def get_report(self, report_id: int) -> Response: except Exception as e: logger.exception("Error retrieving report") return self.response_500(message=str(e)) + + @expose("/report/<int:report_id>/table/<int:table_id>", methods=("PUT",)) + @protect() + @safe + @statsd_metrics + @requires_json + @event_logger.log_this_with_context( + action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.update_table", + log_to_statsd=True, + ) + def update_table(self, report_id: int, table_id: int) -> Response: + """Update table description. + --- + put: + summary: Update table description + description: Update the AI-generated or user-edited description for a table + parameters: + - in: path + name: report_id + required: true + schema: + type: integer + - in: path + name: table_id + required: true + schema: + type: integer + requestBody: + required: true + content: + application/json: + schema: + type: object + properties: + description: + type: string + description: New description for the table + responses: + 200: + description: Table updated successfully + 404: + description: Table not found + 500: + description: Internal server error + """ + try: + from superset.models.database_analyzer import AnalyzedTable + + data = request.json or {} + Review Comment: **Suggestion:** In `update_table`, if the JSON body is not an object (for example a JSON list or literal), `data = request.json or {}` will assign a non-dict to `data`, and `data.get("description")` will raise an AttributeError, which is caught and returned as a 500 error instead of a 400 for an invalid request body. [logic error] **Severity Level:** Minor ⚠️ ```suggestion if not isinstance(data, dict): return self.response_400(message="Request body must be a JSON object") ``` <details> <summary><b>Why it matters? ⭐ </b></summary> The diagnosis is accurate: request.json can be a non-dict (list, string, etc.). The code later calls data.get("description") which will raise AttributeError for non-dict inputs and end up as a 500. Returning a 400 with an explicit check for dict shape is the correct behavior and matches the endpoint contract. This is a real validation bug, not a cosmetic nit. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset/databases/analyzer_api.py **Line:** 356:356 **Comment:** *Logic Error: In `update_table`, if the JSON body is not an object (for example a JSON list or literal), `data = request.json or {}` will assign a non-dict to `data`, and `data.get("description")` will raise an AttributeError, which is caught and returned as a 500 error instead of a 400 for an invalid request body. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> ########## superset-frontend/src/components/DatabaseSchemaEditor/JoinEditorModal.tsx: ########## @@ -0,0 +1,356 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +import { useState, useEffect, useMemo } from 'react'; +import { t } from '@superset-ui/core'; +import { styled } from '@apache-superset/core/ui'; +import { + Modal, + Select, + Input, + Form, + Space, + Typography, + Alert, +} from '@superset-ui/core/components'; + +export enum JoinType { + INNER = 'inner', + LEFT = 'left', + RIGHT = 'right', + FULL = 'full', + CROSS = 'cross', +} + +export enum Cardinality { + ONE_TO_ONE = '1:1', + ONE_TO_MANY = '1:N', + MANY_TO_ONE = 'N:1', + MANY_TO_MANY = 'N:M', +} + +export interface Table { + id: number; + name: string; + columns?: Column[]; +} + +export interface Column { + id: number; + name: string; + type: string; +} + +export interface Join { + id?: number; + source_table: string; + source_table_id?: number; + source_columns: string[]; + target_table: string; + target_table_id?: number; + target_columns: string[]; + join_type: JoinType; + cardinality: Cardinality; + semantic_context?: string; +} + +interface JoinEditorModalProps { + visible: boolean; + join?: Join | null; + tables: Table[]; + onSave: (join: Join) => void; + onCancel: () => void; +} + +const StyledForm = styled(Form)` + .ant-form-item { + margin-bottom: ${({ theme }) => theme.sizeUnit * 4}px; + } +`; + +const ColumnSelectionGroup = styled.div` + display: flex; + gap: ${({ theme }) => theme.sizeUnit * 2}px; + align-items: center; +`; + +const JoinEditorModal = ({ + visible, + join, + tables, + onSave, + onCancel, +}: JoinEditorModalProps) => { + const [form] = Form.useForm(); + const [sourceTable, setSourceTable] = useState<string | undefined>( + join?.source_table, + ); + const [targetTable, setTargetTable] = useState<string | undefined>( + join?.target_table, + ); + + useEffect(() => { + if (visible) { + if (join) { + form.setFieldsValue({ + source_table: join.source_table, + source_columns: join.source_columns, + target_table: join.target_table, + target_columns: join.target_columns, + join_type: join.join_type, + cardinality: join.cardinality, + semantic_context: join.semantic_context, + }); + setSourceTable(join.source_table); + setTargetTable(join.target_table); + } else { + form.resetFields(); + setSourceTable(undefined); + setTargetTable(undefined); + } + } + }, [visible, join, form]); + + const sourceTableColumns = useMemo( + () => + tables.find(table => table.name === sourceTable)?.columns || [], + [tables, sourceTable], + ); + + const targetTableColumns = useMemo( + () => + tables.find(table => table.name === targetTable)?.columns || [], + [tables, targetTable], + ); + + const joinTypeOptions = [ + { value: JoinType.INNER, label: t('Inner Join') }, + { value: JoinType.LEFT, label: t('Left Join') }, + { value: JoinType.RIGHT, label: t('Right Join') }, + { value: JoinType.FULL, label: t('Full Outer Join') }, + { value: JoinType.CROSS, label: t('Cross Join') }, + ]; + + const cardinalityOptions = [ + { value: Cardinality.ONE_TO_ONE, label: t('One to One (1:1)') }, + { value: 
Cardinality.ONE_TO_MANY, label: t('One to Many (1:N)') }, + { value: Cardinality.MANY_TO_ONE, label: t('Many to One (N:1)') }, + { value: Cardinality.MANY_TO_MANY, label: t('Many to Many (N:M)') }, + ]; + + const handleSourceTableChange = (value: string) => { + setSourceTable(value); + form.setFieldValue('source_columns', []); + }; + + const handleTargetTableChange = (value: string) => { + setTargetTable(value); + form.setFieldValue('target_columns', []); Review Comment: **Suggestion:** The calls to `form.setFieldValue` are likely incorrect for the Ant Design form instance used in this codebase (which exposes `setFieldsValue`), and will result in a runtime error like "`form.setFieldValue is not a function`" when the source or target table is changed, preventing the dependent column fields from being reset. [possible bug] **Severity Level:** Critical 🚨 ```suggestion form.setFieldsValue({ source_columns: [] }); }; const handleTargetTableChange = (value: string) => { setTargetTable(value); form.setFieldsValue({ target_columns: [] }); ``` <details> <summary><b>Why it matters? ⭐ </b></summary> The PR's file indeed uses form.setFieldValue in the new hunk (lines 156-164). Elsewhere in the same file the code uses form.setFieldsValue (plural) when setting many fields (lines 110-118), which matches Ant Design Form API. Ant Design's Form instance exposes setFieldsValue, not setFieldValue, so the existing code will throw "form.setFieldValue is not a function" at runtime when switching tables and attempting to clear dependent columns. The improved change (calling form.setFieldsValue with an object) fixes a real runtime bug and is minimal and targeted. </details> <details> <summary><b>Prompt for AI Agent 🤖 </b></summary> ```mdx This is a comment left during a code review. **Path:** superset-frontend/src/components/DatabaseSchemaEditor/JoinEditorModal.tsx **Line:** 158:163 **Comment:** *Possible Bug: The calls to `form.setFieldValue` are likely incorrect for the Ant Design form instance used in this codebase (which exposes `setFieldsValue`), and will result in a runtime error like "`form.setFieldValue is not a function`" when the source or target table is changed, preventing the dependent column fields from being reset. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. ``` </details> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
