Re: [PR] feat(ci): add GitHub Action to track role permission changes [superset]

via GitHub Wed, 07 Jan 2026 16:08:10 -0800


codeant-ai-for-open-source[bot] commented on code in PR #36961:
URL: https://github.com/apache/superset/pull/36961#discussion_r2670447301



##########
scripts/extract_role_definitions.py:
##########
@@ -0,0 +1,204 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Extract role definitions from the security manager for CI comparison.
+
+This script parses the SupersetSecurityManager class and extracts the constants
+that define role permissions. It outputs JSON that can be compared between
+commits to detect role changes.
+
+Usage:
+    python scripts/extract_role_definitions.py [--output FILE]
+"""
+
+import argparse
+import ast
+import json
+import sys
+from pathlib import Path
+from typing import Any
+
+
+# Constants that define role permissions in SupersetSecurityManager
+ROLE_DEFINING_CONSTANTS = [
+    "READ_ONLY_MODEL_VIEWS",
+    "USER_MODEL_VIEWS",
+    "GAMMA_READ_ONLY_MODEL_VIEWS",
+    "ADMIN_ONLY_VIEW_MENUS",
+    "ALPHA_ONLY_VIEW_MENUS",
+    "ALPHA_ONLY_PMVS",
+    "ADMIN_ONLY_PERMISSIONS",
+    "READ_ONLY_PERMISSION",
+    "ALPHA_ONLY_PERMISSIONS",
+    "OBJECT_SPEC_PERMISSIONS",
+    "ACCESSIBLE_PERMS",
+    "SQLLAB_ONLY_PERMISSIONS",
+    "SQLLAB_EXTRA_PERMISSION_VIEWS",
+    "data_access_permissions",
+]
+
+
+def parse_set_or_tuple(node: ast.expr) -> list[Any]:
+    """Parse an AST node representing a set, frozenset, tuple, or set 
operations."""
+    if isinstance(node, ast.Set):
+        return [parse_element(elt) for elt in node.elts]
+    if isinstance(node, ast.Tuple):
+        return [parse_element(elt) for elt in node.elts]
+    if isinstance(node, ast.Call):
+        # Handle frozenset() or set() calls
+        if isinstance(node.func, ast.Name) and node.func.id in ("frozenset", 
"set"):

Review Comment:
   **Suggestion:** parse_set_or_tuple does not handle list literals (ast.List) 
and misses calls to set/frozenset when the call is written as an attribute 
(e.g., builtins.frozenset or module.frozenset), which will silently drop 
elements and lead to incomplete role definitions; add support for ast.List and 
detect ast.Attribute callees for "set"/"frozenset". [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
       """Parse an AST node representing a set, frozenset, tuple, list, or set 
operations."""
       if isinstance(node, ast.Set):
           return [parse_element(elt) for elt in node.elts]
       if isinstance(node, (ast.Tuple, ast.List)):
           return [parse_element(elt) for elt in node.elts]
       if isinstance(node, ast.Call):
           # Handle frozenset() or set() calls, including qualified names like 
builtins.frozenset
           func_name = None
           if isinstance(node.func, ast.Name):
               func_name = node.func.id
           elif isinstance(node.func, ast.Attribute):
               func_name = node.func.attr
           if func_name in ("frozenset", "set"):
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   Correct: the current implementation will silently drop list literals and 
qualified calls
   like builtins.frozenset(...) because it only handles ast.Tuple and ast.Call 
with ast.Name
   callees. Adding ast.List and checking ast.Attribute.func.attr for 
"set"/"frozenset"
   fixes a real omission and prevents incomplete extraction of role elements.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** scripts/extract_role_definitions.py
   **Line:** 56:63
   **Comment:**
        *Possible Bug: parse_set_or_tuple does not handle list literals 
(ast.List) and misses calls to set/frozenset when the call is written as an 
attribute (e.g., builtins.frozenset or module.frozenset), which will silently 
drop elements and lead to incomplete role definitions; add support for ast.List 
and detect ast.Attribute callees for "set"/"frozenset".
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
scripts/compare_role_definitions.py:
##########
@@ -0,0 +1,284 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Compare role definitions between two JSON files and generate a report.
+
+This script compares the output of extract_role_definitions.py between
+two commits to identify changes in role permissions.
+
+Usage:
+    python scripts/compare_role_definitions.py base.json head.json [--format 
FORMAT]
+"""
+
+import argparse
+import json
+import os
+import sys
+from pathlib import Path
+from typing import Any
+
+
+def load_definitions(path: Path) -> dict[str, Any]:
+    """Load role definitions from a JSON file."""
+    return json.loads(path.read_text())
+
+
+def compare_sets(
+    base_values: list[str],
+    head_values: list[str],
+) -> tuple[list[str], list[str], list[str]]:
+    """Compare two lists and return added, removed, and unchanged items."""
+    base_set = set(base_values)
+    head_set = set(head_values)
+
+    added = sorted(head_set - base_set)
+    removed = sorted(base_set - head_set)
+    unchanged = sorted(base_set & head_set)
+
+    return added, removed, unchanged
+
+
+def generate_diff(
+    base: dict[str, Any],
+    head: dict[str, Any],
+) -> dict[str, Any]:
+    """Generate a diff between base and head definitions."""
+    base_constants = base.get("constants", {})
+    head_constants = head.get("constants", {})
+
+    all_keys = sorted(set(base_constants.keys()) | set(head_constants.keys()))
+
+    diff: dict[str, Any] = {
+        "changes": {},
+        "summary": {
+            "total_constants": len(all_keys),
+            "constants_with_changes": 0,
+            "total_added": 0,
+            "total_removed": 0,
+        },
+    }
+
+    for key in all_keys:
+        base_values = base_constants.get(key, [])
+        head_values = head_constants.get(key, [])
+
+        added, removed, _ = compare_sets(base_values, head_values)
+
+        if added or removed:
+            diff["changes"][key] = {
+                "added": added,
+                "removed": removed,
+            }
+            diff["summary"]["constants_with_changes"] += 1
+            diff["summary"]["total_added"] += len(added)
+            diff["summary"]["total_removed"] += len(removed)
+
+    return diff
+
+
+def format_markdown(diff: dict[str, Any]) -> str:
+    """Format the diff as markdown for PR comments."""
+    lines = []
+    summary = diff["summary"]
+    changes = diff["changes"]
+
+    if not changes:
+        lines.append("## Role Definitions: No Changes Detected")
+        lines.append("")
+        lines.append(
+            "No changes to role-defining constants in 
`superset/security/manager.py`."
+        )
+        return "\n".join(lines)
+
+    lines.append("## Role Definitions: Changes Detected")
+    lines.append("")
+    lines.append(
+        f"Found changes in **{summary['constants_with_changes']}** "
+        f"role-defining constant(s):"
+    )
+    lines.append(
+        f"- **{summary['total_added']}** permission(s) added"
+    )
+    lines.append(
+        f"- **{summary['total_removed']}** permission(s) removed"
+    )
+    lines.append("")
+
+    # Map constant names to their role implications
+    role_implications = {
+        "ADMIN_ONLY_VIEW_MENUS": "Admin",
+        "ADMIN_ONLY_PERMISSIONS": "Admin",
+        "ALPHA_ONLY_VIEW_MENUS": "Alpha",
+        "ALPHA_ONLY_PERMISSIONS": "Alpha",
+        "ALPHA_ONLY_PMVS": "Alpha",
+        "GAMMA_READ_ONLY_MODEL_VIEWS": "Gamma",
+        "READ_ONLY_MODEL_VIEWS": "All roles (read-only)",
+        "READ_ONLY_PERMISSION": "All roles (read-only)",
+        "SQLLAB_ONLY_PERMISSIONS": "SQL Lab",
+        "SQLLAB_EXTRA_PERMISSION_VIEWS": "SQL Lab",
+        "ACCESSIBLE_PERMS": "All authenticated users",
+        "OBJECT_SPEC_PERMISSIONS": "Data access (user-defined)",
+        "USER_MODEL_VIEWS": "Admin (user management)",
+        "data_access_permissions": "Data access",
+    }
+
+    for const_name, const_changes in sorted(changes.items()):
+        role_hint = role_implications.get(const_name, "Various")
+        lines.append(f"### `{const_name}` (affects: {role_hint})")
+        lines.append("")
+
+        if const_changes["added"]:
+            lines.append("**Added:**")
+            for item in const_changes["added"]:
+                lines.append(f"- `{item}`")
+            lines.append("")
+
+        if const_changes["removed"]:
+            lines.append("**Removed:**")
+            for item in const_changes["removed"]:
+                lines.append(f"- `{item}`")
+            lines.append("")
+
+    lines.append("---")
+    lines.append("")
+    lines.append(
+        "*This report was generated by the role tracking workflow. "
+        "Please review these changes carefully as they affect user 
permissions.*"
+    )
+
+    return "\n".join(lines)
+
+
+def format_json(diff: dict[str, Any]) -> str:
+    """Format the diff as JSON."""
+    return json.dumps(diff, indent=2, sort_keys=True)
+
+
+def format_text(diff: dict[str, Any]) -> str:
+    """Format the diff as plain text."""
+    lines = []
+    summary = diff["summary"]
+    changes = diff["changes"]
+
+    if not changes:
+        lines.append("No changes to role definitions detected.")
+        return "\n".join(lines)
+
+    lines.append("Role Definition Changes")
+    lines.append("=" * 50)
+    lines.append(f"Constants with changes: 
{summary['constants_with_changes']}")
+    lines.append(f"Total added: {summary['total_added']}")
+    lines.append(f"Total removed: {summary['total_removed']}")
+    lines.append("")
+
+    for const_name, const_changes in sorted(changes.items()):
+        lines.append(f"{const_name}:")
+        if const_changes["added"]:
+            lines.append("  Added:")
+            for item in const_changes["added"]:
+                lines.append(f"    + {item}")
+        if const_changes["removed"]:
+            lines.append("  Removed:")
+            for item in const_changes["removed"]:
+                lines.append(f"    - {item}")
+        lines.append("")
+
+    return "\n".join(lines)
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description="Compare role definitions between two commits"
+    )
+    parser.add_argument(
+        "base",
+        type=str,
+        help="Path to base (before) definitions JSON",
+    )
+    parser.add_argument(
+        "head",
+        type=str,
+        help="Path to head (after) definitions JSON",
+    )
+    parser.add_argument(
+        "--format",
+        "-f",
+        type=str,
+        choices=["markdown", "json", "text"],
+        default="markdown",
+        help="Output format (default: markdown)",
+    )
+    parser.add_argument(
+        "--output",
+        "-o",
+        type=str,
+        help="Output file path (default: stdout)",
+    )
+    parser.add_argument(
+        "--github-output",
+        action="store_true",
+        help="Write to GITHUB_OUTPUT for GitHub Actions",
+    )
+    args = parser.parse_args()
+
+    base_path = Path(args.base)
+    head_path = Path(args.head)
+
+    if not base_path.exists():
+        print(f"Error: Base file not found: {base_path}")
+        return 1
+    if not head_path.exists():
+        print(f"Error: Head file not found: {head_path}")
+        return 1
+
+    try:
+        base = load_definitions(base_path)
+        head = load_definitions(head_path)
+    except json.JSONDecodeError as e:
+        print(f"Error parsing JSON: {e}")
+        return 1
+
+    diff = generate_diff(base, head)
+
+    formatters = {
+        "markdown": format_markdown,
+        "json": format_json,
+        "text": format_text,
+    }
+    output = formatters[args.format](diff)
+
+    if args.output:
+        Path(args.output).write_text(output)
+    else:
+        print(output)
+
+    # Write GitHub Actions output if requested
+    if args.github_output:
+        github_output = os.environ.get("GITHUB_OUTPUT")
+        if github_output:
+            has_changes = "true" if diff["changes"] else "false"
+            with open(github_output, "a") as f:
+                f.write(f"has_changes={has_changes}\n")
+                f.write(f"added_count={diff['summary']['total_added']}\n")
+                f.write(f"removed_count={diff['summary']['total_removed']}\n")
+
+    # Return non-zero if there are changes (useful for CI)
+    return 0

Review Comment:
   **Suggestion:** The script always returns 0 from main even when there are 
changes; CI/consumers expect a non-zero exit code when changes are present. 
Change the final return to return a non-zero code when `diff["changes"]` is 
non-empty. [logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
       return 1 if diff["changes"] else 0
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   The code currently always returns 0 despite the comment claiming it should 
return non‑zero on changes.
   Changing to `return 1 if diff["changes"] else 0` fixes a real logic bug and 
aligns behavior with the comment/intent.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** scripts/compare_role_definitions.py
   **Line:** 280:280
   **Comment:**
        *Logic Error: The script always returns 0 from main even when there are 
changes; CI/consumers expect a non-zero exit code when changes are present. 
Change the final return to return a non-zero code when `diff["changes"]` is 
non-empty.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
scripts/compare_role_definitions.py:
##########
@@ -0,0 +1,284 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Compare role definitions between two JSON files and generate a report.
+
+This script compares the output of extract_role_definitions.py between
+two commits to identify changes in role permissions.
+
+Usage:
+    python scripts/compare_role_definitions.py base.json head.json [--format 
FORMAT]
+"""
+
+import argparse
+import json
+import os
+import sys
+from pathlib import Path
+from typing import Any
+
+
+def load_definitions(path: Path) -> dict[str, Any]:
+    """Load role definitions from a JSON file."""
+    return json.loads(path.read_text())
+
+
+def compare_sets(
+    base_values: list[str],
+    head_values: list[str],
+) -> tuple[list[str], list[str], list[str]]:
+    """Compare two lists and return added, removed, and unchanged items."""
+    base_set = set(base_values)
+    head_set = set(head_values)
+
+    added = sorted(head_set - base_set)
+    removed = sorted(base_set - head_set)
+    unchanged = sorted(base_set & head_set)
+
+    return added, removed, unchanged
+
+
+def generate_diff(
+    base: dict[str, Any],
+    head: dict[str, Any],
+) -> dict[str, Any]:
+    """Generate a diff between base and head definitions."""
+    base_constants = base.get("constants", {})
+    head_constants = head.get("constants", {})
+
+    all_keys = sorted(set(base_constants.keys()) | set(head_constants.keys()))
+
+    diff: dict[str, Any] = {
+        "changes": {},
+        "summary": {
+            "total_constants": len(all_keys),
+            "constants_with_changes": 0,
+            "total_added": 0,
+            "total_removed": 0,
+        },
+    }
+
+    for key in all_keys:
+        base_values = base_constants.get(key, [])
+        head_values = head_constants.get(key, [])
+
+        added, removed, _ = compare_sets(base_values, head_values)
+
+        if added or removed:
+            diff["changes"][key] = {
+                "added": added,
+                "removed": removed,
+            }
+            diff["summary"]["constants_with_changes"] += 1
+            diff["summary"]["total_added"] += len(added)
+            diff["summary"]["total_removed"] += len(removed)
+
+    return diff
+
+
+def format_markdown(diff: dict[str, Any]) -> str:
+    """Format the diff as markdown for PR comments."""
+    lines = []
+    summary = diff["summary"]
+    changes = diff["changes"]
+
+    if not changes:
+        lines.append("## Role Definitions: No Changes Detected")
+        lines.append("")
+        lines.append(
+            "No changes to role-defining constants in 
`superset/security/manager.py`."
+        )
+        return "\n".join(lines)
+
+    lines.append("## Role Definitions: Changes Detected")
+    lines.append("")
+    lines.append(
+        f"Found changes in **{summary['constants_with_changes']}** "
+        f"role-defining constant(s):"
+    )
+    lines.append(
+        f"- **{summary['total_added']}** permission(s) added"
+    )
+    lines.append(
+        f"- **{summary['total_removed']}** permission(s) removed"
+    )
+    lines.append("")
+
+    # Map constant names to their role implications
+    role_implications = {
+        "ADMIN_ONLY_VIEW_MENUS": "Admin",
+        "ADMIN_ONLY_PERMISSIONS": "Admin",
+        "ALPHA_ONLY_VIEW_MENUS": "Alpha",
+        "ALPHA_ONLY_PERMISSIONS": "Alpha",
+        "ALPHA_ONLY_PMVS": "Alpha",
+        "GAMMA_READ_ONLY_MODEL_VIEWS": "Gamma",
+        "READ_ONLY_MODEL_VIEWS": "All roles (read-only)",
+        "READ_ONLY_PERMISSION": "All roles (read-only)",
+        "SQLLAB_ONLY_PERMISSIONS": "SQL Lab",
+        "SQLLAB_EXTRA_PERMISSION_VIEWS": "SQL Lab",
+        "ACCESSIBLE_PERMS": "All authenticated users",
+        "OBJECT_SPEC_PERMISSIONS": "Data access (user-defined)",
+        "USER_MODEL_VIEWS": "Admin (user management)",
+        "data_access_permissions": "Data access",
+    }
+
+    for const_name, const_changes in sorted(changes.items()):
+        role_hint = role_implications.get(const_name, "Various")
+        lines.append(f"### `{const_name}` (affects: {role_hint})")
+        lines.append("")
+
+        if const_changes["added"]:
+            lines.append("**Added:**")
+            for item in const_changes["added"]:
+                lines.append(f"- `{item}`")
+            lines.append("")
+
+        if const_changes["removed"]:
+            lines.append("**Removed:**")
+            for item in const_changes["removed"]:
+                lines.append(f"- `{item}`")
+            lines.append("")
+
+    lines.append("---")
+    lines.append("")
+    lines.append(
+        "*This report was generated by the role tracking workflow. "
+        "Please review these changes carefully as they affect user 
permissions.*"
+    )
+
+    return "\n".join(lines)
+
+
+def format_json(diff: dict[str, Any]) -> str:
+    """Format the diff as JSON."""
+    return json.dumps(diff, indent=2, sort_keys=True)
+
+
+def format_text(diff: dict[str, Any]) -> str:
+    """Format the diff as plain text."""
+    lines = []
+    summary = diff["summary"]
+    changes = diff["changes"]
+
+    if not changes:
+        lines.append("No changes to role definitions detected.")
+        return "\n".join(lines)
+
+    lines.append("Role Definition Changes")
+    lines.append("=" * 50)
+    lines.append(f"Constants with changes: 
{summary['constants_with_changes']}")
+    lines.append(f"Total added: {summary['total_added']}")
+    lines.append(f"Total removed: {summary['total_removed']}")
+    lines.append("")
+
+    for const_name, const_changes in sorted(changes.items()):
+        lines.append(f"{const_name}:")
+        if const_changes["added"]:
+            lines.append("  Added:")
+            for item in const_changes["added"]:
+                lines.append(f"    + {item}")
+        if const_changes["removed"]:
+            lines.append("  Removed:")
+            for item in const_changes["removed"]:
+                lines.append(f"    - {item}")
+        lines.append("")
+
+    return "\n".join(lines)
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(
+        description="Compare role definitions between two commits"
+    )
+    parser.add_argument(
+        "base",
+        type=str,
+        help="Path to base (before) definitions JSON",
+    )
+    parser.add_argument(
+        "head",
+        type=str,
+        help="Path to head (after) definitions JSON",
+    )
+    parser.add_argument(
+        "--format",
+        "-f",
+        type=str,
+        choices=["markdown", "json", "text"],
+        default="markdown",
+        help="Output format (default: markdown)",
+    )
+    parser.add_argument(
+        "--output",
+        "-o",
+        type=str,
+        help="Output file path (default: stdout)",
+    )
+    parser.add_argument(
+        "--github-output",
+        action="store_true",
+        help="Write to GITHUB_OUTPUT for GitHub Actions",
+    )
+    args = parser.parse_args()
+
+    base_path = Path(args.base)
+    head_path = Path(args.head)
+
+    if not base_path.exists():
+        print(f"Error: Base file not found: {base_path}")
+        return 1
+    if not head_path.exists():
+        print(f"Error: Head file not found: {head_path}")
+        return 1
+
+    try:
+        base = load_definitions(base_path)
+        head = load_definitions(head_path)
+    except json.JSONDecodeError as e:

Review Comment:
   **Suggestion:** Reading the file may raise a UnicodeDecodeError (e.g., 
non-UTF-8 bytes), but the code only catches json.JSONDecodeError; an unhandled 
UnicodeDecodeError will crash the script. Catch UnicodeDecodeError alongside 
JSONDecodeError when loading JSON. [possible bug]
   
   **Severity Level:** Critical 🚨
   ```suggestion
       except (json.JSONDecodeError, UnicodeDecodeError) as e:
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   Path.read_text() can raise UnicodeDecodeError for non-UTF-8 input; the 
current except only catches JSONDecodeError.
   Catching UnicodeDecodeError (or decoding earlier) prevents an uncaught crash 
and provides a sensible error message/exit code.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** scripts/compare_role_definitions.py
   **Line:** 251:251
   **Comment:**
        *Possible Bug: Reading the file may raise a UnicodeDecodeError (e.g., 
non-UTF-8 bytes), but the code only catches json.JSONDecodeError; an unhandled 
UnicodeDecodeError will crash the script. Catch UnicodeDecodeError alongside 
JSONDecodeError when loading JSON.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
scripts/compare_role_definitions.py:
##########
@@ -0,0 +1,284 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Compare role definitions between two JSON files and generate a report.
+
+This script compares the output of extract_role_definitions.py between
+two commits to identify changes in role permissions.
+
+Usage:
+    python scripts/compare_role_definitions.py base.json head.json [--format 
FORMAT]
+"""
+
+import argparse
+import json
+import os
+import sys
+from pathlib import Path
+from typing import Any
+
+
+def load_definitions(path: Path) -> dict[str, Any]:
+    """Load role definitions from a JSON file."""
+    return json.loads(path.read_text())
+
+
+def compare_sets(
+    base_values: list[str],
+    head_values: list[str],
+) -> tuple[list[str], list[str], list[str]]:
+    """Compare two lists and return added, removed, and unchanged items."""
+    base_set = set(base_values)
+    head_set = set(head_values)
+
+    added = sorted(head_set - base_set)
+    removed = sorted(base_set - head_set)
+    unchanged = sorted(base_set & head_set)
+
+    return added, removed, unchanged
+
+
+def generate_diff(
+    base: dict[str, Any],
+    head: dict[str, Any],
+) -> dict[str, Any]:
+    """Generate a diff between base and head definitions."""
+    base_constants = base.get("constants", {})
+    head_constants = head.get("constants", {})
+
+    all_keys = sorted(set(base_constants.keys()) | set(head_constants.keys()))
+
+    diff: dict[str, Any] = {
+        "changes": {},
+        "summary": {
+            "total_constants": len(all_keys),
+            "constants_with_changes": 0,
+            "total_added": 0,
+            "total_removed": 0,
+        },
+    }
+
+    for key in all_keys:
+        base_values = base_constants.get(key, [])
+        head_values = head_constants.get(key, [])

Review Comment:
   **Suggestion:** If a constant key exists in the JSON with a null value, 
`base_constants.get(key, [])` will return None (not the default) and 
`set(None)` will raise a TypeError. Coalesce falsy values to an empty list 
before converting to a set. [type error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
           base_values = base_constants.get(key) or []
           head_values = head_constants.get(key) or []
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   If a key exists with value null in the JSON, .get(key, []) will return None 
and set(None) will raise TypeError.
   Using `.get(key) or []` ensures falsy/None values become an empty list and 
avoids runtime errors.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** scripts/compare_role_definitions.py
   **Line:** 76:77
   **Comment:**
        *Type Error: If a constant key exists in the JSON with a null value, 
`base_constants.get(key, [])` will return None (not the default) and 
`set(None)` will raise a TypeError. Coalesce falsy values to an empty list 
before converting to a set.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
scripts/extract_role_definitions.py:
##########
@@ -0,0 +1,204 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Extract role definitions from the security manager for CI comparison.
+
+This script parses the SupersetSecurityManager class and extracts the constants
+that define role permissions. It outputs JSON that can be compared between
+commits to detect role changes.
+
+Usage:
+    python scripts/extract_role_definitions.py [--output FILE]
+"""
+
+import argparse
+import ast
+import json
+import sys
+from pathlib import Path
+from typing import Any
+
+
+# Constants that define role permissions in SupersetSecurityManager
+ROLE_DEFINING_CONSTANTS = [
+    "READ_ONLY_MODEL_VIEWS",
+    "USER_MODEL_VIEWS",
+    "GAMMA_READ_ONLY_MODEL_VIEWS",
+    "ADMIN_ONLY_VIEW_MENUS",
+    "ALPHA_ONLY_VIEW_MENUS",
+    "ALPHA_ONLY_PMVS",
+    "ADMIN_ONLY_PERMISSIONS",
+    "READ_ONLY_PERMISSION",
+    "ALPHA_ONLY_PERMISSIONS",
+    "OBJECT_SPEC_PERMISSIONS",
+    "ACCESSIBLE_PERMS",
+    "SQLLAB_ONLY_PERMISSIONS",
+    "SQLLAB_EXTRA_PERMISSION_VIEWS",
+    "data_access_permissions",
+]
+
+
+def parse_set_or_tuple(node: ast.expr) -> list[Any]:
+    """Parse an AST node representing a set, frozenset, tuple, or set 
operations."""
+    if isinstance(node, ast.Set):
+        return [parse_element(elt) for elt in node.elts]
+    if isinstance(node, ast.Tuple):
+        return [parse_element(elt) for elt in node.elts]
+    if isinstance(node, ast.Call):
+        # Handle frozenset() or set() calls
+        if isinstance(node.func, ast.Name) and node.func.id in ("frozenset", 
"set"):
+            if node.args:
+                return parse_set_or_tuple(node.args[0])
+        return []
+    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
+        # Handle set union operations like {a} | {b}
+        left = parse_set_or_tuple(node.left)
+        right = parse_set_or_tuple(node.right)
+        return left + right
+    if isinstance(node, ast.Name):
+        # Reference to another constant - return a placeholder
+        return [f"${node.id}"]
+    return []
+
+
+def parse_element(node: ast.expr) -> Any:
+    """Parse a single element from a set or tuple."""
+    if isinstance(node, ast.Constant):
+        return node.value
+    if isinstance(node, ast.Str):  # Python 3.7 compatibility
+        return node.s
+    if isinstance(node, ast.Tuple):
+        return tuple(parse_element(elt) for elt in node.elts)
+    if isinstance(node, ast.Name):
+        return f"${node.id}"
+    return str(ast.dump(node))
+
+
+def extract_class_attributes(
+    class_def: ast.ClassDef,
+    target_names: list[str],
+) -> dict[str, list[Any]]:
+    """Extract specified attributes from a class definition."""
+    results: dict[str, list[Any]] = {}
+
+    for node in class_def.body:
+        if isinstance(node, ast.Assign):
+            for target in node.targets:
+                if isinstance(target, ast.Name) and target.id in target_names:
+                    results[target.id] = sorted(
+                        parse_set_or_tuple(node.value),
+                        key=lambda x: str(x),
+                    )
+        elif isinstance(node, ast.AnnAssign):
+            if (
+                isinstance(node.target, ast.Name)
+                and node.target.id in target_names
+                and node.value
+            ):
+                results[node.target.id] = sorted(
+                    parse_set_or_tuple(node.value),
+                    key=lambda x: str(x),
+                )
+
+    return results
+
+
+def resolve_references(
+    definitions: dict[str, list[Any]],
+) -> dict[str, list[Any]]:
+    """Resolve references to other constants (marked with $)."""
+    resolved: dict[str, list[Any]] = {}
+
+    for name, values in definitions.items():
+        resolved_values = []
+        for value in values:
+            if isinstance(value, str) and value.startswith("$"):
+                ref_name = value[1:]
+                if ref_name in definitions:
+                    resolved_values.extend(definitions[ref_name])
+                else:
+                    resolved_values.append(value)
+            else:
+                resolved_values.append(value)

Review Comment:
   **Suggestion:** resolve_references only resolves one level of "$" references 
and does not recursively expand chain references (or protect against cycles), 
so referenced constants that themselves reference others remain unresolved; 
make resolution recursive with cycle protection to fully expand references. 
[logic error]
   
   **Severity Level:** Minor ⚠️
   ```suggestion
       """Resolve references to other constants (marked with $), recursively 
with cycle protection."""
       resolved: dict[str, list[Any]] = {}
   
       def _resolve_value(val: Any, seen: set[str]) -> list[Any]:
           """Resolve a single value, expanding $refs recursively and 
protecting against cycles."""
           if isinstance(val, str) and val.startswith("$"):
               ref_name = val[1:]
               if ref_name in seen:
                   # Cycle detected; return the placeholder to avoid infinite 
recursion
                   return [val]
               if ref_name not in definitions:
                   return [val]
               # Recurse into the referenced definition
               out: list[Any] = []
               for v in definitions[ref_name]:
                   out.extend(_resolve_value(v, seen | {ref_name}))
               return out
           return [val]
   
       for name in definitions:
           resolved_values: list[Any] = []
           for value in definitions[name]:
               resolved_values.extend(_resolve_value(value, {name}))
           # Normalize to strings, deduplicate and sort for stable output
   ```
   <details>
   <summary><b>Why it matters? ⭐ </b></summary>
   
   Correct and important: the current code only expands $-references one level, 
so chains like A -> $B and B -> $C remain partially unresolved.
   A small recursive resolver with cycle detection fixes this and yields fully 
expanded constants. This changes output semantics in a desirable way.
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** scripts/extract_role_definitions.py
   **Line:** 123:136
   **Comment:**
        *Logic Error: resolve_references only resolves one level of "$" 
references and does not recursively expand chain references (or protect against 
cycles), so referenced constants that themselves reference others remain 
unresolved; make resolution recursive with cycle protection to fully expand 
references.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(ci): add GitHub Action to track role permission changes [superset]

Reply via email to