Re: [PR] feat(docs): auto-generate database documentation from lib.py [superset]

via GitHub Wed, 14 Jan 2026 22:39:19 -0800


codeant-ai-for-open-source[bot] commented on code in PR #36805:
URL: https://github.com/apache/superset/pull/36805#discussion_r2693166969



##########
superset/db_engine_specs/couchbase.py:
##########
@@ -86,6 +87,34 @@ class CouchbaseEngineSpec(BasicParametersMixin, 
BaseEngineSpec):
     )
     parameters_schema = CouchbaseParametersSchema()
 
+    metadata = {
+        "description": (
+            "Couchbase is a distributed NoSQL document database with SQL++ 
support."
+        ),
+        "logo": "couchbase.png",
+        "homepage_url": "https://www.couchbase.com/";,
+        "category": DatabaseCategory.SEARCH_NOSQL,
+        "pypi_packages": ["couchbase-sqlalchemy"],
+        "connection_string": 
"couchbase://{username}:{password}@{host}:{port}?ssl=true",
+        "default_port": 8091,
+        "parameters": {
+            "username": "Couchbase username",
+            "password": "Couchbase password",
+            "host": "Couchbase host or connection string for cloud",
+            "port": "Couchbase port (default 8091)",
+        },

Review Comment:
   **Suggestion:** The Couchbase metadata `parameters` list omits the required 
`database` field even though `validate_parameters` requires `"database"` to be 
present, so the generated documentation will not mention a mandatory parameter 
and can mislead users into configuring incomplete connection details. [logic 
error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Users misled by docs, causing connection validation errors.
   - ⚠️ Generated docs out-of-sync with runtime validation.
   ```
   </details>
   
   ```suggestion
               "database": "Couchbase bucket or database name",
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Open superset/db_engine_specs/couchbase.py and locate the metadata 
"parameters" mapping
   (PR hunk shows it starting at line 100). The mapping currently lists 
username, password,
   host, port — it omits "database".
   
   2. Inspect CouchbaseEngineSpec.validate_parameters (present in the same 
file) which builds
   required = {"host", "username", "password", "database"} and checks for 
missing required
   parameters. This function is defined in this file and will flag "database" 
as required at
   runtime when validating connection properties.
   
   3. Run the docs generator (docs/scripts/generate-database-docs.mjs) which 
uses
   metadata["parameters"] to document required/optional connection fields. 
Because "database"
   is omitted in metadata (lines 100-105), the generated docs won't indicate 
that "database"
   is required.
   
   4. Create a connection guided by the generated docs (e.g., follow the MDX 
page for
   Couchbase). The docs will not instruct the user to provide "database", 
leading them to
   configure a connection missing the database/bucket name. When Superset later 
attempts to
   validate connection properties (which calls 
CouchbaseEngineSpec.validate_parameters in
   this file), the validation will produce a missing-parameter warning/error 
referencing
   "database", causing confusion and a mismatch between docs and runtime 
validation.
   
   Note: This reproduction traces the exact code paths: metadata used by the 
docs generator
   (superset/db_engine_specs/couchbase.py:100-105) and runtime validation in
   CouchbaseEngineSpec.validate_parameters within the same module.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/couchbase.py
   **Line:** 105:105
   **Comment:**
        *Logic Error: The Couchbase metadata `parameters` list omits the 
required `database` field even though `validate_parameters` requires 
`"database"` to be present, so the generated documentation will not mention a 
mandatory parameter and can mislead users into configuring incomplete 
connection details.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



##########
superset/db_engine_specs/gsheets.py:
##########
@@ -103,6 +104,23 @@ class GSheetsEngineSpec(ShillelaghEngineSpec):
     default_driver = "apsw"
     sqlalchemy_uri_placeholder = "gsheets://"
 
+    metadata = {
+        "description": (
+            "Google Sheets allows querying spreadsheets as SQL tables "
+            "via shillelagh."
+        ),
+        "logo": "gsheets.png",
+        "homepage_url": "https://www.google.com/sheets/about/";,
+        "category": DatabaseCategory.CLOUD_GOOGLE,

Review Comment:
   **Suggestion:** The `category` field in the `metadata` dict uses 
`DatabaseCategory.CLOUD_GOOGLE`, but `DatabaseCategory` only defines 
`CLOUD_GCP`, so this will raise an `AttributeError` when the module is imported 
and prevent Superset from starting. [logic error]
   
   <details>
   <summary><b>Severity Level:</b> Critical 🚨</summary>
   
   ```mdx
   - ❌ Superset process fails during startup/import.
   - ❌ Webserver/worker cannot load DB engine specs.
   - ❌ CLI commands importing engine specs error.
   - ⚠️ Docs generation that imports engine specs may fail.
   ```
   </details>
   
   ```suggestion
           "category": DatabaseCategory.CLOUD_GCP,
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. Ensure the repository is checked out with the PR changes applied (the 
final file state
   includes `superset/db_engine_specs/gsheets.py`).
   
   2. Start a Python REPL or import the package that loads DB engine specs, 
e.g. run: `python
   -c "import superset.db_engine_specs.gsheets"` from the repository root. This 
attempts to
   import `gsheets.py` which references DatabaseCategory.
   
   3. During import, `gsheets.py` executes the module-level assignment of 
`metadata` which
   contains the line at `superset/db_engine_specs/gsheets.py:114` ("category":
   DatabaseCategory.CLOUD_GOOGLE,).
   
   4. Python resolves `DatabaseCategory` from 
`superset/db_engine_specs/base.py`. The class
   `DatabaseCategory` in `superset/db_engine_specs/base.py` (see lines 190-199) 
defines
   `CLOUD_GCP` but does not define `CLOUD_GOOGLE`. A read of
   `superset/db_engine_specs/base.py:196-199` shows `CLOUD_GCP = "Cloud - 
Google"`.
   
   5. Because `CLOUD_GOOGLE` is not an attribute on `DatabaseCategory`, 
attempting to access
   `DatabaseCategory.CLOUD_GOOGLE` raises AttributeError at import time, e.g.
   "AttributeError: type object 'DatabaseCategory' has no attribute 
'CLOUD_GOOGLE'".
   
   6. Observed outcome: import fails immediately and the process that attempted 
the import
   (webserver, CLI, or docs generation that imports engine specs) crashes 
before completing
   initialization.
   
   Note: This reproduction uses concrete file locations and line numbers 
discovered via code
   inspection: `gsheets.py:114` (the problematic metadata line) and 
`base.py:190-199`
   (DatabaseCategory definition).
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/db_engine_specs/gsheets.py
   **Line:** 114:114
   **Comment:**
        *Logic Error: The `category` field in the `metadata` dict uses 
`DatabaseCategory.CLOUD_GOOGLE`, but `DatabaseCategory` only defines 
`CLOUD_GCP`, so this will raise an `AttributeError` when the module is imported 
and prevent Superset from starting.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] feat(docs): auto-generate database documentation from lib.py [superset]

Reply via email to