bito-code-review[bot] commented on code in PR #35983:
URL: https://github.com/apache/superset/pull/35983#discussion_r2490075899


##########
requirements/base.in:
##########
@@ -36,3 +36,6 @@ marshmallow-sqlalchemy>=1.3.0,<1.4.1
 
 # needed for python 3.12 support
 openapi-schema-validator>=0.6.3
+
+# Azure Blob Storage for report attachments
+azure-storage-blob>=12.0.0

Review Comment:
   
   <div>
   
   
   <div id="suggestion">
   <div id="issue"><b>Security vulnerability in azure-storage-blob 
dependency</b></div>
   <div id="fix">
   
   The added azure-storage-blob>=12.0.0 introduces a known security 
vulnerability CVE-2022-30187, which is a CBC Padding Oracle attack that can 
lead to information disclosure and privilege escalation in environments using 
AES-CBC encryption. This affects the `azure-storage-blob` library when handling 
encrypted blobs, potentially compromising report attachments stored in Azure 
Blob Storage. Upgrade to >=12.13.0 to mitigate this risk, as the vulnerability 
is fixed in that version.
   </div>
   
   <details>
   <summary>
   <b>Code suggestion</b>
   </summary>
   <blockquote>Check the AI-generated fix before applying</blockquote>
   <div id="code">
   
   
   ````suggestion
   # Azure Blob Storage for report attachments
   azure-storage-blob>=12.13.0
   ````
   
   </div>
   </details>
   
   
   </div>
   
   
   
   
   <small><i>Code Review Run <a 
href=https://github.com/apache/superset/pull/35983#issuecomment-3485329321>#e937ac</a></i></small>
   </div>
   
   ---
   Should Bito avoid suggestions like this for future reviews? (<a 
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
   - [ ] Yes, avoid them



##########
superset/reports/notifications/email.py:
##########
@@ -147,15 +136,59 @@ def _get_content(self) -> EmailContent:
         else:
             html_table = ""
 
-        img_tags = []
-        for msgid in images.keys():
-            img_tags.append(
-                f"""<div class="image">
-                    <img width="1000" src="cid:{msgid}">
-                </div>
-                """
+        # Handle Azure Blob Storage for attachments

Review Comment:
   
   <div>
   
   
   <div id="suggestion">
   <div id="issue"><b>Breaking change removing email attachments when Azure 
Blob disabled</b></div>
   <div id="fix">
   
   This change removes email attachments and embedded images entirely when 
AZURE_BLOB_REPORTS_ENABLED is False, breaking backward compatibility for users 
not using Azure Blob Storage. Previously, reports included attachments and 
embedded screenshots in emails. Now, when the config is disabled, no 
attachments are sent and images aren't displayed in the email body. This 
impacts the `_get_content` method which is called during email notification 
sending, affecting downstream consumers like report recipients who expect to 
receive attachments. To fix, add conditional logic to preserve the old 
attachment and embedding behavior when Azure Blob is not enabled, ensuring 
users continue to get attachments via email as before.
   </div>
   
   <details>
   <summary>
   <b>Code suggestion</b>
   </summary>
   <blockquote>Check the AI-generated fix before applying</blockquote>
   <div id="code">
   
   
   ```
    -        # Handle Azure Blob Storage for attachments
    -        azure_links_html = ""
    -        if current_app.config.get("AZURE_BLOB_REPORTS_ENABLED", False):
    -            # Upload attachments to Azure Blob Storage
    -            from superset.utils.azure_blob import upload_report_attachments
    -
    -            blob_urls = upload_report_attachments(
    -                csv_data=self._content.csv,
    -                pdf_data=self._content.pdf,
    -                screenshots=self._content.screenshots,
    -                report_name=self._name,
    -            )
    -
    -            # Generate download links section if configured
    -            if blob_urls and current_app.config.get(
    -                "AZURE_BLOB_REPORTS_INCLUDE_LINKS", True
    -            ):
    -                links = []
    -                if "csv" in blob_urls:
    -                    links.append(
    -                        f'<li><a href="{blob_urls["csv"]}">Download 
CSV</a></li>'
    -                    )
    -                if "pdf" in blob_urls:
    -                    links.append(
    -                        f'<li><a href="{blob_urls["pdf"]}">Download 
PDF</a></li>'
    -                    )
    -                if "screenshots" in blob_urls:
    -                    for idx, url in enumerate(blob_urls["screenshots"], 1):
    -                        links.append(
    -                            f'<li><a href="{url}">Download Screenshot 
{idx}</a></li>'
    -                        )
    -
    -                if links:
    -                    azure_links_html = """
    -                    <div style="margin-top: 20px; padding: 15px; 
background-color: #f5f5f5; border-radius: 5px;">
    -                        <h3 style="margin-top: 0; color: #42526E;">Report 
Attachments</h3>
    -                        <ul style="list-style-type: none; padding-left: 
0;">
    -                            {}
    -                        </ul>
    -                        <p style="font-size: 12px; color: #6B778C; 
margin-bottom: 0;">
    -                            <em>Links expire in {} days</em>
    -                        </p>
    -                    </div>
    -                    """.format(
    -                        "".join(links),
    -                        int(
    -                            current_app.config.get(
    -                                "AZURE_BLOB_REPORTS_SAS_EXPIRY_HOURS", 168
    -                            )
    -                            / 24
    -                        ),
    -                    )
    +        # Get the domain from the 'From' address
    +        # and make a message id without the < > in the end
    +        domain = self._get_smtp_domain()
    +        images = {}
    +        img_tag = ""
    +        azure_links_html = ""
    +
    +        if current_app.config.get("AZURE_BLOB_REPORTS_ENABLED", False):
    +            # Handle Azure Blob Storage for attachments
    +            from superset.utils.azure_blob import upload_report_attachments
    +
    +            blob_urls = upload_report_attachments(
    +                csv_data=self._content.csv,
    +                pdf_data=self._content.pdf,
    +                screenshots=self._content.screenshots,
    +                report_name=self._name,
    +            )
    +
    +            # Generate download links section if configured
    +            if blob_urls and current_app.config.get(
    +                "AZURE_BLOB_REPORTS_INCLUDE_LINKS", True
    -            ):
    +                links = []
    +                if "csv" in blob_urls:
    +                    links.append(
    +                        f'<li><a href="{blob_urls["csv"]}">Download 
CSV</a></li>'
    +                    )
    +                if "pdf" in blob_urls:
    +                    links.append(
    +                        f'<li><a href="{blob_urls["pdf"]}">Download 
PDF</a></li>'
    +                    )
    +                if "screenshots" in blob_urls:
    +                    for idx, url in enumerate(blob_urls["screenshots"], 1):
    +                        links.append(
    +                            f'<li><a href="{url">Download Screenshot 
{idx}</a></li>'
    +                        )
    +
    +                if links:
    +                    azure_links_html = """
    +                    <div style="margin-top: 20px; padding: 15px; 
background-color: #f5f5f5; border-radius: 5px;">
    +                        <h3 style="margin-top: 0; color: #42526E;">Report 
Attachments</h3>
    +                        <ul style="list-style-type: none; padding-left: 
0;">
    -                            {}
    +                            {}
    +                        </ul>
    +                        <p style="font-size: 12px; color: #6B778C; 
margin-bottom: 0;">
    +                            <em>Links expire in {} days</em>
    +                        </p>
    +                    </div>
    +                    """.format(
    +                        "".join(links),
    +                        int(
    +                            current_app.config.get(
    +                                "AZURE_BLOB_REPORTS_SAS_EXPIRY_HOURS", 168
    +                            )
    +                            / 24
    +                        ),
    +                    )
    +        else:
    +            # Traditional email with embedded images when Azure Blob is 
disabled
    +            if self._content.screenshots:
    +                images = {
    +                    make_msgid(domain)[1:-1]: screenshot
    +                    for screenshot in self._content.screenshots
    +                }
    +
    +            # Generate image tags for embedded screenshots
    +            if images:
    +                img_tags = []
    +                for cid in images:
    +                    img_tags.append(f'<img src="cid:{cid}" class="image">')
    +                img_tag = "<br>".join(img_tags)
    @@ -217,1 +217,2 @@
    -                {html_table}
    +                {html_table}
    +                {img_tag}
    @@ -203,6 +203,10 @@
    -                   }
    -                  a {
    -                    color: #0052CC;
    -                    text-decoration: none;
    -                  }
    -                  a:hover {
    -                    text-decoration: underline;
    -                   }
    +                   }
    +                  .image{
    +                      margin-bottom: 18px;
    +                      min-width: 1000px;
    +                  }
    +                  a {
    +                    color: #0052CC;
    +                    text-decoration: none;
    +                  }
    +                  a:hover {
    +                    text-decoration: underline;
    +                   }
    @@ -224,6 +224,25 @@
    -        # Attachments are not sent via email - either stored in Azure Blob 
or disabled
    -        return EmailContent(
    -            body=body,
    -            images=None,  # No images attached
    -            pdf=None,  # No PDF attachments
    -            data=None,  # No CSV attachments
    -            header_data=self._content.header_data,
    -        )
    +        # Handle attachments based on Azure Blob configuration
    +        if current_app.config.get("AZURE_BLOB_REPORTS_ENABLED", False):
    +            # Attachments are stored in Azure Blob - don't send via email
    +            return EmailContent(
    +                body=body,
    +                images=None,
    +                pdf=None,
    +                data=None,
    +                header_data=self._content.header_data,
    +            )
    +        else:
    +            # Preserve backward compatibility - send attachments via email
    +            csv_data = None
    +            if self._content.csv:
    +                csv_data = {__("%(name)s.csv", name=self._name): 
self._content.csv}
    +            
    +            pdf_data = None
    +            if self._content.pdf:
    +                pdf_data = {__("%(name)s.pdf", name=self._name): 
self._content.pdf}
    +            
    +            return EmailContent(
    +                body=body,
    +                images=images,
    +                pdf=pdf_data,
    +                data=csv_data,
    +                header_data=self._content.header_data,
    +            )
   ```
   
   </div>
   </details>
   
   
   </div>
   
   
   
   
   <small><i>Code Review Run <a 
href=https://github.com/apache/superset/pull/35983#issuecomment-3485329321>#e937ac</a></i></small>
   </div>
   
   ---
   Should Bito avoid suggestions like this for future reviews? (<a 
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
   - [ ] Yes, avoid them



##########
superset/utils/azure_blob.py:
##########
@@ -0,0 +1,206 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Utility functions for Azure Blob Storage operations.
+Used to store report/alert attachments in Azure Blob Storage.
+"""
+
+import logging
+from datetime import datetime, timedelta
+from typing import Optional
+
+from flask import current_app
+
+logger = logging.getLogger(__name__)
+
+
+def upload_to_azure_blob(
+    file_data: bytes,
+    file_name: str,
+    content_type: str = "application/octet-stream",
+) -> Optional[str]:
+    """
+    Upload a file to Azure Blob Storage and return the blob URL.
+
+    :param file_data: The file content as bytes
+    :param file_name: The name of the file to store
+    :param content_type: MIME type of the file
+    :return: The URL of the uploaded blob, or None if upload fails
+    """
+    try:
+        # Import azure-storage-blob only when needed to avoid hard dependency
+        from azure.storage.blob import BlobServiceClient, ContentSettings
+
+        # Get Azure Blob Storage configuration from app config
+        connection_string = 
current_app.config.get("AZURE_BLOB_CONNECTION_STRING")
+        container_name = current_app.config.get(
+            "AZURE_BLOB_REPORTS_CONTAINER", "superset-reports"
+        )
+
+        if not connection_string:
+            logger.warning(
+                "Azure Blob Storage is not configured. "
+                "Set AZURE_BLOB_CONNECTION_STRING in config."
+            )
+            return None
+
+        # Create BlobServiceClient
+        blob_service_client = BlobServiceClient.from_connection_string(
+            connection_string
+        )
+
+        # Get container client (create if doesn't exist)
+        container_client = 
blob_service_client.get_container_client(container_name)
+        if not container_client.exists():
+            container_client.create_container()
+            logger.info(f"Created Azure Blob container: {container_name}")
+
+        # Generate unique blob name with timestamp
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
+        blob_name = f"{timestamp}_{file_name}"
+
+        # Upload the blob
+        blob_client = container_client.get_blob_client(blob_name)
+        blob_client.upload_blob(
+            file_data,
+            overwrite=True,
+            content_settings=ContentSettings(content_type=content_type),
+        )
+
+        blob_url = blob_client.url
+        logger.info(f"Successfully uploaded file to Azure Blob: {blob_url}")
+        return blob_url
+
+    except ImportError:
+        logger.error(
+            "azure-storage-blob package is not installed. "
+            "Install it with: pip install azure-storage-blob"
+        )
+        return None
+    except Exception as ex:
+        logger.error(f"Failed to upload file to Azure Blob Storage: {str(ex)}")
+        return None
+
+
+def generate_blob_sas_url(
+    blob_url: str, expiry_hours: int = 168
+) -> Optional[str]:  # 7 days default
+    """
+    Generate a SAS (Shared Access Signature) URL for a blob with read 
permissions.
+    This allows temporary access to the blob without making it public.
+
+    :param blob_url: The URL of the blob
+    :param expiry_hours: Number of hours until the SAS token expires (default 
7 days)
+    :return: The blob URL with SAS token, or original URL if SAS generation 
fails
+    """
+    try:
+        from azure.storage.blob import BlobServiceClient, generate_blob_sas, 
BlobSasPermissions
+        from datetime import timezone
+
+        connection_string = 
current_app.config.get("AZURE_BLOB_CONNECTION_STRING")
+        if not connection_string:
+            logger.warning("Azure Blob Storage connection string not 
configured")
+            return blob_url
+
+        # Parse connection string to get account name and key
+        conn_dict = dict(item.split("=", 1) for item in 
connection_string.split(";") if "=" in item)
+        account_name = conn_dict.get("AccountName")
+        account_key = conn_dict.get("AccountKey")
+
+        if not account_name or not account_key:
+            logger.warning("Could not extract account credentials from 
connection string")
+            return blob_url
+
+        # Extract container and blob name from URL
+        # URL format: 
https://{account}.blob.core.windows.net/{container}/{blob}
+        url_parts = 
blob_url.replace(f"https://{account_name}.blob.core.windows.net/";, 
"").split("/", 1)
+        if len(url_parts) != 2:
+            logger.warning(f"Could not parse blob URL: {blob_url}")
+            return blob_url
+
+        container_name, blob_name = url_parts
+
+        # Generate SAS token
+        sas_token = generate_blob_sas(
+            account_name=account_name,
+            container_name=container_name,
+            blob_name=blob_name,
+            account_key=account_key,
+            permission=BlobSasPermissions(read=True),
+            expiry=datetime.now(timezone.utc) + timedelta(hours=expiry_hours),
+        )
+
+        sas_url = f"{blob_url}?{sas_token}"
+        return sas_url
+
+    except ImportError:
+        logger.error("azure-storage-blob package is not installed")
+        return blob_url
+    except Exception as ex:

Review Comment:
   
   <div>
   
   
   <div id="suggestion">
   <div id="issue"><b>Avoid catching broad exception types</b></div>
   <div id="fix">
   
   Catching `Exception` is too broad and may hide unexpected errors. Consider 
catching specific exception types like `AzureError` or `ClientError` instead.
   </div>
   
   <details>
   <summary>
   <b>Code suggestion</b>
   </summary>
   <blockquote>Check the AI-generated fix before applying</blockquote>
   <div id="code">
   
   
   ````suggestion
       except (ImportError, ValueError, RuntimeError) as ex:
   ````
   
   </div>
   </details>
   
   
   </div>
   
   
   
   
   <small><i>Code Review Run <a 
href=https://github.com/apache/superset/pull/35983#issuecomment-3485329321>#e937ac</a></i></small>
   </div>
   
   ---
   Should Bito avoid suggestions like this for future reviews? (<a 
href=https://alpha.bito.ai/home/ai-agents/review-rules>Manage Rules</a>)
   - [ ] Yes, avoid them



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to