Copilot commented on code in PR #57610:
URL: https://github.com/apache/airflow/pull/57610#discussion_r2761636653


##########
pyproject.toml:
##########
@@ -438,6 +441,7 @@ packages = []
     "apache-airflow-providers-http>=4.13.2",
     "apache-airflow-providers-imap>=3.8.0",
     "apache-airflow-providers-influxdb>=2.8.0",
+    "apache-airflow-providers-informatica",
     "apache-airflow-providers-jdbc>=4.5.2",

Review Comment:
   In the global providers dependency list, 
`apache-airflow-providers-informatica` is added without a minimum version, 
while most other providers are pinned with `>=...`. Consider pinning it 
consistently here as well (e.g. `>=0.1.0`) to avoid resolving an unexpected 
older version once it exists on PyPI.
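
   A minimal sketch of how such unpinned entries could be spotted 
automatically. The helper name `unpinned_providers` and the regular expression 
are illustrative assumptions, not part of the Airflow tooling.

```python
import re

# Matches a requirement that carries at least one version specifier,
# e.g. "pkg>=1.0" or "pkg[extra]==2.0". A bare "pkg" does not match.
_HAS_SPECIFIER = re.compile(r"^[A-Za-z0-9._-]+\s*(\[.*\])?\s*(>=|<=|==|~=|!=|>|<)")


def unpinned_providers(requirements):
    """Return provider requirements that declare no version specifier."""
    return [
        req
        for req in requirements
        if req.startswith("apache-airflow-providers-") and not _HAS_SPECIFIER.match(req)
    ]


# Entries copied from the hunk above.
deps = [
    "apache-airflow-providers-influxdb>=2.8.0",
    "apache-airflow-providers-informatica",
    "apache-airflow-providers-jdbc>=4.5.2",
]
print(unpinned_providers(deps))
```

   Run against the three entries from the hunk, only the new `informatica` 
line is reported.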



##########
providers/informatica/docs/guides/configuration.rst:
##########
@@ -0,0 +1,74 @@
+
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Configuration
+=============
+
+This section describes how to configure the Informatica provider for Apache Airflow.
+
+Connection Setup
+----------------
+
+Create an HTTP connection in Airflow for Informatica EDC:
+
+1. **Connection Type**: ``http``
+2. **Host**: Your EDC server hostname
+3. **Port**: EDC server port (typically 9087)
+4. **Schema**: ``https`` or ``http``
+5. **Login**: EDC username

Review Comment:
   The docs instruct users to create an Airflow connection with type `http`, 
but this provider registers a dedicated connection type `informatica_edc` (see 
provider.yaml) and the hook’s `conn_type` is also `informatica_edc`. This 
inconsistency will confuse users and may prevent the UI from showing the right 
fields/hook mapping. Update the docs to reference `informatica_edc` (or, 
alternatively, change the provider’s registered connection type to `http`, but 
then provider.yaml/get_provider_info.py should match).
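
   One way to catch this kind of drift is a small check that compares the 
connection type named in the docs with the registered one. The sketch below 
assumes the docs keep the ``**Connection Type**: ``...`` `` list-item format 
shown in the hunk; the function names are hypothetical.

```python
import re

# Pattern for list items such as: 1. **Connection Type**: ``http``
_CONN_TYPE_ITEM = re.compile(r"\*\*Connection Type\*\*:\s*``([^`]+)``")


def documented_conn_types(rst_text):
    """Collect every connection type named in a 'Connection Type' list item."""
    return set(_CONN_TYPE_ITEM.findall(rst_text))


def conn_type_mismatches(rst_text, registered_type):
    """Return documented connection types that differ from the registered one."""
    return sorted(documented_conn_types(rst_text) - {registered_type})


docs = """
1. **Connection Type**: ``http``
2. **Host**: Your EDC server hostname
"""
print(conn_type_mismatches(docs, "informatica_edc"))
```

   With the current text the check reports `http`; after the docs are updated 
to `informatica_edc` it returns an empty list.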



##########
providers/informatica/tests/unit/informatica/extractors/test_informatica.py:
##########
@@ -0,0 +1,71 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""
+Unit tests for InformaticaLineageExtractor covering all methods.
+"""
+
+from __future__ import annotations
+
+from unittest.mock import patch
+
+import pytest
+
+from airflow.models import Connection
+from airflow.providers.informatica.extractors.informatica import InformaticaLineageExtractor
+from airflow.providers.informatica.hooks.edc import InformaticaEDCHook
+
+
+@pytest.fixture
+def extractor():
+    informatica_hook = InformaticaEDCHook(informatica_edc_conn_id="test_conn")
+    return InformaticaLineageExtractor(edc_hook=informatica_hook)
+
+
+@pytest.fixture(autouse=True)
+def setup_connections(create_connection_without_db):
+    create_connection_without_db(
+        Connection(
+            conn_id="test_conn",
+            conn_type="http",
+            host="testhost",
+            schema="https",

Review Comment:
   The unit test fixture creates the connection with `conn_type="http"`, but 
this provider registers a dedicated connection type `informatica_edc` 
(provider.yaml/get_provider_info.py) and the hook advertises `conn_type = 
"informatica_edc"`. Using the provider-specific conn_type in tests will better 
reflect real-world configuration and catch any conn-type-specific behavior in 
the UI/metadata.



##########
pyproject.toml:
##########
@@ -246,6 +246,9 @@ packages = []
 "influxdb" = [
     "apache-airflow-providers-influxdb>=2.8.0"
 ]
+"informatica" = [
+    "apache-airflow-providers-informatica"
+]

Review Comment:
   The new `informatica` extra does not pin a minimum provider version, while 
the surrounding extras do (e.g. `influxdb` uses `>=2.8.0`). To keep extras 
consistent and avoid pulling pre-0.1.0 builds once they exist, specify a 
minimum version such as `apache-airflow-providers-informatica>=0.1.0`.



##########
providers/informatica/src/airflow/providers/informatica/get_provider_info.py:
##########
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!
+#
+# IF YOU WANT TO MODIFY THIS FILE, YOU SHOULD MODIFY THE TEMPLATE
+# `get_provider_info_TEMPLATE.py.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY
+
+
+def get_provider_info():
+    return {
+        "package-name": "apache-airflow-providers-informatica",
+        "name": "Informatica Airflow",
+        "description": "`Informatica <https://www.informatica.com//>`__\n",
+        "integrations": [
+            {
+                "integration-name": "Informatica",
+                "external-doc-url": "https://www.informatica.com/",
+                "logo": "/docs/integration-logos/informatica.png",
+                "tags": ["protocol"],
+            }
+        ],
+        "hooks": [
+            {"integration-name": "Informatica", "python-modules": ["airflow.providers.informatica.hooks.edc"]}
+        ],
+        "connection-types": [
+            {
+                "hook-class-name": "airflow.providers.informatica.hooks.edc.InformaticaEDCHook",
+                "connection-type": "informatica_edc",
+            }
+        ],
+        "plugins": [
+            {
+                "name": "informatica",
+                "plugin-class": "airflow.providers.informatica.plugins.InformaticaProviderPlugin",
+            }
+        ],
+        "config": {
+            "informatica": {
+                "description": "This section applies settings for Informatica integration.\nMore about configuration and its precedence can be found in the `usage's guide\n<https://airflow.apache.org/docs/apache-airflow-providers-informatica/stable/guides/usage.html#transport-setup>`_.\n",
+                "options": {
+                    "disabled": {
+                        "description": "Disable sending events without uninstalling the Informatica Provider by setting this to true.\n",
+                        "type": "boolean",
+                        "example": None,
+                        "default": "False",
+                        "version_added": None,
+                    },
+                    "default_conn_id": {
+                        "description": "The default connection ID to use for Informatica operations.\n",
+                        "type": "string",
+                        "example": "informatica_edc_default",
+                        "default": "",
+                        "version_added": None,
+                    },

Review Comment:
   `get_provider_info.py` reports `default_conn_id` default as an empty string, 
which conflicts with the hook’s fallback (`informatica_edc_default`) and the 
docs. This file is generated from provider metadata; update the underlying 
provider.yaml/template so the generated default matches the real default 
connection id.



##########
providers/informatica/docs/index.rst:
##########
@@ -0,0 +1,181 @@
+
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+========================================
+``apache-airflow-providers-informatica``
+========================================
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+   :caption: Basics
+
+   Home <self>
+   Security <security>
+   Changelog <changelog>
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+   :caption: Guides
+
+   Usage <guides/usage>
+   API Reference <guides/api>
+   Configuration <guides/configuration>
+
+.. toctree::
+   :hidden:
+   :maxdepth: 1
+   :caption: References
+
+   Configuration <configurations-ref>
+   Python API <_api/airflow/providers/informatica/index>
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: Resources
+
+    PyPI Repository <https://pypi.org/project/apache-airflow-providers-informatica/>
+    Installing from sources <installing-providers-from-sources>
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: System tests
+
+    System Tests <_api/tests/system/informatica/index>
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: Commits
+
+    Detailed list of commits <commits>
+
+Apache Airflow Informatica Provider
+===================================
+
+**Note:** This provider is not officially maintained or endorsed by Informatica. It is a community-developed integration for Apache Airflow.
+
+The Informatica provider integrates Apache Airflow with Informatica Enterprise Data Catalog (EDC) for advanced data lineage tracking and asset discovery.
+
+Overview
+--------
+
+This provider enables automatic lineage extraction and tracking between Airflow tasks and Informatica EDC catalog objects. When tasks define inlets and outlets with EDC object URIs, the provider automatically:
+
+- Resolves object identifiers using EDC API
+- Creates lineage relationships between source and target objects
+- Integrates with Airflow's native lineage system
+
+Installation
+------------
+
+You can install this package on top of an existing Airflow installation via
+``pip install apache-airflow-providers-informatica``.
+For the minimum Airflow version supported, see ``Requirements`` below.
+
+
+Requirements
+------------
+
+The minimum Apache Airflow version supported by this provider distribution is ``3.0.0``.
+
+==========================================  ==================
+PIP package                                 Version required
+==========================================  ==================
+``apache-airflow``                          ``>=3.0.0``
+``apache-airflow-providers-common-compat``  ``>=1.12.0``
+``apache-airflow-providers-http``           ``>=1.0.0``
+``attrs``                                   ``>=22.2``
+==========================================  ==================

Review Comment:
   The Requirements table lists `attrs>=22.2`, but this provider’s 
`pyproject.toml` does not declare `attrs` as a dependency and the code shown 
here doesn’t use it. Please remove `attrs` from the table (or add it as an 
explicit dependency if it’s actually required) to keep docs accurate.
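
   A rough sketch of how the Requirements table could be cross-checked against 
the declared dependencies. It assumes the two-column RST simple-table layout 
shown above with the leading diff `+` already stripped; all function names are 
hypothetical.

```python
import re

# A table row looks like: ``package-name``    ``>=1.2.3``
_ROW = re.compile(r"^``([^`]+)``\s+``([^`]+)``$")


def documented_packages(table_text):
    """Return package names listed in the first column of the table."""
    names = []
    for line in table_text.splitlines():
        match = _ROW.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def declared_packages(dependencies):
    """Strip version specifiers from requirement strings, keeping only names."""
    return {re.split(r"[<>=!~\[ ]", dep, maxsplit=1)[0] for dep in dependencies}


def undeclared_in_docs(table_text, dependencies):
    """Return documented packages that the dependency list does not declare."""
    declared = declared_packages(dependencies)
    return [name for name in documented_packages(table_text) if name not in declared]


table = """
==========================================  ==================
PIP package                                 Version required
==========================================  ==================
``apache-airflow``                          ``>=3.0.0``
``apache-airflow-providers-common-compat``  ``>=1.12.0``
``apache-airflow-providers-http``           ``>=1.0.0``
``attrs``                                   ``>=22.2``
==========================================  ==================
"""
deps = [
    "apache-airflow>=3.0.0",
    "apache-airflow-providers-common-compat>=1.12.0",
    "apache-airflow-providers-http>=1.0.0",
]
print(undeclared_in_docs(table, deps))
```

   For the table and dependency list in this PR, the only name reported is 
`attrs`.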



##########
providers/informatica/README.rst:
##########
@@ -0,0 +1,183 @@
+
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Apache Airflow Informatica Provider
+===================================
+
+This provider package contains integrations for `Informatica Enterprise Data Catalog (EDC) <https://www.informatica.com/products/data-governance/enterprise-data-catalog.html>`_ to work with Apache Airflow.
+
+
+Features
+--------
+
+- **Airflow Integration**: Seamless integration with Airflow's lineage system using inlets and outlets.
+
+
+Installation
+------------
+
+.. code-block:: bash
+
+    pip install apache-airflow-providers-informatica
+
+
+
+Connection Setup
+~~~~~~~~~~~~~~~~
+
+Create an Informatica EDC connection in Airflow:
+
+    #. **Connection Type**: ``http``
+    #. **Host**: Your EDC server hostname
+    #. **Port**: EDC server port (typically 9087)
+    #. **Schema**: ``https`` or ``http``
+    #. **Login**: EDC username

Review Comment:
   README says to create a connection with type `http`, but the provider 
registers `informatica_edc` as its connection type (and the hook uses 
`conn_type = "informatica_edc"`). Please align the README with the provider 
metadata so users configure the correct connection type in the UI.



##########
providers/informatica/provider.yaml:
##########
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-informatica
+name: Informatica Airflow
+description: |
+  `Informatica <https://www.informatica.com//>`__
+
+state: ready
+source-date-epoch: 1758787152
+# Note that those versions are maintained by release manager - do not update them manually
+# with the exception of case where other provider in sources has >= new provider version.
+# In such case adding >= NEW_VERSION and bumping to NEW_VERSION in a provider have
+# to be done in the same PR
+versions:
+  - 0.1.0
+
+integrations:
+  - integration-name: Informatica
+    external-doc-url: https://www.informatica.com/
+    logo: /docs/integration-logos/informatica.png
+    tags: [protocol]
+
+hooks:
+  - integration-name: Informatica
+    python-modules:
+      - airflow.providers.informatica.hooks.edc
+
+connection-types:
+  - hook-class-name: airflow.providers.informatica.hooks.edc.InformaticaEDCHook
+    connection-type: informatica_edc
+
+plugins:
+  - name: informatica
+    plugin-class: airflow.providers.informatica.plugins.InformaticaProviderPlugin
+
+config:
+  informatica:
+    description: |
+      This section applies settings for Informatica integration.
+      More about configuration and its precedence can be found in the `usage's guide
+      <https://airflow.apache.org/docs/apache-airflow-providers-informatica/stable/guides/usage.html#transport-setup>`_.
+
+    options:
+      disabled:
+        description: |
+          Disable sending events without uninstalling the Informatica Provider by setting this to true.
+        type: boolean
+        example: ~
+        default: "False"
+        version_added: ~
+      default_conn_id:
+        description: |
+          The default connection ID to use for Informatica operations.
+        type: string
+        example: "informatica_edc_default"
+        default: ""
+        version_added: ~

Review Comment:
   The provider metadata declares the `default_conn_id` option default as an 
empty string, but the hook falls back to `informatica_edc_default` and the 
docs/examples use that value. Setting the metadata default to the actual 
default (`informatica_edc_default`) will keep generated docs/config references 
consistent.
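
   To illustrate why the empty default is misleading, here is a small sketch. 
It assumes the hook resolves its connection id with a simple "configured value 
or fallback" rule; the actual hook code is not shown in this hunk, so both 
helper names and that rule are assumptions.

```python
# Fallback id named in the review comment above.
HOOK_FALLBACK_CONN_ID = "informatica_edc_default"


def effective_conn_id(configured_default):
    """Mimic a hook that falls back when the configured default is empty."""
    return configured_default or HOOK_FALLBACK_CONN_ID


def metadata_matches_runtime(metadata_default):
    """True when the documented default equals the id used at runtime."""
    return metadata_default == effective_conn_id(metadata_default)


print(effective_conn_id(""))
print(metadata_matches_runtime(""))
print(metadata_matches_runtime("informatica_edc_default"))
```

   Under that assumption, a metadata default of `""` never matches the id 
actually used, while `informatica_edc_default` does.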



##########
providers/informatica/pyproject.toml:
##########
@@ -0,0 +1,125 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE OVERWRITTEN!
+
+# IF YOU WANT TO MODIFY THIS FILE EXCEPT DEPENDENCIES, YOU SHOULD MODIFY THE TEMPLATE
+# `pyproject_TEMPLATE.toml.jinja2` IN the `dev/breeze/src/airflow_breeze/templates` DIRECTORY
+[build-system]
+requires = ["flit_core==3.12.0"]
+build-backend = "flit_core.buildapi"
+
+[project]
+name = "apache-airflow-providers-informatica"
+version = "0.1.0"
+description = "Provider package apache-airflow-providers-informatica for Apache Airflow"
+readme = "README.rst"
+license = "Apache-2.0"
+license-files = ['LICENSE', 'NOTICE']
+authors = [
+    {name="Apache Software Foundation", email="dev@airflow.apache.org"},
+]
+maintainers = [
+    {name="Apache Software Foundation", email="dev@airflow.apache.org"},
+]
+keywords = [ "airflow-provider", "informatica", "airflow", "integration" ]
+classifiers = [
+    "Development Status :: 5 - Production/Stable",
+    "Environment :: Console",
+    "Environment :: Web Environment",
+    "Intended Audience :: Developers",
+    "Intended Audience :: System Administrators",
+    "Framework :: Apache Airflow",
+    "Framework :: Apache Airflow :: Provider",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Programming Language :: Python :: 3.13",
+    "Topic :: System :: Monitoring",
+]
+requires-python = ">=3.10"
+
+# The dependencies should be modified in place in the generated file.
+# Any change in the dependencies is preserved when the file is regenerated
+# Make sure to run ``prek update-providers-dependencies --all-files``
+# After you modify the dependencies, and rebuild your Breeze CI image with ``breeze ci-image build``
+dependencies = [
+    "apache-airflow>=3.0.0",
+    "apache-airflow-providers-common-compat>=1.12.0",
+    "apache-airflow-providers-http>=1.0.0"
+]

Review Comment:
   `apache-airflow-providers-http>=1.0.0` is far below the minimum HTTP 
provider version used by this repo for the `apache-airflow[http]` extra (root 
pyproject.toml requires `apache-airflow-providers-http>=4.13.2`). With 
`apache-airflow>=3.0.0`, allowing very old http provider versions risks 
installing an incompatible HttpHook implementation. Please bump the minimum 
http provider requirement to at least the repo’s declared minimum (and ideally 
to the minimum version known compatible with Airflow 3).
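
   A minimal sketch of comparing the two lower bounds, using only the standard 
library. The helpers `minimum_version` and `is_below_floor` are hypothetical, 
and the parsing only handles plain numeric `>=X.Y.Z` bounds; a real check would 
more likely rely on a proper requirement parser.

```python
import re


def minimum_version(requirement):
    """Extract the '>=' lower bound of a requirement as a tuple of ints."""
    match = re.search(r">=\s*(\d+(?:\.\d+)*)", requirement)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_below_floor(requirement, floor_requirement):
    """True when the requirement's lower bound is missing or older than the floor."""
    floor = minimum_version(floor_requirement)
    if floor is None:
        return False  # no floor to compare against
    lower = minimum_version(requirement)
    return lower is None or lower < floor


provider_dep = "apache-airflow-providers-http>=1.0.0"  # from the provider's pyproject.toml
repo_floor = "apache-airflow-providers-http>=4.13.2"  # from the root pyproject.toml
print(is_below_floor(provider_dep, repo_floor))
```

   For the two requirement strings quoted in this comment, the provider's 
bound is flagged as below the repository floor.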



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
