Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi,
I have finally been able to resolve the import issues, and all DAGs are running correctly. These are the steps I followed:
1) I deleted my current virtual environment and reinstalled Apache Airflow:
pip install apache-airflow
2) I installed the latest version of apache-airflow-providers-google, released yesterday:
pip install apache-airflow-providers-google==14.0.0rc1
3) At this point, starting the Airflow scheduler produced the following import errors:
File "/home/iscipar/Projects/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/transfers/s3_to_gcs.py", line 27, in
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
ModuleNotFoundError: No module named 'airflow.providers.amazon'
File "/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/transfers/azure_blob_to_gcs.py", line 28, in
from airflow.providers.microsoft.azure.hooks.wasb import WasbHook
ModuleNotFoundError: No module named 'airflow.providers.microsoft'
4) The exceptions show that the amazon and microsoft-azure providers also need to be installed, so I installed the versions of both providers that were released yesterday as well:
pip install apache-airflow-providers-amazon==9.4.0rc1
pip install apache-airflow-providers-microsoft-azure==12.2.0rc1
After installing these packages, the DAGs work. These RC versions of the providers work for now; I will update to the final versions when they are released.
Regards,
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12285209
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
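After a sequence of installs like the one described above, it can help to confirm which provider distributions actually ended up in the virtual environment. The following is only a minimal sketch using the standard library; the helper name `installed_versions` is made up for illustration and is not part of Airflow.

```python
from importlib import metadata

# Provider distributions mentioned in the comment above.
PROVIDERS = [
    "apache-airflow-providers-google",
    "apache-airflow-providers-amazon",
    "apache-airflow-providers-microsoft-azure",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration; it only reads package metadata
    from the current environment.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PROVIDERS).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it with the interpreter of the virtual environment that the scheduler uses; a missing amazon or microsoft-azure provider would show up as NOT INSTALLED.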
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Thanks for the info.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12253181
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
The bug is in the google provider, not in Airflow core, so you should wait for the next release of the google provider. I think the release process has already started, so a new version with an rc1 suffix should appear [here](https://pypi.org/project/apache-airflow-providers-google/#history) today or tomorrow. You can then install it with pip as you would any other version (e.g. `pip install apache-airflow-providers-google==12.1.0rc1`) and see whether it resolves your problem.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12251930
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hello,
I see that Apache Airflow 2.10.5 has been released, but I can't find a fix for this error in its list of bug fixes. Is the fix already included in 2.10.5, or will it be included in the next version?
https://github.com/apache/airflow/releases/tag/2.10.5
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12251614
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Ok thanks for the help.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12206065
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
As I mentioned in the top comment, the problem `does not relate to python version used`; at least, I saw it on multiple Python versions. So no, IMHO it would not disappear simply by changing the Python version.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12197225
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
A question: if I create a virtual environment with a Python version lower than 3.12, would the error disappear? Or is the error not related to the Python version?
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12193498
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi.
OK, I'll try to recreate the virtual environment and reinstall the packages.
Thanks for the help.
Regards,
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12180748
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
I'm not really sure how this could happen; I was not able to reproduce it. Maybe you made some manual changes to the source code? I created a fresh venv, installed the latest airflow, google provider and openlineage provider from PyPI, and all the errors were gone. The providers release process is described [here](https://github.com/apache/airflow/blob/main/PROVIDERS.rst); I'm not sure when the next release will happen.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12138349
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user Traviscal added a comment to the discussion: TypeError when
importing operators from airflow.providers.google.cloud.operators.dataproc
> ### Apache Airflow Provider(s)
>
> google
>
> ### Versions of Apache Airflow Providers
>
> 12.0.0
>
> ### Apache Airflow version
>
> 2.10.4
>
> ### Operating System
>
> Ubuntu 24.04.1 LTS
>
> ### Deployment
>
> Virtualenv installation
>
> ### Deployment details
>
> _No response_
>
> ### What happened
>
> When importing any of the existing operators in
> airflow.providers.google.cloud.operators.dataproc (for example
> DataprocCreateClusterOperator, DataprocDeleteClusterOperator or
> DataprocSubmitJobOperator) at the beginning of a dag, a dag import error
> occurs when starting the airflow scheduler.
>
> ### What you think should happen instead
>
> When I start the airflow scheduler the following error occurs:
>
> [2025-02-02T19:12:46.029+0100] {logging_mixin.py:190} INFO -
> [2025-02-02T19:12:46.028+0100] {dagbag.py:387} ERROR - Failed to import:
> /home/iscipar/Projects/airflow-tutorial/dags/6_dataproc_airflow.py
> Traceback (most recent call last):
> File
> "/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/models/dagbag.py",
> line 383, in parse
> loader.exec_module(new_module)
> File "", line 995, in exec_module
> File "", line 488, in _call_with_frames_removed
> File "/home/iscipar/Proyectos/airflow-tutorial/dags/6_dataproc_airflow.py",
> line 5, in
> from airflow.providers.google.cloud.operators.dataproc import (
> File
> "/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/operators/dataproc.py",
> line 57, in
> from airflow.providers.google.cloud.openlineage.utils import (
> File
> "/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/openlineage/utils.py",
> line 204, in
> class BigQueryJobRunFacet(RunFacet):
> TypeError: function() argument 'code' must be code, not str
>
> ### How to reproduce
>
> The error is reproduced simply by adding the following import inside the code
> of a dag without importing the source code of the tasks of said dag:
>
> ```
> from airflow.providers.google.cloud.operators.dataproc import (
>     DataprocCreateClusterOperator,
>     DataprocDeleteClusterOperator,
>     DataprocSubmitJobOperator,
> )
> ```
>
> ### Anything else
>
> _No response_
>
> ### Are you willing to submit PR?
>
> - [ ] Yes I am willing to submit a PR!
>
> ### Code of Conduct
>
> - [x] I agree to follow this project's [Code of
> Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
>
Dm @Traviscal
GitHub link:
https://github.com/apache/airflow/discussions/46478#discussioncomment-12130751
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing
operators from airflow.providers.google.cloud.operators.dataproc
Hi,
I have installed the apache-airflow-providers-openlineage package and renamed
the config section to lowercase `openlineage`.
After doing this, the DAGs that import the BigQueryInsertJobOperator still fail
to import, but the exception is now a different one: the name
_BigQueryOpenLineageMixin cannot be found.
I think the best option is perhaps to wait for the next versions of the
apache-airflow packages to be released with all the fixes you have added.
Is there an approximate date for this?
[2025-02-11T00:00:18.542+0100] {logging_mixin.py:190} INFO -
[2025-02-11T00:00:18.539+0100] {dagbag.py:387} ERROR - Failed to import:
/home/iscipar/Projects/airflow-tutorial/dags/2_simple_storage_bigquery.py
Traceback (most recent call last):
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/models/dagbag.py",
line 383, in parse
loader.exec_module(new_module)
File "", line 995, in exec_module
File "", line 488, in _call_with_frames_removed
File
"/home/iscipar/Proyectos/airflow-tutorial/dags/2_simple_storage_bigquery.py",
line 2, in
from airflow.providers.google.cloud.operators.bigquery import
BigQueryInsertJobOperator
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/operators/bigquery.py",
line 47, in
from airflow.providers.google.cloud.openlineage.mixins import
_BigQueryOpenLineageMixin
ImportError: cannot import name '_BigQueryOpenLineageMixin' from
'airflow.providers.google.cloud.openlineage.mixins'
(/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/openlineage/mixins.py)
GitHub link:
https://github.com/apache/airflow/discussions/46478#discussioncomment-12130465
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda edited a comment on the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi, could you share a bit more detail on how you `updated the sources corresponding to the PR`? If you are just manually copying the changes from my PR, you should also include the changes made to the `apache-airflow-providers-common-compat` package since its last release. The (unreleased) google provider that I've added this fix to will **require** the (unreleased) `apache-airflow-providers-common-compat>=1.4.0`, which has some new methods, including `inject_transport_information_into_spark_properties`. I've tested my fix PR with the latest code from the google and common.compat providers and it worked just fine.
FYI, I used [breeze](https://github.com/apache/airflow/blob/main/dev/breeze/doc/README.rst), an awesome tool that helps you develop airflow locally, to generate the wheels for the unreleased providers (using the `breeze release-management prepare-provider-packages` command).
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12115374
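To see whether the installed common.compat provider already ships the function the unreleased google provider needs, one can probe for it without importing the google provider itself. This is a rough sketch, not an official check: the helper `has_attr` is hypothetical, and the module path and function name are taken from the traceback posted in this thread.

```python
import importlib


def has_attr(module_name, attr_name):
    """Return True if the module imports cleanly and exposes attr_name.

    Hypothetical helper for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError as well, e.g. when the provider is absent.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    ok = has_attr(
        "airflow.providers.common.compat.openlineage.utils.spark",
        "inject_transport_information_into_spark_properties",
    )
    print("common.compat has the new function:", ok)
```

A result of False would be consistent with the ImportError reported after manually copying only the google provider changes.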
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi, just to be sure: did you also install the latest `apache-airflow-providers-openlineage` package (or add it to your requirements.txt)? I re-created the environment with the error described in the issue, then installed the package and disabled it with the env variable, and that solved the problem for me. The scheduler is running fine and the DAGs are working.
Please also try naming the section in lowercase: `openlineage`. When I tried to run a task with your config, I got the error `configparser.NoSectionError: No section: 'openlineage'`, and it was gone after I changed the section name to lowercase.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12114980
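The lowercase requirement matches how Python's standard `configparser` treats section names, which are case-sensitive. The sketch below uses the standard library parser only, on the assumption that Airflow's own config parser behaves the same way for section names (the `NoSectionError` quoted above suggests it does). `read_disabled` is a hypothetical helper.

```python
import configparser

# The section name as posted in the thread, and the lowercase variant.
CONFIG_AS_POSTED = "[OpenLineage]\ndisabled = True\n"
CONFIG_LOWERCASE = "[openlineage]\ndisabled = True\n"


def read_disabled(config_text):
    """Look up [openlineage] disabled; return None if the section is missing."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    try:
        return parser.getboolean("openlineage", "disabled")
    except configparser.NoSectionError:
        # "[OpenLineage]" and "[openlineage]" are different sections.
        return None


if __name__ == "__main__":
    print("as posted:", read_disabled(CONFIG_AS_POSTED))
    print("lowercase:", read_disabled(CONFIG_LOWERCASE))
```

Setting the environment variable `AIRFLOW__OPENLINEAGE__DISABLED=True`, as suggested elsewhere in the thread, avoids the section-name question entirely.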
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing
operators from airflow.providers.google.cloud.operators.dataproc
Hello.
I have updated the sources corresponding to the PR in my virtual environment to
check whether the error was resolved, and this was the result:
The error has disappeared in the DAG that imports the dataproc operators, but
the DAGs that import the BigQueryInsertJobOperator have now stopped working and
raise the following new exception.
Can you check whether, as it seems, the fix has affected the import of this
other operator?
[2025-02-08T19:30:17.251+0100] {logging_mixin.py:190} INFO -
[2025-02-08T19:30:17.249+0100] {dagbag.py:387} ERROR - Failed to import:
/home/iscipar/Projects/airflow-tutorial/dags/2_simple_storage_bigquery.py
Traceback (most recent call last):
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/models/dagbag.py",
line 383, in parse
loader.exec_module(new_module)
File "", line 995, in exec_module
File "", line 488, in _call_with_frames_removed
File
"/home/iscipar/Proyectos/airflow-tutorial/dags/2_simple_storage_bigquery.py",
line 2, in
from airflow.providers.google.cloud.operators.bigquery import
BigQueryInsertJobOperator
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/operators/bigquery.py",
line 47, in
from airflow.providers.google.cloud.openlineage.mixins import
_BigQueryOpenLineageMixin
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/openlineage/mixins.py",
line 39, in
from airflow.providers.google.cloud.openlineage.utils import (
File
"/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/google/cloud/openlineage/utils.py",
line 40, in
from airflow.providers.common.compat.openlineage.utils.spark import (
ImportError: cannot import name
'inject_transport_information_into_spark_properties' from
'airflow.providers.common.compat.openlineage.utils.spark'
(/home/iscipar/Proyectos/airflow-tutorial/airflow_env/lib/python3.12/site-packages/airflow/providers/common/compat/openlineage/utils/spark.py)
GitHub link:
https://github.com/apache/airflow/discussions/46478#discussioncomment-12104714
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi,
I have added the following lines to my airflow.cfg, and the same error still occurs when starting the airflow scheduler. Is this perhaps not correct?
```
[OpenLineage]
# Disable sending events without uninstalling the OpenLineage Provider by setting this to true.
#
# Variable: AIRFLOW__OPENLINEAGE__DISABLED
#
disabled = True
```
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12099396
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
PR #46561 should fix this.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12095527
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user kacpermuda added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
I think this one is on me. It does not relate to the Python version used and is most likely caused by putting the import in the top-level code of the dataproc module. Even though it uses the compat provider underneath, I think there is a problem with how we implemented the NoOp in `airflow.providers.common.compat.openlineage.facet`. We implemented it as a no-op **function**, so when we try to base a **class** on it, we get the above error. It surfaced in the latest google provider because this is the first time we are adding an OL-related feature that is NOT contained in the `get_openlineage_facets_on_...` methods (where we are sure that OL is accessible). I'll provide a fix PR shortly, so that we can release it in the next google provider version.
@iscipar For now, the only quick fix I can think of is installing the latest version of the `apache-airflow-providers-openlineage` package together with your google provider. This way the RunFacet class will be available and the import will not fail. Also, if you don't want to use OpenLineage, make sure to [disable it](https://airflow.apache.org/docs/apache-airflow-providers-openlineage/stable/configurations-ref.html#disabled), e.g. by setting the env variable `AIRFLOW__OPENLINEAGE__DISABLED=True`. This way there will be no influence on your code and no additional actions from the OpenLineage provider, and the imports should not fail. Sorry for the trouble; let me know if this fixes the problem.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12093283
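The mechanism described above can be reproduced in plain Python without Airflow. The snippet is an illustrative sketch only: `noop_facet` and `NoopFacet` are hypothetical stand-ins, not the actual compat provider code, and the class-based variant shows one possible shape of a fix rather than the change made in the PR.

```python
def noop_facet(*args, **kwargs):
    """Hypothetical stand-in for a no-op implemented as a function."""
    return None


class NoopFacet:
    """Hypothetical stand-in for a no-op implemented as a class."""

    def __init__(self, *args, **kwargs):
        pass


def try_subclass(base):
    """Attempt to define a class on top of `base` and report the outcome."""
    try:
        class JobRunFacet(base):
            pass
    except TypeError as exc:
        # Python uses type(base) as the metaclass. For a function that is the
        # function type, whose constructor expects a code object, not a name.
        return f"TypeError: {exc}"
    return "ok"


if __name__ == "__main__":
    print("function base:", try_subclass(noop_facet))
    print("class base:   ", try_subclass(NoopFacet))
```

With a function as the base, the class statement fails at definition time, which is why the error appears as soon as the module is imported rather than when an operator runs.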
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user boring-cyborg[bot] added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12072298
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user potiuk added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Converted to a discussion, but maybe @kacpermuda or @mobuchowski might help if you continue to have problems after updating.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12072371
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user potiuk added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
I think you might have an old version of the dependencies. Please use the latest constraints for 2.10.4 to check that you upgraded to the correct versions of the deps, especially the openlineage provider.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12072324
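For reference, Airflow publishes one constraints file per Airflow version and Python version. The sketch below builds the file URL and a matching pip command for 2.10.4. The helper names are hypothetical, and it assumes the URL pattern from the Airflow installation docs still applies.

```python
import sys

# URL pattern from the Airflow installation docs (assumed unchanged).
CONSTRAINTS_URL = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow}/constraints-{python}.txt"
)


def constraints_url(airflow_version, python_version=None):
    """Build the constraints file URL; defaults to the running Python's X.Y."""
    if python_version is None:
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return CONSTRAINTS_URL.format(airflow=airflow_version, python=python_version)


def pip_command(airflow_version, python_version=None):
    """Return a pip invocation (as an argument list) that applies the constraints."""
    return [
        "pip",
        "install",
        f"apache-airflow=={airflow_version}",
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    print(" ".join(pip_command("2.10.4", "3.12")))
```

Provider packages can be appended to the same command so that pip resolves them against the same constraints.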
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user iscipar added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
Hi.
This is my first time opening an issue in this community. I see that there are two solutions:
- downgrade my python version
- update the package with the fix from the pull request
As I prefer the second option, is a new version of the package with the fix available now, or do I have to wait? What pip install command would I have to run?
Thanks,
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12072301
Re: [D] TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc [airflow]
GitHub user mohamedmeqlad99 added a comment to the discussion: TypeError when importing operators from airflow.providers.google.cloud.operators.dataproc
I think that Airflow's Google provider package is using code that is incompatible with Python 3.12. This is likely due to changes in Python 3.12 affecting the exec() function or types.CodeType.
I suggest downgrading to Python 3.10 or 3.11.
GitHub link: https://github.com/apache/airflow/discussions/46478#discussioncomment-12072300
