[GitHub] [airflow] vchiapaikeo commented on issue #28343: BigQueryColumnCheckOperator doesn't actually implement use_legacy_sql kwarg

2023-01-08 Thread GitBox


vchiapaikeo commented on issue #28343:
URL: https://github.com/apache/airflow/issues/28343#issuecomment-1375006740

   Created this PR to fix the runtime / type error: 
https://github.com/apache/airflow/pull/28796


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [airflow] vchiapaikeo commented on issue #28343: BigQueryColumnCheckOperator doesn't actually implement use_legacy_sql kwarg

2023-01-06 Thread GitBox


vchiapaikeo commented on issue #28343:
URL: https://github.com/apache/airflow/issues/28343#issuecomment-1374350497

   Not entirely related but there seems to be a bug on main here:
   
   ```py
   failed_tests(
       f"Column: {col}\n\tCheck: {check},\n\tCheck Values: {check_values}\n"
       for col, checks in self.column_mapping.items()
       for check, check_values in checks.items()
       if not check_values["success"]
   )
   ```
   
https://github.com/apache/airflow/blob/main/airflow/providers/google/cloud/operators/bigquery.py#L614-L619
   
   `failed_tests` is a list, so it is not callable. It looks like the call to 
`.extend` was dropped.
   
   ```
   [2023-01-07, 01:46:36 UTC] {taskinstance.py:1797} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/opt/airflow/airflow/providers/google/cloud/operators/bigquery.py", line 616, in execute
       for col, checks in self.column_mapping.items()
   TypeError: 'list' object is not callable
   ```
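
As a rough illustration of the failure in isolation, here is a minimal sketch that runs outside Airflow. The `column_mapping` contents (including the `success` flags) are made-up sample data, not output from the real operator; only the generator expression mirrors the snippet above.

```python
# Hypothetical sample data standing in for self.column_mapping after the
# checks have run; the "success" flags are assumed for illustration.
column_mapping = {
    "col1": {"min": {"greater_than": 0, "success": False}},
    "col2": {"max": {"less_than": 10, "success": True}},
}

failed_tests = []

# Broken form: calling the list itself raises TypeError.
try:
    failed_tests(
        f"Column: {col}\n\tCheck: {check},\n\tCheck Values: {check_values}\n"
        for col, checks in column_mapping.items()
        for check, check_values in checks.items()
        if not check_values["success"]
    )
except TypeError as err:
    error_message = str(err)

# Working form: list.extend consumes the generator and appends each message.
failed_tests.extend(
    f"Column: {col}\n\tCheck: {check},\n\tCheck Values: {check_values}\n"
    for col, checks in column_mapping.items()
    for check, check_values in checks.items()
    if not check_values["success"]
)
```

With this sample data, only the `col1` check ends up in `failed_tests`, since it is the only one marked as unsuccessful.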
   
   This was introduced here (@denimalpaca):
   
https://github.com/apache/airflow/commit/87eb46bbc69c20148773d72e990fbd5d20076342#diff-529929b4ca60ce73b8da0f45d8a5c43c2d4e391b913fe78b39892899f812951eR616-R621
   
   After restoring the previous working code (`failed_tests.extend`), I 
**cannot** reproduce the bug reported in this issue. I added a JSON column to 
my table and, with `use_legacy_sql=True`, I get an exception as expected:
   
   https://user-images.githubusercontent.com/9200263/211126788-b866bc06-1c96-423c-a034-685c2f0ceae1.png
   
   
   And with use_legacy_sql=False, things work as expected:
   
   https://user-images.githubusercontent.com/9200263/211126846-885e0114-695b-40ce-9596-9a71c655595d.png
   
   DAG I used to test:
   
   ```py
   from airflow import DAG

   from airflow.providers.google.cloud.operators.bigquery import BigQueryColumnCheckOperator

   DEFAULT_TASK_ARGS = {
       "owner": "gcp-data-platform",
       "retries": 1,
       "retry_delay": 10,
       "start_date": "2022-08-01",
   }

   with DAG(
       max_active_runs=1,
       concurrency=2,
       catchup=False,
       schedule_interval="@daily",
       dag_id="test_bigquery_column_check",
       default_args=DEFAULT_TASK_ARGS,
   ) as dag:

       basic_column_quality_checks = BigQueryColumnCheckOperator(
           task_id="check_columns",
           table="my-project.vchiapaikeo.test1",
           use_legacy_sql=False,
           column_mapping={
               "col1": {"min": {"greater_than": 0}},
           },
       )
   ```





[GitHub] [airflow] vchiapaikeo commented on issue #28343: BigQueryColumnCheckOperator doesn't actually implement use_legacy_sql kwarg

2023-01-02 Thread GitBox


vchiapaikeo commented on issue #28343:
URL: https://github.com/apache/airflow/issues/28343#issuecomment-1369225227

   I’ll let @VladaZakharova take a stab at it first. I'm going back to work 
tomorrow, so I might not have time for the next few days. Please let me know 
if I can help further though, @VladaZakharova!

