[VOTE] Release Apache Airflow Helm Chart 1.13.0 based on 1.13.0rc1

2024-03-01 Thread Jed Cunningham
Hello Apache Airflow Community,

This is a call for the vote to release Helm Chart version 1.13.0.

The release candidate is available at:
https://dist.apache.org/repos/dist/dev/airflow/helm-chart/1.13.0rc1/

airflow-chart-1.13.0-source.tar.gz - is the "main source release" that
comes with INSTALL instructions.
airflow-1.13.0.tgz - is the binary Helm Chart release.

Public keys are available at: https://www.apache.org/dist/airflow/KEYS

For convenience "index.yaml" has been uploaded (though excluded from
voting), so you can also run the below commands.

helm repo add apache-airflow-dev https://dist.apache.org/repos/dist/dev/airflow/helm-chart/1.13.0rc1/
helm repo update
helm install airflow apache-airflow-dev/airflow

airflow-1.13.0.tgz.prov - is also uploaded for verifying Chart Integrity,
though not strictly required for releasing the artifact based on ASF
Guidelines.

$ helm gpg verify airflow-1.13.0.tgz
gpg: Signature made Fri Mar  1 21:16:51 2024 MST
gpg:                using RSA key E1A1E984F55B8F280BD9CBA20BB7163892A2E48E
gpg:                issuer "jedcunning...@apache.org"
gpg: Good signature from "Jed Cunningham " [ultimate]
plugin: Chart SHA verified. sha256:23155cf90b66c8ec6d49d2060686f90d23329eecf71c5368b1f0b06681b816cc
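
If you do not have the helm gpg plugin installed, the checksum part of the
check can be approximated by hand. The following is only a minimal sketch: it
assumes the digest printed above is that of airflow-1.13.0.tgz (as the plugin
output suggests), and the helper names are made up for illustration. It does
not replace verifying the GPG signature.

```python
import hashlib

# Digest copied from the "Chart SHA verified" line in this email.
EXPECTED_SHA256 = "23155cf90b66c8ec6d49d2060686f90d23329eecf71c5368b1f0b06681b816cc"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True when the file's digest equals the expected one (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

Calling matches_expected("airflow-1.13.0.tgz") on the downloaded chart should
then return True if the file is intact.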

The vote will be open for at least 72 hours (2024-03-05 04:35 UTC) or until
the necessary number of votes is reached.

https://www.timeanddate.com/countdown/to?iso=20240305T0435=136=cursive

Please vote accordingly:

[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove with the reason

Only votes from PMC members are binding, but members of the community are
encouraged to test the release and vote with "(non-binding)".

Consider this my (binding) +1.
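
For anyone unsure how the outcome is counted, here is a rough sketch. It
assumes the usual ASF rule for release votes (at least three binding +1 votes
and more binding +1 than binding -1), which this email does not spell out. The
Vote type and every voter name other than Jed's are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # True only for PMC members


def release_vote_passes(votes, min_binding_plus_ones=3):
    """Apply the assumed ASF rule: only binding votes decide the result."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value > 0)
    minus = sum(1 for v in binding if v.value < 0)
    return plus >= min_binding_plus_ones and plus > minus


votes = [
    Vote("Jed", +1, binding=True),
    Vote("pmc_member_a", +1, binding=True),     # hypothetical voter
    Vote("community_tester", +1, binding=False),  # hypothetical voter
]
```

With the list above, release_vote_passes(votes) is False, because the
non-binding +1 is welcome feedback but does not count towards the three
binding votes.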

For license checks, the .rat-excludes file is included, so you can run the
following to verify licenses (just update your path to rat):

tar -xvf airflow-chart-1.13.0-source.tar.gz
cd airflow-chart-1.13.0
java -jar apache-rat-0.13.jar chart -E .rat-excludes

Please note that the version number excludes the `rcX` string, so it's now
simply 1.13.0. This will allow us to rename the artifact without modifying
the artifact checksums when we actually release it.

The status of testing the Helm Chart by the community is kept here:
https://github.com/apache/airflow/issues/37844

Thanks,
Jed


RE: Bad mixing of decorated and classic operators (users shooting themselves in their foot)

2024-03-01 Thread Blain David
It's certainly possible to check from where a python method is being called 
using traceback.

I do think prohibiting manual calls to an operator's execute method would be
a good idea. I've also come across this in multiple DAGs; it is ugly and
looks like a hack.
Maybe in the beginning we could just print a warning message, and then raise
an exception instead once a minor or major release occurs.
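
To make the "warning first" idea concrete, here is a small standalone sketch.
It does not touch Airflow at all: warn_if_called_directly, DemoOperator,
run_task and user_code are names invented for this illustration, and the
caller check relies only on traceback.extract_stack() from the standard
library.

```python
import traceback
import warnings
from functools import wraps


def warn_if_called_directly(allowed_caller):
    """Warn when the wrapped function is called from anywhere but allowed_caller."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # The last stack entry is this wrapper; the one before it is the caller.
            caller = traceback.extract_stack()[-2]
            if caller.name != allowed_caller:
                warnings.warn(
                    f"{func.__name__} was called from {caller.name}, "
                    f"expected {allowed_caller}",
                    UserWarning,
                    stacklevel=2,
                )
            # Warning phase only: the call still goes through.
            return func(*args, **kwargs)
        return wrapper
    return decorator


class DemoOperator:
    @warn_if_called_directly("run_task")
    def execute(self):
        return "done"


def run_task(op):
    # Stands in for the framework code that is allowed to call execute.
    return op.execute()


def user_code(op):
    # Stands in for a manual call made from DAG code.
    return op.execute()
```

Here run_task(DemoOperator()) stays silent, while user_code(DemoOperator())
emits a UserWarning but still returns the result. Switching to an exception
later would only mean replacing the warnings.warn call with a raise.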

We could refactor the BaseOperatorMeta class so we check from where the execute 
method is being called.

Maybe there are other ways to achieve the same but this is one of the possible 
solutions I came up with.

Below is an example of how it could be done, together with a unit test to
verify its behaviour:

import traceback
from abc import ABCMeta
from typing import Any
from unittest import TestCase
from unittest.mock import Mock, patch, MagicMock

import pendulum
from airflow import AirflowException, DAG, settings
from airflow.models import TaskInstance, DagRun
from airflow.models.baseoperator import BaseOperatorMeta, BaseOperator
from airflow.utils.context import Context
from airflow.utils.state import DagRunState
from assertpy import assert_that
from mockito import mock, ANY, when, KWARGS
from sqlalchemy.orm import Session, Query


def executor_safeguard():
    def decorator(func):
        def wrapper(*args, **kwargs):
            # TODO: here we would need some kind of switch to detect if we are
            # in testing mode or not, as we want to be able to execute the
            # execute method of the operators directly from within unit tests
            # Get the caller frame, excluding the current frame
            caller_frame = traceback.extract_stack()[-2]
            if caller_frame.name == "_execute_task" and "taskinstance" in caller_frame.filename:
                return func(*args, **kwargs)
            raise AirflowException(f"Method {func.__name__} cannot be called from {caller_frame.name}")
        return wrapper
    return decorator


def patched_base_operator_meta_new(cls, clsname, bases, clsdict):
    execute_method = clsdict.get("execute")
    if callable(execute_method) and not getattr(execute_method, '__isabstractmethod__', False):
        clsdict["execute"] = executor_safeguard()(execute_method)
    return ABCMeta.__new__(cls, clsname, bases, clsdict)


# for demo purposes, I'm patching the __new__ magic method of BaseOperatorMeta
BaseOperatorMeta.__new__ = patched_base_operator_meta_new


class HelloWorldOperator(BaseOperator):
    called = False

    def execute(self, context: Context) -> Any:
        HelloWorldOperator.called = True
        return f"Hello {self.owner}!"


class IterableSession(Session):
    def __next__(self):
        pass


class ExecutorSafeguardTestCase(TestCase):

    def test_executor_safeguard_when_unauthorized(self):
        with self.assertRaises(AirflowException):
            dag = DAG(dag_id="hello_world")
            context = mock(spec=Context)

            HelloWorldOperator(task_id="task_id", dag=dag).execute(context=context)

    @patch("sqlalchemy.orm.Session.__init__")
    def test_executor_safeguard_when_authorized(self, mock_session: MagicMock):
        query = mock(spec=Query)
        when(query).filter_by(**KWARGS).thenReturn(query)
        when(query).filter(ANY).thenReturn(query)
        when(query).scalar().thenAnswer(lambda: "dag_run_id")
        when(query).delete()
        session = mock(spec=IterableSession)
        when(session).query(ANY).thenReturn(query)
        when(session).scalar(ANY)
        when(session).__iter__().thenAnswer(lambda: iter({}))
        when(session).commit()
        when(session).close()
        when(session).execute(ANY)
        when(session).add(ANY)
        when(session).flush()
        when(settings).Session().thenReturn(session)

        mock_session.return_value = session

        dag = DAG(dag_id="hello_world")
        TaskInstance.get_task_instance = Mock(return_value=None)
        when(TaskInstance).get_task_instance(
            dag_id="hello_world",
            task_id="hello_operator",
            run_id="run_id",
            map_index=-1,
            select_columns=True,
            lock_for_update=False,
            session=session,
        ).thenReturn(None)

        operator = HelloWorldOperator(task_id="hello_operator", dag=dag)

        assert_that(operator.called).is_false()

        task_instance = TaskInstance(task=operator, run_id="run_id")
        task_instance.task_id = "hello_operator"
        task_instance.dag_id = "hello_world"
        task_instance.dag_run = DagRun(
            run_id="run_id",
            dag_id="hello_world",
            execution_date=pendulum.now(),
            state=DagRunState.RUNNING,
        )
        task_instance._run_raw_task(test_mode=True, session=session)

        assert_that(operator.called).is_true()
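
Regarding the TODO about a testing-mode switch, one possible shape is an
opt-out flag read from the environment. This is only a sketch under
assumptions: the variable name EXECUTE_SAFEGUARD_DISABLED, the safeguard
decorator and the Demo class are all hypothetical, and it raises a plain
RuntimeError so that it runs without Airflow installed.

```python
import os
import traceback
from functools import wraps

# Hypothetical variable name, chosen for this sketch only.
BYPASS_ENV_VAR = "EXECUTE_SAFEGUARD_DISABLED"


def safeguard(allowed_caller):
    """Reject calls from unexpected callers unless the bypass flag is set."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Unit tests can set the flag to call the method directly.
            if os.environ.get(BYPASS_ENV_VAR) == "1":
                return func(*args, **kwargs)
            # The last stack entry is this wrapper; the one before it is the caller.
            caller = traceback.extract_stack()[-2]
            if caller.name == allowed_caller:
                return func(*args, **kwargs)
            raise RuntimeError(f"Method {func.__name__} cannot be called from {caller.name}")
        return wrapper
    return decorator


class Demo:
    @safeguard("trusted_runner")
    def execute(self):
        return "ok"


def trusted_runner(op):
    return op.execute()


def direct_call(op):
    return op.execute()
```

A test suite could then set the variable once (for example with
unittest.mock.patch.dict on os.environ) instead of mocking the whole task
instance machinery. Whether an environment variable, a config option or
something else is the right switch is open for discussion.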

Maybe start a pull request for this one?  What do you guys think?

Kind regards,
David

-Original Message-
From: Andrey Anshin 
Sent: Tuesday, 27 February 2024 14:36
To: dev@airflow.apache.org

[ANNOUNCE] Apache Airflow Newsletter - February 2024

2024-03-01 Thread Briana Okyere
Hey All,

The February Issue of the Apache Airflow Newsletter is out! Please read it
here 

Or subscribe here to get it directly in your inbox:
<https://apache.us14.list-manage.com/subscribe?u=fe7ef7a8dbb32933f30a10466=65cb5665fa>

-- 
Briana Okyere
Community Manager
Astronomer


Assistance Needed: Community Over Code Connections

2024-03-01 Thread Brian Proffitt
Good day,

As we approach the season of ASF events, we are seeking assistance from
some Apache PMC committers to discover contact information for the various
organizations that use and contribute to those projects, in the hopes of
partnering with them on event sponsorship. This will help make Community
Over Code successful and obtain recognition for these organizations' open
source work.

Specifically, we would like to reach out to marketing leaders or associates
who work for these companies, who are known to be Superset users and
participants:

Airbnb
Applied Materials
Slack
The Walt Disney Company
Walmart
Zoom

If you work for one of these organizations, or have connections within
them, it would be greatly appreciated if you could discover
contact information for marketing leaders, specifically those who work
around event sponsorship and/or event management.

There is no need to make a pitch to these people; we can do that. If you
pass along the contact information to me, our Conferences team can take
care of the rest.

Thank you in advance for your help!
BKP

Brian Proffitt
VP, Marketing & Publicity
VP, Conferences