Hello everyone,

TL;DR: This is a long one, so I know it will take time to digest and
answer, but I would like to start a discussion about how vulnerabilities in
our dependencies can bubble up to us and how we should approach handling
them. I would love to hear your opinions on the issue and on a proposal I
have.

*A bit of context:*

In Airflow we have 670+ (!) dependencies (including transitive ones!). We
have a pretty sophisticated process of automated updates and we are working
on robust SBOM generation and publishing (work paid for by the Sovereign
Tech Fund funding some of us received, as we announced last week).
But there is one thing that we would also like to work on in the future:
VEX (Vulnerability Exploitability eXchange) and, more generally, the way we
react to publicly known security issues in our dependencies.

I think we should agree on a common approach in ASF because I believe it
might have some legal and "public facing" consequences. It might also have
an impact on future CRA processes and the like.

Quite recently we started to receive more and more requests from our users
asking us "what are you going to do about this or that public vulnerability
that is already announced and fixed? Are we affected or not?". And
frankly, I think we have no good understanding of what we as PMC members
could (and should!) do in many cases. Some cases are easy: we can just
follow the advisories released by the dependency (and in many cases this
happens automatically via dependabot etc.). But there are cases where we
cannot do that, mostly because other dependencies of ours block us from
upgrading.

*Examples:*

Just to give two recent examples that we got asked by our users (via
different channels - some via security@, some by private conversations):

1) CVE-2023-47248  - pyarrow vulnerability
https://nvd.nist.gov/vuln/detail/CVE-2023-47248
2) CVE-2023-46136 - Werkzeug vulnerability
https://nvd.nist.gov/vuln/detail/CVE-2023-46136

In case 1), we rely on Apache Beam (another dependency of ours), which
blocks the PyArrow upgrade, so we cannot just "update". But this case is
easy, because they are also committed to fixing it in their next release
and there is a workaround available, so this is as simple as adding the
workaround/hotfix and scheduling it for our next version - which we just did
https://github.com/apache/airflow/pull/35650 - and we will automatically
upgrade once Beam releases the version that "frees" the PyArrow upgrade -
life is good, unicorns and rainbows, everyone is happy. This is easy to
handle within our current work, can be easily done by volunteers and does
not require a lot of effort (and with automation most of it happens as
chores - sometimes without us even being engaged).

*Problem:*

Case 2) is quite a bit more complex, because the fix depends on much more
involved work (upgrading connexion, another dependency of ours). We
already have an issue for it, https://github.com/apache/airflow/pull/34366,
but it requires a lot of effort, we believe it is not urgent, and we are
waiting for someone to volunteer and do the job.

We have no particular incentive to do it "fast" and we also THINK (but
cannot be 100% sure) that this particular vulnerability does not affect us.
From a quick analysis and a look at the scenarios in the CVE, our
assessment is "we **think** we are not affected". But, as usual with
security stuff, we cannot be sure. Maybe there is a scenario we are
completely unaware of in which this CVE could be exploited. And we will not
know for sure unless someone performs a deeper audit and analysis and
gives us enough confidence to say "yeah, we are not affected".

However, our users have already started to pressure us and have almost
threatened to start "Opening public issues with the CVE clearly stated as
an issue to solve". The problem is that those users have their own
(commercial) users who demand answers from them ("are we safe?"), and (as
entirely expected) they will come to us and demand that we either a) fix it
or b) tell them authoritatively that they are not affected.

*What can we do about it as PMC?*

If we know that we are affected, that's entirely clear - we know the
scenario, we know it affects us, we create our own CVE, we fix it (in
whatever way we need) and publish - even if it takes a lot of effort, this
is what - eventually - we are supposed to do. But the problem in this case
is - we do not know if we are affected and just finding it out is difficult
and costly and requires somewhat of a researcher mindset, time and effort
to try to find an exploitation scenario. We are certainly not ready and
willing to take on that task.

In the case above - if someone volunteers to upgrade connexion, this is
cool, we will support, review, and help to make the contribution happen -
but it's not "expected" from the PMC and it's not "urgent" in any way. It
happens at its own pace, and likely there will be volunteers (maybe even
among the PMC or committers) who will eventually do that following the
"usual" incentives they have.

So if someone asks - we could potentially say "We are fine and not affected
- this issue waits to be implemented". But there is one problem - can we
really openly, publicly say "we are not affected"? Should we officially
state it somewhere? What happens if the users WILL open an issue about it?
Should we, as the PMC, authoritatively say "It's OK - you are not affected"?

*Legal problem*

I think (and I would look for advice here) we cannot officially and
authoritatively say "we are not affected" in a way that can be used in a
commercial setting. I think our users cannot really make business
decisions based on such statements, simply because that would make us
(effectively the ASF) legally responsible for such advice. You know all the
disclaimers (I Am Not a Lawyer, and "do not treat it as legal advice") -
and I think in this case there is a huge potential for being held legally
responsible for telling our users "we know about the vulnerability and yes,
we are not affected - you can safely use our software even if this
dependency is vulnerable, it does not affect us". I think this is really
dangerous.

On the other hand, the users will (rightfully) expect some acknowledgment
of the issue when they ask us, and they should know what their options are.
I think it blocks our commercial users a bit if they do not know what they
can do to answer their own users. They will continue pushing us to take on
the cost of the "uncertainty": either to fix the problem regardless of
whether we need to (the cost of fixing it), or to let them know that they
are safe (this is where the VEX, Vulnerability Exploitability eXchange,
information that we will publish comes into play), which means we take on
the cost of assessing whether we are affected and also the cost of the risk
of telling them "you are not affected".

I think we will have more and more pressure (and with CRA and the like)
from commercial users to bear all the cost - either by fixing or taking the
legal risk.

*What should we do?*

Well, I have an idea what we could do. I am not sure how legal or possible
it is, but I think we should aim for a "middle ground" where we give our
commercial users a chance to participate in the costs, and information on
how to do so: either by helping to fix the issue (if they want to clear
the CVE from their checklist) or by helping to assess whether we are
vulnerable.
My simple proposal would be to publish (whether in the form of VEX or just
a "Known dependency issues" page) information similar to the text below
(I Am Not A Lawyer, so it could likely be formulated better). Any VEX
information we publish could possibly be similar.

-----
We are aware of the CVE-2023-46136 vulnerability in Werkzeug. We currently
have no reports with scenarios exploiting it and we believe it is unlikely
it affects Airflow.

However, we cannot be 100% sure that Airflow is not affected, at least not
to the degree you could base your business decisions on. If you want that
certainty for your business decisions, we invite our commercial users to
help us with assessing it:

Here are the ways you can help:
* Contribute to the https://github.com/apache/airflow/issues/35234 issue,
which allows us to upgrade the dependency
* Perform a security audit and send us a report, which might help us to
fine-tune our answer. This can be either an exploitation scenario or the
results of an audit concluding that Airflow is not vulnerable. Please
contact secur...@airflow.apache.org following the policies:
https://github.com/apache/airflow/security/policy
-----
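
To make the VEX variant of the text above a bit more concrete, here is a
minimal sketch of how the same information could be expressed as a
machine-readable statement. It is only an illustration under assumptions:
the field names and status labels are loosely modelled on my understanding
of the OpenVEX format and are not validated against any schema, the
`build_vex_statement` helper is hypothetical, and the product identifier is
an illustrative package URL rather than something we publish today.

```python
import json

# Status labels as I understand them from OpenVEX (assumption: check the spec).
VALID_STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}


def build_vex_statement(cve_id, product, status, notes):
    """Return a dict shaped like one VEX statement (hypothetical helper)."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown VEX status: {status}")
    return {
        "vulnerability": {"name": cve_id},
        "products": [{"@id": product}],  # illustrative identifier only
        "status": status,
        "status_notes": notes,
    }


statement = build_vex_statement(
    cve_id="CVE-2023-46136",
    product="pkg:pypi/apache-airflow",
    # We do not claim "not_affected": we only *think* we are not affected.
    status="under_investigation",
    notes=(
        "No known exploitation scenario; we believe it is unlikely that this "
        "affects Airflow. Help wanted: "
        "https://github.com/apache/airflow/issues/35234"
    ),
)

print(json.dumps(statement, indent=2))
```

The point of the sketch is the status: as long as nobody has done the
audit, the statement says "under investigation" rather than "not affected",
which matches the wording of the human-readable text above.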

I believe it would be a nice middle ground and an opener for the
future. It would give not only individual companies/commercial users, but
also other third parties, something they could base their actions on. For
example, if there are 1000s of Airflow users who want certainty, this would
be a clear way for them to put together some money and - for example - pay
someone to implement the issue. Or pay researchers to perform an audit.

WDYT?

J.
