Thanks Jarek for clarification and information. I will create specific PRs and we can discuss there. If I come up with bigger change idea then I will file AIP.
On Mon, 4 Oct 2021, 13:27 Jarek Potiuk, <[email protected]> wrote: > I think you need to avoid the "generalizations" and dive into which > particular methods do you have in mind. I think if you pin-point some > methods that are indeed "internal" and "protected" - they could be > deprecated, and later removed and if you can find some, I'd encourage you > to PR those. > > There are already some methods in DAG that are protected or "internal" . > But most of the methods I could see there are "meant" to be publicly > available - DAG is the public interface Airflow exposed to manage > all-things-dag, so the vast majority of those methods is > intentionally public and should be used by whoever uses DAG as an object. I > do not know all the methods by-heart, so maybe there are some that are > really "internal" though. > > Also even if there are some methods already "public" there > 'unintentionally", we have to be really careful not to rename/hide such > methods because someone, somewhere could have used them already and we can > break someone's DAGs. This is pretty much the public API of Airflow - and > the most important one - used by everyone who uses Airflow so changing that > one should be really, really, really careful. > > But if you can pinpoint some methods that are clearly private and we all > agree that it is very likely to be made private - we can deprecate such > methods and remove them in Airflow 3.0 (which will be ~ 6/12 months from > now). > We have the rule that we cannot remove stuff from the "Public" part of > Airflow as we are strictly following Semver, so our approach is that we > deprecate such methods/classes and remove them in the next "major" version. > > If you can review the code and clearly pin-point those individual methods > that could be made private, then simply create PRs which deprecates such > methods (Ideally one-per-method) and we can likely discuss if it makes or > does not make sense to make those methods internal/protected. > > J. > > > On Mon, Oct 4, 2021 at 10:12 AM Khalid Mammadov <[email protected]> > wrote: > >> Another thing I was thinking that to move out those methods altogether to >> a separate class or module and so execution of a dag or those methods is >> done from different place and then Dag class will be very slim and will >> only be used to describe definition of a Dag. >> >> On Mon, 4 Oct 2021, 08:33 Khalid Mammadov, <[email protected]> >> wrote: >> >>> I was thinking to follow as you mentioned the very few Pythonic >>> notations for private methods i.e. just add an underscore in front of it >>> this example to signal a user/dev this is private and an internal >>> implementation and not part of public API or otherwise it's. >>> >>> On Sun, 3 Oct 2021, 23:50 Jarek Potiuk, <[email protected]> wrote: >>> >>>> Well. Not sure what else you'd expect, I wonder ? >>>> >>>> This is for sure unexpected and not reasonable use of it. There are >>>> best practices to follow and things to avoid when you run top-level python >>>> code >>>> https://airflow.apache.org/docs/apache-airflow/stable/best-practices.html#top-level-python-code >>>> but if you want you can do anything. >>>> This is Python, we can't prevent you from doing pretty much >>>> anything with it if you want. There are no "truly" private methods in >>>> Python. Also you can do that: >>>> >>>> ``` >>>> while True: >>>> pass >>>> ``` >>>> >>>> and you will get your CPU at 100% too, >>>> >>>> J,. >>>> >>>> On Sun, Oct 3, 2021 at 11:22 PM Khalid Mammadov < >>>> [email protected]> wrote: >>>> >>>>> Hi Devs, >>>>> >>>>> >>>>> I was reviewing DAG class and noticed that almost all it's methods are >>>>> public. >>>>> >>>>> So, one can do something like below: >>>>> >>>>> with DAG(...) as dag: t1 = BashOperator(...) >>>>> >>>>> run_id = DagRun.generate_run_id(DagRunType.MANUAL, >>>>> datetime.utcnow()) >>>>> >>>>> >>>>> # This one works OK and create a DagRun for the Scheduler to pick up >>>>> dag.create_dagrun(state=DagRunState.QUEUED, run_id=run_id) >>>>> >>>>> # OR EVEN DO BELOW - which caused my laptop to run on 100% CPU >>>>> # dag.run() >>>>> >>>>> And I was wondering if this is intentional and/or expected behavior? >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Khalid >>>>> >>>>
