[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362053242
 
 

 ##
 File path: UPDATING.md
 ##
 @@ -57,6 +57,31 @@ https://developers.google.com/style/inclusive-documentation
 
 -->
 
+### Chain and cross_downstream moved from helpers to BaseOperator
+
+The chain and cross_downstream methods are now moved to 
airflow.models.baseoperator module from
 
 Review comment:
   It's already solved as well :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362014321
 
 

 ##
 File path: airflow/utils/helpers.py
 ##
 @@ -142,83 +135,6 @@ def as_flattened_list(iterable):
 return [e for i in iterable for e in i]
 
 
-def chain(*tasks):
 
 Review comment:
   Agree. Static was bad idea.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362014068
 
 

 ##
 File path: UPDATING.md
 ##
 @@ -57,6 +57,31 @@ https://developers.google.com/style/inclusive-documentation
 
 -->
 
+### Chain and cross_downstream moved from helpers to BaseOperator
+
+The chain and cross_downstream methods are now moved to 
airflow.models.baseoperator module from
+airflow.utils.helpers module where they belong.
+
+Helpers module is supposed to contain standalone helper methods that can be 
imported by all classes.
+The `chain` method and `cross_downstream` method both use BaseOperator and if 
any other package imports
+any of the helpers classes it automatically creates an implicit dependency to 
BaseOperator which
 
 Review comment:
   Yep. I also reworded it slightly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362014133
 
 

 ##
 File path: UPDATING.md
 ##
 @@ -57,6 +57,31 @@ https://developers.google.com/style/inclusive-documentation
 
 -->
 
+### Chain and cross_downstream moved from helpers to BaseOperator
+
+The chain and cross_downstream methods are now moved to 
airflow.models.baseoperator module from
+airflow.utils.helpers module where they belong.
+
+Helpers module is supposed to contain standalone helper methods that can be 
imported by all classes.
+The `chain` method and `cross_downstream` method both use BaseOperator and if 
any other package imports
+any of the helpers classes it automatically creates an implicit dependency to 
BaseOperator which
+can often lead to cyclic dependencies.
+
+More infomration in 
[AIFLOW-6392](https://issues.apache.org/jira/browse/AIRFLOW-6392)
+
+In Airflow 1.10 you imported those two methods like this:
 
 Review comment:
    


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362013565
 
 

 ##
 File path: UPDATING.md
 ##
 @@ -57,6 +57,31 @@ https://developers.google.com/style/inclusive-documentation
 
 -->
 
+### Chain and cross_downstream moved from helpers to BaseOperator
+
+The chain and cross_downstream methods are now moved to 
airflow.models.baseoperator module from
+airflow.utils.helpers module where they belong.
 
 Review comment:
   Yep. I will also update the description slightly.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362013100
 
 

 ##
 File path: airflow/models/baseoperator.py
 ##
 @@ -670,6 +672,7 @@ def __deepcopy__(self, memo):
 
 for k, v in self.__dict__.items():
 if k not in shallow_copy:
+# noinspection PyArgumentList
 
 Review comment:
   I am afraid so. The check is super useful because it detects spelling 
mistakes in parameters or removed parameters. 
   
   This is actually a known problem because the memo parameter is by mistake 
missing in skeleton definition for python skeletons (from typeshed I believe). 
The issue is described
   here: 
https://intellij-support.jetbrains.com/hc/en-us/community/posts/36181224-deepcopy-unexpected-memo-argument
   and reported here (unresolved):  
https://youtrack.jetbrains.com/issue/PY-29664 
   
   I have not found a good way of silencing it only for deepcopy, so I propose 
we can leave it as it is.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362011152
 
 

 ##
 File path: airflow/models/baseoperator.py
 ##
 @@ -862,6 +865,7 @@ def clear(self,
 tasks += [
 t.task_id for t in self.get_flat_relatives(upstream=False)]
 
+# noinspection PyUnresolvedReferences
 
 Review comment:
   It's the in_ operator on task_id. I will try to find another way of 
excluding those SQLAlchemy magic operators as this is a problem that repeats 
often. For now I remove it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362010644
 
 

 ##
 File path: tests/models/test_baseoperator.py
 ##
 @@ -241,7 +243,46 @@ def test_default_resources(self):
 task = DummyOperator(task_id="default-resources")
 self.assertIsNone(task.resources)
 
+# noinspection PyUnresolvedReferences
 
 Review comment:
   https://issues.apache.org/jira/browse/AIRFLOW-6406


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362009498
 
 

 ##
 File path: tests/models/test_baseoperator.py
 ##
 @@ -241,7 +243,46 @@ def test_default_resources(self):
 task = DummyOperator(task_id="default-resources")
 self.assertIsNone(task.resources)
 
+# noinspection PyUnresolvedReferences
 
 Review comment:
   It's the qty field. The type hints are missing for Resource type clases and 
Intellij derives them from __init__ (names of __init__ parameters are the same 
even though they are different type long <> Resource. I will remove it and 
create an issue to fix type hints in the Resource classes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r362007720
 
 

 ##
 File path: airflow/utils/helpers.py
 ##
 @@ -28,15 +28,8 @@
 import psutil
 from jinja2 import Template
 
+from airflow import AirflowException
 
 Review comment:
   Yeah. That is not yet enforced :). The change is in a separate PR to do so.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-30 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r361961106
 
 

 ##
 File path: airflow/utils/helpers.py
 ##
 @@ -142,83 +135,6 @@ def as_flattened_list(iterable):
 return [e for i in iterable for e in i]
 
 
-def chain(*tasks):
 
 Review comment:
   > The existence of the idea of creating an automatic tool does not affect 
whether we should or should not make changes.
   
   I don't agree. By having automated tools we might decide that we 
deliberately not keep backwards compatibility in some cases more easily  - 
especially if those changes can easily and automatically be migrated in the 
existing DAGs. 
   
   "Backwards compatibility" is simply not as important for 2.0 as it was in 
1.10. This decision has already been made. We have a number of backwards 
incompatible changes, there is no going back.
   
   > Others look much less used, although there may be someone who finds these 
functions. But in my opinion, this is the choice of the lesser evil.
   I am not sure which one is lesser. If we automate migration for chain method 
it's not evil at all for migration. Especially that finally it makes sense for 
the chain method to be in baseoperator package (or in some form of 
operator_helpers but I prefer not to use helpers/managers/etc. if not 
neccessary). It makes much more sense for the other helpers method to be there 
- they were there originally, simply adding chains and the other method here 
later was a mistake that we want to correct now. For me more evil is to have 
bad naming in future version of airflow.
   
   >> It is not possible to import a single static method into a file. You must 
always import the whole class. A similar opinion was expressed in "Google 
Python Style Guide":
   
   Agree - will make it module function in baseoperator. it makes more sense. 
It comes from the old Java background where I prefer to have functions in 
classes.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-29 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r361908129
 
 

 ##
 File path: airflow/utils/helpers.py
 ##
 @@ -142,83 +135,6 @@ def as_flattened_list(iterable):
 return [e for i in iterable for e in i]
 
 
-def chain(*tasks):
 
 Review comment:
   I agree it's a bit sad (that's why I added UPDATING.md) but we already have 
a number of incompatibilities and we plan to have a migration tool to handle 
this https://issues.apache.org/jira/browse/AIRFLOW-6390 (and such automated 
conversion will be easy to add).
   
   There are other functions that look useful - validate_key, reduce_in_chunks, 
as_tuple, as_flattened_list. I see them as being used by the users - they are 
not base-operator dependent but still useful either in custom operators or 
dags. 
   
   I could of course move the other methods elsewhere, but I think there is an 
advantage of moving it (and any future similar methods) to BaseOperator. 
   
   I wonder what others think about it ? Is it worth to introduce such 
incompatibility or not? 
   
   I am happy to do both but I think it's quite a bit cleaner as it is now. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [airflow] potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove cyclic dependency baseoperator <-> helpers

2019-12-29 Thread GitBox
potiuk commented on a change in pull request #6950: [AIRFLOW-6392] Remove 
cyclic dependency baseoperator <-> helpers
URL: https://github.com/apache/airflow/pull/6950#discussion_r361906152
 
 

 ##
 File path: airflow/utils/helpers.py
 ##
 @@ -142,83 +135,6 @@ def as_flattened_list(iterable):
 return [e for i in iterable for e in i]
 
 
-def chain(*tasks):
-r"""
-Given a number of tasks, builds a dependency chain.
-Support mix airflow.models.BaseOperator and 
List[airflow.models.BaseOperator].
-If you want to chain between two List[airflow.models.BaseOperator], have to
-make sure they have same length.
-
-chain(t1, [t2, t3], [t4, t5], t6)
-
-is equivalent to
-
-  / -> t2 -> t4 \
-t1   -> t6
-  \ -> t3 -> t5 /
-
-t1.set_downstream(t2)
-t1.set_downstream(t3)
-t2.set_downstream(t4)
-t3.set_downstream(t5)
-t4.set_downstream(t6)
-t5.set_downstream(t6)
-
-:param tasks: List of tasks or List[airflow.models.BaseOperator] to set 
dependencies
-:type tasks: List[airflow.models.BaseOperator] or 
airflow.models.BaseOperator
-"""
-from airflow.models.baseoperator import BaseOperator
-for index, up_task in enumerate(tasks[:-1]):
-down_task = tasks[index + 1]
-if isinstance(up_task, BaseOperator):
-up_task.set_downstream(down_task)
-elif isinstance(down_task, BaseOperator):
-down_task.set_upstream(up_task)
-else:
-if not isinstance(up_task, Iterable) or not isinstance(down_task, 
Iterable):
-raise TypeError(
-'Chain not supported between instances of {up_type} and 
{down_type}'.format(
-up_type=type(up_task), down_type=type(down_task)))
-elif len(up_task) != len(down_task):
-raise AirflowException(
-'Chain not supported different length Iterable but get 
{up_len} and {down_len}'.format(
-up_len=len(up_task), down_len=len(down_task)))
-else:
-for up, down in zip(up_task, down_task):
-up.set_downstream(down)
-
-
-def cross_downstream(from_tasks, to_tasks):
-r"""
-Set downstream dependencies for all tasks in from_tasks to all tasks in 
to_tasks.
-E.g.: cross_downstream(from_tasks=[t1, t2, t3], to_tasks=[t4, t5, t6])
-Is equivalent to:
-
-t1 --> t4
-   \ /
-t2 -X> t5
-   / \
-t3 --> t6
-
-t1.set_downstream(t4)
-t1.set_downstream(t5)
-t1.set_downstream(t6)
-t2.set_downstream(t4)
-t2.set_downstream(t5)
-t2.set_downstream(t6)
-t3.set_downstream(t4)
-t3.set_downstream(t5)
-t3.set_downstream(t6)
-
-:param from_tasks: List of tasks to start from.
-:type from_tasks: List[airflow.models.BaseOperator]
-:param to_tasks: List of tasks to set as downstream dependencies.
-:type to_tasks: List[airflow.models.BaseOperator]
-"""
-for task in from_tasks:
-task.set_downstream(to_tasks)
-
-
 def pprinttable(rows):
 
 Review comment:
   It's unrelated :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services