[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-18 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690458047



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   Just confirm that the `dag timezone` is the value that is inited here? 
   
https://github.com/apache/airflow/blob/e99624d2f17f0aadcb992827e9e721f5e6c978d8/airflow/models/dag.py#L326
   Can we use the `tzinfo` in the `execution_date` directly? 
   Or we need to add a parameter `dag_timezone` in the method `generate_run_id` 
and set the value when calling this method?
   Or transform the date to dag timezone datetime outside?

##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   I checked the codes, we can not use the `tzinfo` in the execution_date 
directly, because it has been transformed into UTC already.
   
![image](https://user-images.githubusercontent.com/7628879/129824951-cf0ff5bd-916e-42af-82eb-1c9315afbeab.png)
   
   So we have two ways to implement:
   1. change the method generate_run_id from static method to class method
   2. add a parameter `timezone` in the method `generate_run_id` and the 
default value is settings.TIMEZONE.
   
   Which one do you prefer? Or there is something I miss that we can have a 
better choice?
   Let me know what do you think.
   Thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690845281



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   I checked the codes, we can not use the `tzinfo` in the execution_date 
directly, because it has been transformed into UTC already.
   
![image](https://user-images.githubusercontent.com/7628879/129824951-cf0ff5bd-916e-42af-82eb-1c9315afbeab.png)
   
   So we have two ways to implement:
   1. change the method generate_run_id from static method to class method
   2. add a parameter `timezone` in the method `generate_run_id` and the 
default value is settings.TIMEZONE.
   
   Which one do you prefer? Or there is something I miss that we can have a 
better choice?
   Let me know what do you think.
   Thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690458047



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   Just confirm that the `dag timezone` is the value that is inited here? 
   
https://github.com/apache/airflow/blob/e99624d2f17f0aadcb992827e9e721f5e6c978d8/airflow/models/dag.py#L326
   Can we use the `tzinfo` in the `execution_date` directly? 
   Or we need to add a parameter `dag_timezone` in the method `generate_run_id` 
and set the value when calling this method?
   Or transform the date to dag timezone datetime outside?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690458047



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   Just confirm that the `dag timezone` is the value that is inited here? 
   
https://github.com/apache/airflow/blob/e99624d2f17f0aadcb992827e9e721f5e6c978d8/airflow/models/dag.py#L326
   Can we use the `tzinfo` in the `execution_date` directly? 
   Or we need to add a parameter `dag_timezone` in the method `generate_run_id` 
and set the value when calling this method?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690458047



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   Just confirm that the `dag timezone` is the value that is inited here? 
   
https://github.com/apache/airflow/blob/e99624d2f17f0aadcb992827e9e721f5e6c978d8/airflow/models/dag.py#L326
   Can we use the `tzinfo` in the `execution_date` directly? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690458047



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   Just confirm that the `dag timezone` is the value that is inited here? 
   
https://github.com/apache/airflow/blob/e99624d2f17f0aadcb992827e9e721f5e6c978d8/airflow/models/dag.py#L326




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690455602



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   I thought we can get the dag timezone by the `start_date` or `end_date`, but 
actually, the two values may be `None` if the user doesn't set them.
   So the question is that which `timezone` do you mean by `dag timezone`? 
   The timezone of `execution_date`?  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690455602



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   I thought we can get the dag timezone by the `start_date` or `end_date`, but 
actually, the two values may be `None if the user doesn't set them.
   So the question is that which `timezone` do you mean? 
   The timezone of `execution_date`?  

##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   @ashb 
   I thought we can get the dag timezone by the `start_date` or `end_date`, but 
actually, the two values may be `None` if the user doesn't set them.
   So the question is that which `timezone` do you mean? 
   The timezone of `execution_date`?  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690407650



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   good idea. It will be a small change for the project.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690407650



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   good idea. It will be a small change for the project.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690407650



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   good idea. It leads to a small change for the project.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690398707



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)
+is_dst = local_time.is_dst()
+if is_dst:
+return f"{run_type}__{local_time.isoformat()}__DST"

Review comment:
   Actually, I don't know the living habits who are under DST, because we 
don't use the DST in China.
   I just think it will be much clearer if we add `__DST`.
   I will remove it tomorrow




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-17 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r690394802



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)

Review comment:
   it makes sense. 
   But we have to change the method from static_method to class method.
   I will change it tomorrow




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-16 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r689988346



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)
+is_dst = local_time.is_dst()
+if is_dst:
+return f"{run_type}__{local_time.isoformat()}__DST"
+else:
+return f"{run_type}__{local_time.isoformat()}"
+else:
+return f"{run_type}__{execution_date.isoformat()}"

Review comment:
   Sure. Post the checks before I determine changing these
   
   As I checked the source codes, there may be two risks after changing these 
codes:
   1. As there is a UNIQUE_KEY(dag_id,run_id) in the database, we must make 
sure that the two values are unique. 
   1. If the user changes the server config `localize_dag_run_id` from `True` 
to `False` or they change the `default_timezone` frequently, the dag `run_id` 
may be conflict. 
   
   For the first risk issue, I use `pendulum` to make sure the local_times are 
unique during the DST changing. And I tried to did the full test and add the 
test codes into PR.
   For the second risk issue, I think it is only a low-rate case. Even the user 
changes the `default_timezone`, the dag `run_id` is accurate to six decimal 
places, it is really hard to make the `run_id` conflict.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] LionelZhao28 commented on a change in pull request #17502: localize the run_id in dagrun

2021-08-16 Thread GitBox


LionelZhao28 commented on a change in pull request #17502:
URL: https://github.com/apache/airflow/pull/17502#discussion_r689988346



##
File path: airflow/models/dagrun.py
##
@@ -286,7 +287,17 @@ def find(
 @staticmethod
 def generate_run_id(run_type: DagRunType, execution_date: datetime) -> str:
 """Generate Run ID based on Run Type and Execution Date"""
-return f"{run_type}__{execution_date.isoformat()}"
+from airflow.configuration import conf
+local_run_id = conf.getboolean("core", "localize_dag_run_id", 
fallback=False)
+if local_run_id:
+local_time = 
pendulum.instance(execution_date).astimezone(tz=settings.TIMEZONE)
+is_dst = local_time.is_dst()
+if is_dst:
+return f"{run_type}__{local_time.isoformat()}__DST"
+else:
+return f"{run_type}__{local_time.isoformat()}"
+else:
+return f"{run_type}__{execution_date.isoformat()}"

Review comment:
   Sure. Post the checks before I determine changing these
   
   As I checked the source codes, there may be two risks after changing these 
codes:
   1. As there is a UNIQUE_KEY(dag_id,run_id) in the database, we must make 
sure that the two values are unique. 
   1. If the user changes the server config `localize_dag_run_id` from `True` 
to `False` or they change the `default_timezone` frequently, the dag `run_id` 
may be conflict. 
   
   For the first risk issue, I use `pendulum` to make sure the local_times are 
unique during the DST changing. And I tried to did the full test and add the 
test codes into PR.
   For the second issue, I think it is only a low-rate case. Even the user 
changes the `default_timezone`, the dag `run_id` is accurate to six decimal 
places, it is really hard to make the `run_id` conflict.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org