[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-08-27 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r478797281



##
File path: airflow/api_connexion/schemas/task_instance_schema.py
##
@@ -0,0 +1,147 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import List, NamedTuple
+
+from marshmallow import Schema, fields, ValidationError, post_load
+from marshmallow_sqlalchemy import SQLAlchemySchema, auto_field
+from sqlalchemy import and_
+
+from airflow.api_connexion.schemas.enum_schemas import TaskInstanceStateField
+from airflow.api_connexion.schemas.sla_miss_schema import SlaMissSchema
+from airflow.models import DagRun, TaskInstance, SlaMiss
+from airflow.utils.session import create_session
+
+
+class TaskInstanceSchema(Schema):
+"""Task instance schema"""
+
+task_id = fields.Str()
+dag_id = fields.Str()
+execution_date = fields.DateTime()
+start_date = fields.DateTime()
+end_date = fields.DateTime()
+duration = fields.Float()
+state = TaskInstanceStateField()
+_try_number = fields.Int(data_key="try_number")
+max_tries = fields.Int()
+hostname = fields.Str()
+unixname = fields.Str()
+pool = fields.Str()
+pool_slots = fields.Int()
+queue = fields.Str()
+priority_weight = fields.Int()
+operator = fields.Str()
+queued_dttm = fields.DateTime(data_key="queued_when")
+pid = fields.Int()
+executor_config = fields.Str()
+sla_miss = fields.Method("get_sla_miss")
+
+@staticmethod
+def get_sla_miss(obj: TaskInstance):
+with create_session() as session:
+sla_miss = session.query(SlaMiss).filter(

Review comment:
   I'll try to look at it tomorrow.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-08-27 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r478797187



##
File path: airflow/api_connexion/schemas/task_instance_schema.py
##
@@ -0,0 +1,147 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import List, NamedTuple
+
+from marshmallow import Schema, fields, ValidationError, post_load
+from marshmallow_sqlalchemy import SQLAlchemySchema, auto_field
+from sqlalchemy import and_
+
+from airflow.api_connexion.schemas.enum_schemas import TaskInstanceStateField
+from airflow.api_connexion.schemas.sla_miss_schema import SlaMissSchema
+from airflow.models import DagRun, TaskInstance, SlaMiss
+from airflow.utils.session import create_session
+
+
+class TaskInstanceSchema(Schema):
+"""Task instance schema"""
+
+task_id = fields.Str()
+dag_id = fields.Str()
+execution_date = fields.DateTime()
+start_date = fields.DateTime()
+end_date = fields.DateTime()
+duration = fields.Float()
+state = TaskInstanceStateField()
+_try_number = fields.Int(data_key="try_number")
+max_tries = fields.Int()
+hostname = fields.Str()
+unixname = fields.Str()
+pool = fields.Str()
+pool_slots = fields.Int()
+queue = fields.Str()
+priority_weight = fields.Int()
+operator = fields.Str()
+queued_dttm = fields.DateTime(data_key="queued_when")
+pid = fields.Int()
+executor_config = fields.Str()
+sla_miss = fields.Method("get_sla_miss")
+
+@staticmethod
+def get_sla_miss(obj: TaskInstance):
+with create_session() as session:
+sla_miss = session.query(SlaMiss).filter(

Review comment:
   If you need it, we can start using relationships.









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-08-24 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r476043263



##
File path: airflow/api_connexion/schemas/task_instance_schema.py
##
@@ -0,0 +1,147 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from typing import List, NamedTuple
+
+from marshmallow import Schema, fields, ValidationError, post_load
+from marshmallow_sqlalchemy import SQLAlchemySchema, auto_field
+from sqlalchemy import and_
+
+from airflow.api_connexion.schemas.enum_schemas import TaskInstanceStateField
+from airflow.api_connexion.schemas.sla_miss_schema import SlaMissSchema
+from airflow.models import DagRun, TaskInstance, SlaMiss
+from airflow.utils.session import create_session
+
+
+class TaskInstanceSchema(Schema):
+"""Task instance schema"""
+
+task_id = fields.Str()
+dag_id = fields.Str()
+execution_date = fields.DateTime()
+start_date = fields.DateTime()
+end_date = fields.DateTime()
+duration = fields.Float()
+state = TaskInstanceStateField()
+_try_number = fields.Int(data_key="try_number")
+max_tries = fields.Int()
+hostname = fields.Str()
+unixname = fields.Str()
+pool = fields.Str()
+pool_slots = fields.Int()
+queue = fields.Str()
+priority_weight = fields.Int()
+operator = fields.Str()
+queued_dttm = fields.DateTime(data_key="queued_when")
+pid = fields.Int()
+executor_config = fields.Str()
+sla_miss = fields.Method("get_sla_miss")
+
+@staticmethod
+def get_sla_miss(obj: TaskInstance):
+with create_session() as session:
+sla_miss = session.query(SlaMiss).filter(

Review comment:
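   Here we have an N+1 query problem: get_sla_miss runs a separate SlaMiss
query for every task instance that is serialized. I think we can fetch the
data much more efficiently if we use SQLAlchemy's relationship loading
features.
   https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html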
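To make the N+1 point concrete, here is a minimal sketch that uses the standard-library sqlite3 module rather than SQLAlchemy. The two tables, their columns, and the helper names (`run`, `n_plus_one`, `single_join`, `stats`) are hypothetical stand-ins, not Airflow's real models. It only shows the difference in query count; SQLAlchemy loader options such as `joinedload` or `selectinload` aim for a similar reduction at the ORM level.

```python
import sqlite3

# Hypothetical, simplified tables; not Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task_instance (task_id TEXT, dag_id TEXT);
    CREATE TABLE sla_miss (task_id TEXT, dag_id TEXT, description TEXT);
    INSERT INTO task_instance VALUES ('t1', 'd'), ('t2', 'd'), ('t3', 'd');
    INSERT INTO sla_miss VALUES ('t1', 'd', 'late'), ('t3', 'd', 'very late');
    """
)

stats = {"queries": 0}


def run(sql, params=()):
    """Execute one statement and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def n_plus_one():
    """One query for the task instances, then one more per task instance."""
    result = {}
    for task_id, dag_id in run("SELECT task_id, dag_id FROM task_instance ORDER BY task_id"):
        rows = run(
            "SELECT description FROM sla_miss WHERE task_id = ? AND dag_id = ?",
            (task_id, dag_id),
        )
        result[task_id] = rows[0][0] if rows else None
    return result


def single_join():
    """The same data fetched with a single LEFT JOIN."""
    rows = run(
        "SELECT ti.task_id, sm.description FROM task_instance ti "
        "LEFT JOIN sla_miss sm ON sm.task_id = ti.task_id AND sm.dag_id = ti.dag_id "
        "ORDER BY ti.task_id"
    )
    return dict(rows)
```

With three task instances the first function issues four statements and the second issues one, while both return the same mapping.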









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448590758



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")
+
+return task_instance_schema.dump(task_instance)
 
-def get_task_instance():
-"""
-Get a task instance
-"""
-raise NotImplementedError("Not implemented yet.")
 
+def _apply_array_filter(query, key, values):
+if values is not None:
+query = query.filter(or_(*[key == v for v in values]))
+return query
 
-def get_task_instances():
+
+def _apply_range_filter(query, key, value_range: Tuple[Any, Any]):
+gte_value, lte_value = value_range
+if gte_value is not None:
+query = query.filter(key >= gte_value)
+if lte_value is not None:
+query = query.filter(key <= lte_value)
+return query
+
+
+@format_parameters(
+{
+'start_date_gte': format_datetime,
+'start_date_lte': format_datetime,
+'execution_date_gte': format_datetime,
+'execution_date_lte': format_datetime,
+'end_date_gte': format_datetime,
+'end_date_lte': format_datetime,
+}
+)
+@provide_session
+def get_task_instances(
+limit: int,
+dag_id: Optional[str] = None,
+dag_run_id: Optional[str] = None,
+execution_date_gte: Optional[str] = None,
+execution_date_lte: Optional[str] = None,
+start_date_gte: Optional[str] = None,
+start_date_lte: Optional[str] = None,
+end_date_gte: Optional[str] = None,
+end_date_lte: Optional[str] = None,
+duration_gte: Optional[float] = None,
+duration_lte: Optional[float] = None,
+state: Optional[str] = None,
+pool: Optional[List[str]] = None,
+queue: Optional[List[str]] = None,
+offset: Optional[int] = None,
+session=None,

Review comment:
   If I remember correctly, pylint does not accept that. I didn't specify
the type here because SQLAlchemy does a lot of magic, and mypy tends to get
confused once it knows the SQLAlchemy types are there.









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448588204



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")
+
+return task_instance_schema.dump(task_instance)
 
-def get_task_instance():
-"""
-Get a task instance
-"""
-raise NotImplementedError("Not implemented yet.")
 
+def _apply_array_filter(query, key, values):
+if values is not None:
+query = query.filter(or_(*[key == v for v in values]))
+return query
 
-def get_task_instances():
+
+def _apply_range_filter(query, key, value_range: Tuple[Any, Any]):
+gte_value, lte_value = value_range
+if gte_value is not None:
+query = query.filter(key >= gte_value)
+if lte_value is not None:
+query = query.filter(key <= lte_value)
+return query
+
+
+@format_parameters(
+{
+'start_date_gte': format_datetime,
+'start_date_lte': format_datetime,
+'execution_date_gte': format_datetime,
+'execution_date_lte': format_datetime,
+'end_date_gte': format_datetime,
+'end_date_lte': format_datetime,
+}
+)
+@provide_session
+def get_task_instances(
+limit: int,
+dag_id: Optional[str] = None,
+dag_run_id: Optional[str] = None,
+execution_date_gte: Optional[str] = None,
+execution_date_lte: Optional[str] = None,
+start_date_gte: Optional[str] = None,
+start_date_lte: Optional[str] = None,
+end_date_gte: Optional[str] = None,
+end_date_lte: Optional[str] = None,
+duration_gte: Optional[float] = None,
+duration_lte: Optional[float] = None,
+state: Optional[str] = None,
+pool: Optional[List[str]] = None,
+queue: Optional[List[str]] = None,
+offset: Optional[int] = None,
+session=None,
+):
 """
-Get list of task instances of DAG.
+Get list of a task instances
 """
-raise NotImplementedError("Not implemented yet.")
+query = session.query(TI)
+
+if dag_id is not None:
+query = query.filter(TI.dag_id == dag_id)
+if dag_run_id is not None:
+query = query.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == DR.execution_date))
+query = query.filter(DR.run_id == dag_run_id)
+
+query = _apply_range_filter(
+query, key=DR.execution_date, value_range=(execution_date_gte, execution_date_lte)
+)
+query = _apply_range_filter(query, key=DR.start_date, value_range=(start_date_gte, start_date_lte))
+query = _apply_range_filter(query, key=DR.end_date, value_range=(end_date_gte, end_date_lte))
+query = _apply_range_filter(query, key=DR.end_date, value_range=(end_date_gte, end_date_lte))

Review comment:
   Good point. I will remove it
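For context, a small sketch of why a repeated range filter is redundant. It mirrors the shape of the `_apply_range_filter` helper from the hunk above, but works on plain Python dictionaries instead of a SQLAlchemy query; the `apply_range_filter` name and the sample rows are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def apply_range_filter(rows: List[Row], key: str, value_range: Tuple[Any, Any]) -> List[Row]:
    """Keep rows whose `key` lies within the optional (gte, lte) bounds."""
    gte_value, lte_value = value_range
    if gte_value is not None:
        rows = [r for r in rows if r[key] >= gte_value]
    if lte_value is not None:
        rows = [r for r in rows if r[key] <= lte_value]
    return rows


# Made-up sample data.
rows = [
    {"task_id": "a", "duration": 1.0},
    {"task_id": "b", "duration": 5.0},
    {"task_id": "c", "duration": 9.0},
]

once = apply_range_filter(rows, "duration", (2.0, None))
twice = apply_range_filter(once, "duration", (2.0, None))
```

Applying the same bounds a second time leaves the result unchanged, so the second call only adds noise.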









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448588204



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")
+
+return task_instance_schema.dump(task_instance)
 
-def get_task_instance():
-"""
-Get a task instance
-"""
-raise NotImplementedError("Not implemented yet.")
 
+def _apply_array_filter(query, key, values):
+if values is not None:
+query = query.filter(or_(*[key == v for v in values]))
+return query
 
-def get_task_instances():
+
+def _apply_range_filter(query, key, value_range: Tuple[Any, Any]):
+gte_value, lte_value = value_range
+if gte_value is not None:
+query = query.filter(key >= gte_value)
+if lte_value is not None:
+query = query.filter(key <= lte_value)
+return query
+
+
+@format_parameters(
+{
+'start_date_gte': format_datetime,
+'start_date_lte': format_datetime,
+'execution_date_gte': format_datetime,
+'execution_date_lte': format_datetime,
+'end_date_gte': format_datetime,
+'end_date_lte': format_datetime,
+}
+)
+@provide_session
+def get_task_instances(
+limit: int,
+dag_id: Optional[str] = None,
+dag_run_id: Optional[str] = None,
+execution_date_gte: Optional[str] = None,
+execution_date_lte: Optional[str] = None,
+start_date_gte: Optional[str] = None,
+start_date_lte: Optional[str] = None,
+end_date_gte: Optional[str] = None,
+end_date_lte: Optional[str] = None,
+duration_gte: Optional[float] = None,
+duration_lte: Optional[float] = None,
+state: Optional[str] = None,
+pool: Optional[List[str]] = None,
+queue: Optional[List[str]] = None,
+offset: Optional[int] = None,
+session=None,
+):
 """
-Get list of task instances of DAG.
+Get list of a task instances
 """
-raise NotImplementedError("Not implemented yet.")
+query = session.query(TI)
+
+if dag_id is not None:
+query = query.filter(TI.dag_id == dag_id)
+if dag_run_id is not None:
+query = query.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == DR.execution_date))
+query = query.filter(DR.run_id == dag_run_id)
+
+query = _apply_range_filter(
+query, key=DR.execution_date, value_range=(execution_date_gte, execution_date_lte)
+)
+query = _apply_range_filter(query, key=DR.start_date, value_range=(start_date_gte, start_date_lte))
+query = _apply_range_filter(query, key=DR.end_date, value_range=(end_date_gte, end_date_lte))
+query = _apply_range_filter(query, key=DR.end_date, value_range=(end_date_gte, end_date_lte))

Review comment:
   Good point. It should be execution_date.









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448587966



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")
+
+return task_instance_schema.dump(task_instance)
 
-def get_task_instance():
-"""
-Get a task instance
-"""
-raise NotImplementedError("Not implemented yet.")
 
+def _apply_array_filter(query, key, values):
+if values is not None:
+query = query.filter(or_(*[key == v for v in values]))
+return query
 
-def get_task_instances():
+
+def _apply_range_filter(query, key, value_range: Tuple[Any, Any]):
+gte_value, lte_value = value_range
+if gte_value is not None:
+query = query.filter(key >= gte_value)
+if lte_value is not None:
+query = query.filter(key <= lte_value)
+return query
+
+
+@format_parameters(
+{
+'start_date_gte': format_datetime,
+'start_date_lte': format_datetime,
+'execution_date_gte': format_datetime,
+'execution_date_lte': format_datetime,
+'end_date_gte': format_datetime,
+'end_date_lte': format_datetime,
+}
+)
+@provide_session
+def get_task_instances(
+limit: int,
+dag_id: Optional[str] = None,
+dag_run_id: Optional[str] = None,
+execution_date_gte: Optional[str] = None,
+execution_date_lte: Optional[str] = None,
+start_date_gte: Optional[str] = None,
+start_date_lte: Optional[str] = None,
+end_date_gte: Optional[str] = None,
+end_date_lte: Optional[str] = None,
+duration_gte: Optional[float] = None,
+duration_lte: Optional[float] = None,
+state: Optional[str] = None,
+pool: Optional[List[str]] = None,
+queue: Optional[List[str]] = None,
+offset: Optional[int] = None,
+session=None,

Review comment:
   We use connexion, which fills these parameters based on the API specification.
   https://connexion.readthedocs.io/en/latest/request.html#automatic-parameter-handling
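As a rough illustration of the idea only, and not of connexion's actual implementation, the sketch below passes request parameters to a handler by matching their names against the handler's signature. The `bind_query_params` helper and the shortened `get_task_instances` handler are hypothetical; the real library also validates and converts values according to the OpenAPI specification.

```python
import inspect


def bind_query_params(handler, query_params):
    """Call `handler` with only the request parameters its signature declares."""
    accepted = inspect.signature(handler).parameters
    kwargs = {name: value for name, value in query_params.items() if name in accepted}
    return handler(**kwargs)


def get_task_instances(limit, dag_id=None, offset=None):
    # Shortened, hypothetical handler that just echoes what it received.
    return {"limit": limit, "dag_id": dag_id, "offset": offset}


result = bind_query_params(
    get_task_instances,
    {"limit": 10, "dag_id": "example", "unknown": "ignored"},
)
```

Parameters the handler does not declare are dropped, and parameters missing from the request fall back to the handler's defaults.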









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448587313



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")
+
+return task_instance_schema.dump(task_instance)
 
-def get_task_instance():
-"""
-Get a task instance
-"""
-raise NotImplementedError("Not implemented yet.")
 
+def _apply_array_filter(query, key, values):

Review comment:
   I do not know if I understand correctly. Can you say more?









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448586807



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, 
session=None):
+query = (
+session.query(TI)
+.filter(TI.dag_id == dag_id)
+.join(DR, and_(TI.dag_id == DR.dag_id, TI.execution_date == 
DR.execution_date))
+.filter(DR.run_id == dag_run_id)
+.filter(TI.task_id == task_id)
+)
+
+task_instance = query.one_or_none()
+
+if task_instance is None:
+raise NotFound("Task instance not found")

Review comment:
   Yes. This is handled by connexion.
   https://connexion.readthedocs.io/en/latest/exceptions.html
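A simplified sketch of the pattern, under the assumption that the framework catches a known exception type and turns it into an error response. The `ProblemError` base class, this `NotFound` subclass, and `call_handler` are hypothetical stand-ins written for this example; they are not connexion's or Airflow's classes.

```python
class ProblemError(Exception):
    """Hypothetical base class carrying the data for an error response."""

    def __init__(self, status, title, detail=None):
        super().__init__(title)
        self.status = status
        self.title = title
        self.detail = detail


class NotFound(ProblemError):
    def __init__(self, title="Not Found", detail=None):
        super().__init__(status=404, title=title, detail=detail)


def call_handler(handler, *args, **kwargs):
    """Run a handler and convert ProblemError into a (status, body) pair."""
    try:
        return 200, handler(*args, **kwargs)
    except ProblemError as err:
        return err.status, {"status": err.status, "title": err.title, "detail": err.detail}


def get_task_instance(task_id):
    known = {"t1": {"task_id": "t1"}}  # made-up data
    if task_id not in known:
        raise NotFound("Task instance not found")
    return known[task_id]
```

The endpoint function only raises; the wrapper decides what the client sees, which is why the handler in the diff needs no explicit error response.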









[GitHub] [airflow] mik-laj commented on a change in pull request #9597: [WIP] Add read-only endpoints for task instances

2020-07-01 Thread GitBox


mik-laj commented on a change in pull request #9597:
URL: https://github.com/apache/airflow/pull/9597#discussion_r448586439



##
File path: airflow/api_connexion/endpoints/task_instance_endpoint.py
##
@@ -14,23 +14,110 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+from typing import Any, List, Optional, Tuple
 
-# TODO(mik-laj): We have to implement it.
-# Do you want to help? Please look at: 
https://github.com/apache/airflow/issues/8132
+from sqlalchemy import and_, func, or_
 
+from airflow.api_connexion.exceptions import NotFound
+from airflow.api_connexion.parameters import format_datetime, format_parameters
+from airflow.api_connexion.schemas.task_instance_schema import (
+TaskInstanceCollection, task_instance_collection_schema, 
task_instance_schema,
+)
+from airflow.models.dagrun import DagRun as DR
+from airflow.models.taskinstance import TaskInstance as TI
+from airflow.utils.session import provide_session
+
+
+@provide_session
+def get_task_instance(dag_id: str, dag_run_id: str, task_id: str, session=None):

Review comment:
   Inside Airflow, we use execution_date as the primary identifier.
However, we voted to use dag_run_id in the API:
   https://lists.apache.org/thread.html/rd4be3829627dcef8b40314c62c041f460992786f3bfcc634d25a6664%40%3Cdev.airflow.apache.org%3E
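The join in the endpoint is what bridges the two identifiers. Below is a minimal sketch of that lookup using standard-library sqlite3, with hypothetical, simplified tables and a made-up `find_task_instance` helper; it is not the endpoint's actual code. It resolves the run id to an execution date through the DAG run table.

```python
import sqlite3

# Hypothetical, simplified tables; not Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT);
    CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, execution_date TEXT, state TEXT);
    INSERT INTO dag_run VALUES ('d', 'manual_1', '2020-07-01T00:00:00');
    INSERT INTO task_instance VALUES ('d', 't1', '2020-07-01T00:00:00', 'success');
    """
)


def find_task_instance(dag_id, dag_run_id, task_id):
    """Look up a task instance by run id instead of execution date."""
    return conn.execute(
        "SELECT ti.task_id, ti.execution_date, ti.state "
        "FROM task_instance ti "
        "JOIN dag_run dr ON ti.dag_id = dr.dag_id AND ti.execution_date = dr.execution_date "
        "WHERE ti.dag_id = ? AND dr.run_id = ? AND ti.task_id = ?",
        (dag_id, dag_run_id, task_id),
    ).fetchone()
```

A caller passes only the run id; the execution date stays an internal detail of the join, and an unknown run id yields `None`.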




