amoghrajesh commented on PR #37214:
URL: https://github.com/apache/airflow/pull/37214#issuecomment-1931415839
Right now I am testing with the DAG:
```
from airflow import DAG
from airflow.providers.mongo.hooks.mongo import MongoHook
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'start_date': datetime(2024, 2, 7),
'catchup': False,
}
def retrieve_users_from_mongodb():
mongo_hook = MongoHook(conn_id='mongo_conn')
query = {}
users_data = mongo_hook.find('users', query)
for user in users_data:
print(user)
with DAG('mongo_dag', default_args=default_args, schedule_interval=None) as
dag:
# Task to retrieve users from MongoDB
retrieve_users_task = PythonOperator(
task_id='retrieve_users_from_mongodb',
python_callable=retrieve_users_from_mongodb
)
# Define the task dependencies
retrieve_users_task
```
Connection:

For some reason, the connection is not going through, to mongodb
Error log:
```
e8786b12ac04
*** Found local files:
*** *
/root/airflow/logs/dag_id=mongo_dag/run_id=manual__2024-02-07T07:07:35.163848+00:00/task_id=retrieve_users_from_mongodb/attempt=1.log
[2024-02-07, 12:37:37 IST] {taskinstance.py:1980} INFO - Dependencies all
met for dep_context=non-requeueable deps ti=
[2024-02-07, 12:37:37 IST] {taskinstance.py:1980} INFO - Dependencies all
met for dep_context=requeueable deps ti=
[2024-02-07, 12:37:37 IST] {taskinstance.py:2194} INFO - Starting attempt 1
of 1
[2024-02-07, 12:37:37 IST] {taskinstance.py:2215} INFO - Executing on
2024-02-07 07:07:35.163848+00:00
[2024-02-07, 12:37:37 IST] {standard_task_runner.py:60} INFO - Started
process 1243 to run task
[2024-02-07, 12:37:37 IST] {standard_task_runner.py:87} INFO - Running:
['airflow', 'tasks', 'run', 'mongo_dag', 'retrieve_users_from_mongodb',
'manual__2024-02-07T07:07:35.163848+00:00', '--job-id', '920', '--raw',
'--subdir', 'DAGS_FOLDER/mongo_test.py', '--cfg-path', '/tmp/tmpkg6apbfb']
[2024-02-07, 12:37:37 IST] {standard_task_runner.py:88} INFO - Job 920:
Subtask retrieve_users_from_mongodb
[2024-02-07, 12:37:37 IST] {task_command.py:423} INFO - Running on host
e8786b12ac04
[2024-02-07, 12:37:37 IST] {taskinstance.py:2515} INFO - Exporting env vars:
AIRFLOW_CTX_DAG_EMAIL='[email protected]' AIRFLOW_CTX_DAG_OWNER='airflow'
AIRFLOW_CTX_DAG_ID='mongo_dag'
AIRFLOW_CTX_TASK_ID='retrieve_users_from_mongodb'
AIRFLOW_CTX_EXECUTION_DATE='2024-02-07T07:07:35.163848+00:00'
AIRFLOW_CTX_TRY_NUMBER='1'
AIRFLOW_CTX_DAG_RUN_ID='manual__2024-02-07T07:07:35.163848+00:00'
[2024-02-07, 12:37:37 IST] {base.py:83} INFO - Using connection ID 'myconn'
for task execution.
[2024-02-07, 12:38:08 IST] {taskinstance.py:2737} ERROR - Task failed with
exception
Traceback (most recent call last):
File "/opt/airflow/airflow/models/taskinstance.py", line 446, in
_execute_task
result = _execute_callable(context=context, **execute_callable_kwargs)
File "/opt/airflow/airflow/models/taskinstance.py", line 416, in
_execute_callable
return execute_callable(context=context, **execute_callable_kwargs)
File "/opt/airflow/airflow/operators/python.py", line 199, in execute
return_value = self.execute_callable()
File "/opt/airflow/airflow/operators/python.py", line 216, in
execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/files/dags/mongo_test.py", line 28, in retrieve_users_from_mongodb
for user in users_data:
File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line
1264, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python3.8/site-packages/pymongo/cursor.py", line
1155, in _refresh
self.__session = self.__collection.database.client._ensure_session()
File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py",
line 1823, in _ensure_session
return self.__start_session(True, causal_consistency=False)
File "/usr/local/lib/python3.8/site-packages/pymongo/mongo_client.py",
line 1766, in __start_session
self._topology._check_implicit_session_support()
File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line
573, in _check_implicit_session_support
self._check_session_support()
File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line
589, in _check_session_support
self._select_servers_loop(
File "/usr/local/lib/python3.8/site-packages/pymongo/topology.py", line
259, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: 127.0.0.1:27017: [Errno 111]
Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms,
connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: ]>
[2024-02-07, 12:38:08 IST] {taskinstance.py:1152} INFO - Marking task as
FAILED. dag_id=mongo_dag, task_id=retrieve_users_from_mongodb,
execution_date=20240207T070735, start_date=20240207T070737,
end_date=20240207T070808
[2024-02-07, 12:38:08 IST] {standard_task_runner.py:107} ERROR - Failed to
execute job 920 for task retrieve_users_from_mongodb (127.0.0.1:27017: [Errno
111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms,
connectTimeoutMS: 20000.0ms), Timeout: 30s, Topology Description: ]>; 1243)
[2024-02-07, 12:38:08 IST] {local_task_job_runner.py:234} INFO - Task exited
with return code 1
[2024-02-07, 12:38:08 IST] {taskinstance.py:3318} INFO - 0 downstream tasks
scheduled from follow-on schedule check
```
Point to note is that when I use pymongo directly, it works
```
from pymongo import MongoClient
mongo_host = "127.0.0.1"
mongo_port = 27017
mongo_database = "test"
mongo_collection = "users"
client = MongoClient(mongo_host, mongo_port)
db = client[mongo_database]
collection = db[mongo_collection]
query_result = collection.find_one({"name": "John"})
print("Query result:", query_result)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]