All users reporting issues with stuck tasks have been resolved. There is no evidence of a Pulp defect that this time. Given that, I am closing the two bugs used to track the "stuck tasks" issues[0][1].
If you experience stuck tasks please do e-mail the list or open a new bug, but be sure that you're running the latest python-kombu rpm as discussed here[2]. [0]: https://pulp.plan.io/issues/1363 [1]: https://pulp.plan.io/issues/1838 [2]: https://www.redhat.com/archives/pulp-list/2016-April/msg00020.html -Brian On 04/15/2016 03:26 PM, Brian Bouterse wrote: > I'm very interested in determining if there is a legitimate issue with > the Pulp tasking system or not. I need more information from the users > observing these issues. > > If you observe a task that is stuck at Waiting please ping 'bmbouter' in > #pulp and I will syncrhonously ask you for the info needed to get to the > bottom of this. > > FWIW, we are tracking this issue here[0], but the current reports look > more like environmental issues and not legitimate Pulp bugs. > > [0]: https://pulp.plan.io/issues/1838 > > -Brian > > On 04/12/2016 04:14 PM, Paul Jochum wrote: >> Slight correction: (in the middle of my email): >> >> I did grep /var/log/httpd/ssl_access_log *for my ta**sk**-id >> (**f3a4a828-1a98-4d30-989f-a8f62acc29f9)* in /var/log/* and >> /var/log/*/*, but it only found it in /var/log/httpd/ssl_access_log and >> /var/log/httpd/ssl_request_log (lines like: >> /var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17 >> -0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ >> HTTP/1.1" 200 579 >> >> regards, >> Paul >> On 04/12/2016 03:09 PM, EXT Paul Jochum wrote: >>> Hi Brian: >>> >>> I am running into the same problem of "waiting", so here are my >>> outputs (let me know if you need anything else): >>> >>> -sh-4.1# pulp-admin -vv tasks details --task-id >>> f3a4a828-1a98-4d30-989f-a8f62acc29f9 >>> +----------------------------------------------------------------------+ >>> Task Details >>> +----------------------------------------------------------------------+ >>> >>> 2016-04-12 14:59:46,539 - DEBUG - sending GET request to >>> /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ >>> 2016-04-12 14:59:46,602 - INFO - GET request to >>> /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ with >>> parameters None >>> 2016-04-12 14:59:46,602 - INFO - Response status : 200 >>> >>> 2016-04-12 14:59:46,602 - INFO - Response body : >>> { >>> "exception": null, >>> "task_type": "pulp.server.managers.repo.sync.sync", >>> "_href": "/pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/", >>> "task_id": "f3a4a828-1a98-4d30-989f-a8f62acc29f9", >>> "tags": [ >>> "pulp:repository:rhel-i386-server-optional-6.5.z-debuginfo", >>> "pulp:action:sync" >>> ], >>> "finish_time": null, >>> "_ns": "task_status", >>> "start_time": null, >>> "traceback": null, >>> "spawned_tasks": [], >>> "progress_report": {}, >>> "queue": "None.dq", >>> "state": "waiting", >>> "worker_name": null, >>> "result": null, >>> "error": null, >>> "_id": { >>> "$oid": "570d1727e75b4adec6bd0d58" >>> }, >>> "id": "570d1727e75b4adec6bd0d58" >>> } >>> >>> Operations: sync >>> Resources: rhel-i386-server-optional-6.5.z-debuginfo (repository) >>> State: Waiting >>> Start Time: Unstarted >>> Finish Time: Incomplete >>> Result: Incomplete >>> Task Id: f3a4a828-1a98-4d30-989f-a8f62acc29f9 >>> Progress Report: >>> >>> >>> -sh-4.1# ps -awfux | grep celery >>> Warning: bad syntax, perhaps a bogus '-'? See >>> /usr/share/doc/procps-3.2.8/FAQ >>> root 16147 0.0 0.0 6444 704 pts/2 S+ 15:01 0:00 \_ grep >>> celery >>> apache 977 0.1 0.1 561608 25908 ? Sl Apr09 6:15 >>> /usr/bin/python /usr/bin/celery beat >>> --app=pulp.server.async.celery_instance.celery >>> --scheduler=pulp.server.async.scheduler.Scheduler >>> --workdir=/var/run/pulp/ -f /var/log/pulp/celerybeat.log -l INFO >>> --detach --pidfile=/var/run/pulp/celerybeat.pid >>> apache 9558 0.0 0.4 553760 58972 ? Sl 10:40 0:05 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n resource_mana...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO -Q resource_manager >>> --logfile=/var/log/pulp/resource_manager.log >>> --pidfile=/var/run/pulp/resource_manager.pid >>> apache 9602 0.4 0.3 472928 50512 ? Sl 10:40 1:16 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n resource_mana...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO -Q resource_manager >>> --logfile=/var/log/pulp/resource_manager.log >>> --pidfile=/var/run/pulp/resource_manager.pid >>> apache 9792 0.0 0.4 553976 58924 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-0.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-0.pid >>> apache 9865 0.0 0.3 472916 49692 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-0.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-0.pid >>> apache 9807 0.0 0.4 553976 58920 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-1.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-1.pid >>> apache 9887 0.0 0.3 472916 49688 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-1.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-1.pid >>> apache 9826 0.0 0.4 553564 58896 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-2.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-2.pid >>> apache 9893 0.0 0.3 472924 49684 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-2.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-2.pid >>> apache 9847 0.0 0.4 553976 58912 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-3.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-3.pid >>> apache 9917 0.0 0.3 472916 49692 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-3.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-3.pid >>> apache 9871 0.0 0.4 553752 58896 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-4.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-4.pid >>> apache 9945 0.0 0.3 472920 49684 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-4.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-4.pid >>> apache 9896 0.0 0.4 553988 58932 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-5.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-5.pid >>> apache 9982 0.0 0.3 472924 49700 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-5.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-5.pid >>> apache 9918 0.0 0.4 553972 58916 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-6.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-6.pid >>> apache 9990 0.0 0.3 472912 49680 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-6.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-6.pid >>> apache 9940 0.0 0.4 553756 58892 ? Sl 10:41 0:06 >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-7.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-7.pid >>> apache 10013 0.0 0.3 472920 49684 ? S 10:41 0:00 \_ >>> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 >>> -n reserved_resource_worke...@lss-pulp01.ih.lucent.com --events >>> --app=pulp.server.async.app --loglevel=INFO >>> --logfile=/var/log/pulp/reserved_resource_worker-7.log >>> --pidfile=/var/run/pulp/reserved_resource_worker-7.pid >>> -sh-4.1# >>> >>> >>> I did grep /var/log/httpd/ssl_access_log in /var/log/* and >>> /var/log/*/*, but it only found it in /var/log/httpd/ssl_access_log >>> and /var/log/httpd/ssl_request_log (lines like: >>> /var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17 >>> -0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ >>> HTTP/1.1" 200 579 >>> >>> >>> -sh-4.1# rpm -q -a | grep -i pulp | sort >>> mod_wsgi-3.4-2.pulp.el6.x86_64 >>> pulp-admin-client-2.8.1-1.el6.noarch >>> pulp-puppet-admin-extensions-2.8.1-1.el6.noarch >>> pulp-puppet-plugins-2.8.1-1.el6.noarch >>> pulp-rpm-admin-extensions-2.8.1-1.el6.noarch >>> pulp-rpm-plugins-2.8.1-1.el6.noarch >>> pulp-selinux-2.8.1-1.el6.noarch >>> pulp-server-2.8.1-1.el6.noarch >>> python-isodate-0.5.0-4.pulp.el6.noarch >>> python-kombu-3.0.33-4.pulp.el6.noarch >>> python-pulp-bindings-2.8.1-1.el6.noarch >>> python-pulp-client-lib-2.8.1-1.el6.noarch >>> python-pulp-common-2.8.1-1.el6.noarch >>> python-pulp-oid_validation-2.8.1-1.el6.noarch >>> python-pulp-puppet-common-2.8.1-1.el6.noarch >>> python-pulp-repoauth-2.8.1-1.el6.noarch >>> python-pulp-rpm-common-2.8.1-1.el6.noarch >>> >>> >>> -sh-4.1# uname -a >>> Linux lss-pulp01.ih.lucent.com 2.6.32-042stab113.21 #1 SMP Wed Mar 23 >>> 11:05:25 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux >>> >>> >>> -sh-4.1# more /etc/*release* >>> :::::::::::::: >>> /etc/redhat-release >>> :::::::::::::: >>> Scientific Linux release 6.7 (Carbon) >>> :::::::::::::: >>> /etc/system-release >>> :::::::::::::: >>> Scientific Linux release 6.7 (Carbon) >>> :::::::::::::: >>> /etc/system-release-cpe >>> :::::::::::::: >>> cpe:/o:redhat:enterprise_linux:6.7:ga >>> -sh-4.1# >>> >>> On 04/11/2016 03:32 PM, Brian Bouterse wrote: >>>> To anyone who has a stuck task running Pulp 2.6 or newer, can you open a >>>> bug or e-mail the list with the following. The full output from: >>>> >>>> pulp-admin -vv tasks details --task-id <the_stuck_task_UUID> >>>> >>>> Post the full output, but observe the value of "worker_name" similar to >>>> what was suggested here[0]. >>>> >>>> Can you also include the `sudo ps -awfux | grep celery` output to show >>>> your running process? >>>> >>>> Also can you grep the logs for that UUID and include any lines that you >>>> see as output too. >>>> >>>> [0]: https://www.redhat.com/archives/pulp-list/2016-April/msg00026.html >>>> >>>> Thanks, >>>> Brian >>>> >>>> >>>> On 04/11/2016 01:32 PM, Mihai Ibanescu wrote: >>>>> I have a similar problem on 2.7.1 with rabbitmq. >>>>> >>>>> Occasionally a task will get "lost": >>>>> >>>>> pulp-admin tasks details --task-id 622041ac-e9e4-4a15-bd7c-7c98a17782e0 >>>>> +----------------------------------------------------------------------+ >>>>> >>>>> Task Details >>>>> +----------------------------------------------------------------------+ >>>>> >>>>> >>>>> Operations: publish >>>>> Resources: thirdparty-snapshot-rpm-latest (repository) >>>>> State: Waiting >>>>> Start Time: Unstarted >>>>> Finish Time: Incomplete >>>>> Result: Incomplete >>>>> Task Id: 622041ac-e9e4-4a15-bd7c-7c98a17782e0 >>>>> Progress Report: >>>>> >>>>> >>>>> I have seen more than one getting in this state. >>>>> >>>>> I can cancel the task, but I'd like to find out if I can tickle >>>>> something to re-process the queue. Other tasks do get correctly >>>>> processed. >>>>> >>>>> If there's something you would like me to do in order to debug this >>>>> further, please let me know. >>>>> >>>>> Thank you! >>>>> Mihai >>>>> >>>>> On Fri, Apr 8, 2016 at 9:27 AM, Brian Bouterse <bbout...@redhat.com >>>>> <mailto:bbout...@redhat.com>> wrote: >>>>> >>>>> There is a known issue [0] introduced in 2.8.0 which will cause >>>>> your >>>>> tasks to become stuck at waiting. This occurs after Pulp runs >>>>> normally >>>>> for hours or days. >>>>> >>>>> A fix is completed but won't be released until Pulp 2.8.3 is >>>>> released. >>>>> The good news is, you can resolve that issue now by upgrading >>>>> python-kombu on your existing 2.8.0, 2.8.1, or 2.8.2 system. >>>>> Manually >>>>> install python-kombu-3.0.33-5.pulp for your distro on your system, >>>>> restart Pulp services and you'll be all set. All of the links >>>>> below come >>>>> out of the Pulp build system[1] >>>>> >>>>> EL6: http://tinyurl.com/j5sy5zh >>>>> EL7: http://tinyurl.com/jppwcdv >>>>> F22: http://tinyurl.com/hp7z4sq >>>>> F23: http://tinyurl.com/zplzua3 >>>>> F24: update is in -testing repo [2] >>>>> rawhide: already available in stable >>>>> >>>>> These are the same versions of python-kombu that you'll receive >>>>> when >>>>> 2.8.3 comes out so upgrading this package will keep your >>>>> packages on the >>>>> normal upgrade path. These newer python-kombu versions are >>>>> compatible >>>>> with 2.8.0, 2.8.1, and 2.8.2 versions of Pulp. >>>>> >>>>> [0]: https://pulp.plan.io/issues/1801 >>>>> [1]: http://koji.katello.org/koji/packageinfo?packageID=665 >>>>> [2]: >>>>> https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec038bbf19 >>>>> >>>>> -Brian >>>>> >>>>> _______________________________________________ >>>>> Pulp-list mailing list >>>>> Pulp-list@redhat.com <mailto:Pulp-list@redhat.com> >>>>> https://www.redhat.com/mailman/listinfo/pulp-list >>>>> >>>>> >>>> _______________________________________________ >>>> Pulp-list mailing list >>>> Pulp-list@redhat.com >>>> https://www.redhat.com/mailman/listinfo/pulp-list >>> >>> _______________________________________________ >>> Pulp-list mailing list >>> Pulp-list@redhat.com >>> https://www.redhat.com/mailman/listinfo/pulp-list >> > > _______________________________________________ > Pulp-list mailing list > Pulp-list@redhat.com > https://www.redhat.com/mailman/listinfo/pulp-list > _______________________________________________ Pulp-list mailing list Pulp-list@redhat.com https://www.redhat.com/mailman/listinfo/pulp-list