That actually sounds normal if work is being dispatched slowly into Pulp. If you expect two workers and the /status/ API shows two workers, then it should be healthy. I also wrote a reply on the YouTube question about this: https://www.youtube.com/watch?v=PpinNWOpksA&lc=UgyHs_RFkeLbU6L9HeR4AaABAg.8_qLVyV5tza8_qMzDLvKrK
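In case it's useful, here is a minimal sketch of checking that view programmatically rather than eyeballing pulp-admin output. The /pulp/api/v2/status/ endpoint and its known_workers field are per the 2.x REST API docs; the hostname is a placeholder, and verify=False is only there for a self-signed certificate:

    # check_workers.py - poll Pulp 2's status API and list known workers
    import requests

    PULP = "https://pulp.example.com"  # placeholder hostname

    resp = requests.get(PULP + "/pulp/api/v2/status/", verify=False)
    resp.raise_for_status()
    status = resp.json()

    workers = [w["_id"] for w in status.get("known_workers", [])]
    print("known workers:", workers)

    # for this deployment we expect two reserved_resource workers
    reserved = [w for w in workers if w.startswith("reserved_resource_worker")]
    if len(reserved) < 2:
        print("WARNING: expected 2 workers, found %d" % len(reserved))

If both workers show up there and one queue is simply idle, that just means the resource manager hasn't had a reason to route work to that worker yet.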
On Wed, Dec 6, 2017 at 2:31 PM, Deej Howard <[email protected]> wrote:

> I used the qpid-stat -q utility on my installation, and I saw something
> that confused me. I would have expected the resource_manager queue to have
> more message traffic than my workers, but this is not the case; in fact,
> one of my two workers seems to have no message traffic at all. I suspect
> this indicates some sort of misconfiguration somewhere. Does that sound
> correct?
>
> [root@7d53bac13e28 /]# qpid-stat -q
> Queues
>   queue                                                  dur  autoDel  excl  msg  msgIn  msgOut  bytes  bytesIn  bytesOut  cons  bind
>   ====================================================================================================================================
>   …extra output omitted for brevity…
>   celery                                                 Y                   0    206    206     0      171k     171k      2     2
>   celeryev.911e1280-9618-40bb-a54f-813db11d4d3e          Y                   0    96.9k  96.9k   0      78.3m    78.3m     1     2
>   pulp.task                                              Y                   0    0      0       0      0        0         3     1
>   [email protected]   Y                   0    0      0       0      0        0         1     2
>   [email protected]               Y    Y               0    0      0       0      0        0         1     2
>   [email protected]   Y                   0    0      0       0      0        0         1     2
>   [email protected]               Y    Y               0    1.07k  1.07k   0      1.21m    1.21m     1     2
>   resource_manager                                       Y                   0    533    533     0      820k     820k      1     2
>   resource_manager@resource_manager.celery.pidbox        Y                   0    0      0       0      0        0         1     2
>   resource_manager@resource_manager.dq                   Y    Y               0    0      0       0      0        0         1     2
>
> The pulp-admin status output definitely shows both workers and the
> resource_manager as being “discovered”, so what gives?
>
> *From:* Deej Howard [mailto:[email protected]]
> *Sent:* Tuesday, December 05, 2017 6:42 PM
> *To:* 'Dennis Kliban' <[email protected]>
> *Cc:* 'pulp-list' <[email protected]>
> *Subject:* RE: [Pulp-list] Need help/advice with import tasks intermittently causing a time-out condition
>
> That video was very useful, Dennis – thanx for passing it on!
>
> It sounds like the solution to the problem I’m seeing lies with the
> client-side operations, based on the repo reservation methodology that is
> in place. It would really be useful if there were some sort of API call the
> client code could make to decide whether the operation is just hung due to
> network issues (and abort or otherwise handle that state), or whether there
> is an active repo reservation in place that must clear before the operation
> can proceed. I can also appreciate that this has at least the potential to
> change dynamically from the viewpoint of a client’s operations (because the
> repo reservation can be put on/taken off for other tasks that are already
> in the queue), and it would be good for the client to be able to determine
> whether its task is progressing (or not) as far as getting
> assigned/executed. Sounds like I need to dig deeper into what I can
> accomplish with the REST API to get a better idea of the exact status of
> the import operation, and base decisions on that status rather than just
> “30 attempts every 2 seconds”.
>
> If nothing else, I now have a better understanding and some additional
> troubleshooting tools to track down exactly what is (and is not) going on!
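For what it's worth, the task report already gets you most of the way to the hung-vs-waiting distinction you describe. A rough sketch against GET /pulp/api/v2/tasks/<task_id>/ (state and worker_name fields per the 2.x task report; hostname and credentials are placeholders):

    # describe_task.py - one-shot check of where an import task stands
    import requests

    def describe_task(base_url, task_id, auth=("admin", "admin")):
        r = requests.get("%s/pulp/api/v2/tasks/%s/" % (base_url, task_id),
                         auth=auth, verify=False)
        r.raise_for_status()
        report = r.json()
        state = report["state"]
        if state == "waiting" and not report.get("worker_name"):
            # still queued, most likely behind the repo reservation --
            # not hung, so keep waiting instead of aborting
            return "queued (reservation not yet granted)"
        if state == "running":
            return "running on %s" % report["worker_name"]
        return state  # finished / error / canceled

    print(describe_task("https://pulp.example.com", "<task-uuid>"))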
> *From:* Dennis Kliban [mailto:[email protected]]
> *Sent:* Tuesday, December 05, 2017 1:07 PM
> *To:* Deej Howard <[email protected]>
> *Cc:* pulp-list <[email protected]>
> *Subject:* Re: [Pulp-list] Need help/advice with import tasks intermittently causing a time-out condition
>
> The tasking system in Pulp locks a repository during an import of a
> content unit. If clients are uploading content to the same repository, the
> import operation has to wait for any previous imports to the same repo to
> complete. It's possible that you are not waiting long enough. Unfortunately
> this portion of Pulp is not well documented; however, there is a 40-minute
> video[0] on YouTube that provides insight into how the tasking system works
> and how to troubleshoot it.
>
> [0] https://youtu.be/PpinNWOpksA
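To make that reservation queue visible, you can also list every task that touches a given repository; tasks carry a pulp:repository:<repo_id> tag that the tasks collection can filter on (endpoint and tag format assumed from the 2.x REST docs; hostname, repo id, and credentials are placeholders):

    # repo_tasks.py - list tasks holding or waiting on a repo's reservation
    import requests

    def tasks_for_repo(base_url, repo_id, auth=("admin", "admin")):
        r = requests.get("%s/pulp/api/v2/tasks/" % base_url,
                         params={"tag": "pulp:repository:%s" % repo_id},
                         auth=auth, verify=False)
        r.raise_for_status()
        return [(t["task_id"], t["state"]) for t in r.json()]

    for task_id, state in tasks_for_repo("https://pulp.example.com", "my-repo"):
        print(task_id, state)

A long tail of tasks in the waiting state here is exactly the serialized-imports situation described above.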
> On Tue, Dec 5, 2017 at 12:43 PM, Deej Howard <[email protected]> wrote:
>
> Hi, I’m hoping someone can help me solve a strange problem I’m having with
> my Pulp installation, or at least give me a good idea of where I should
> look further to get it solved. The most irritating aspect of the problem is
> that it doesn’t reliably reproduce.
>
> The failure condition is realized when a client is adding a new artifact.
> In all cases, the client is able to successfully “upload” the artifact to
> Pulp (successful according to the response from the Pulp server). The
> problem comes in at the next step, where the client directs Pulp to
> “import” the uploaded artifact and then awaits a successful task result
> before proceeding. This is set up within a loop: up to 30 queries for a
> successful response to the import task are made, with a 2-second interval
> between queries. If the import doesn’t succeed within those constraints,
> the operation is treated as having timed out, and further actions with that
> artifact (specifically, a publish operation) are abandoned. Many times that
> algorithm works with no problem at all, but far too often that successful
> response is not received within the 30 iterations. It surprises me that
> there would be a failure at this point, actually – I wouldn’t expect an
> “import” operation to be very complicated or take a lot of time (but I’m
> certainly not intimate with the details of the Pulp implementation either).
> Is it just a case that my expectations of the “import” operation are
> unreasonable, and I should relax the loop parameters to allow more
> attempts/more time between attempts for this to succeed? As I’ve mentioned,
> this doesn’t always fail – I’d even go so far as to claim that it succeeds
> “most of the time” – but I need more consistency than that for this to be
> deemed production-worthy.
>
> I’ve tried monitoring operations using pulp-admin to make sure that tasks
> are being managed properly (they seem to be, but I’m not yet any sort of
> Pulp expert), and I’ve also monitored the Apache mod_status output to see
> if there is anything obvious (there’s not, but I’m no Apache expert
> either). I’ve also found nothing obvious in any Pulp log output. I’d be
> deeply grateful if anyone can offer any sort of wisdom, help, or advice on
> this issue; I’m at the point where I’m not sure where to look next to get
> this resolved. I’d seriously hate to have to abandon Pulp because I can’t
> get it to perform consistently and reliably (not only because of the amount
> of work that would represent, but because I like working with Pulp and
> appreciate what it has to offer).
>
> I have managed to put together a test case that seems to reliably
> demonstrate the problem – sort of. This test case uses 16 clients running
> in parallel, each of which has from 1-10 artifacts to upload (most clients
> have only 5). When I say that it “sort of” demonstrates the problem: the
> most recent run failed on 5 of those clients (all with the condition
> mentioned above), while the previous run failed on 8, and the one before
> that on 9, with no consistency as to which client fails to upload which
> artifact.
>
> Other observations:
>
> - Failure conditions don’t seem to have anything to do with the client’s
>   platform or geographical location, nor are they attached to a specific
>   client.
> - One failure on a client doesn’t imply that the next attempt from that
>   same client will also fail; in fact, more often than not it doesn’t.
> - Failure conditions don’t seem to have anything to do with the artifact
>   being uploaded.
> - There is no consistency around which artifact fails to upload (it’s not
>   always the first artifact from a client, or the third, etc.).
>
> Environment details:
>
> - Pulp 2.14.3 using Docker containers based on CentOS 7: one Apache/Pulp
>   API container, one Qpid message broker container, one MongoDB container,
>   one Celery worker management container, one resource manager/task
>   assignment container, and two Pulp worker containers. All containers are
>   running within a single Docker host dedicated to Pulp-related operations.
>   The diagram at http://docs.pulpproject.org/en/2.14/user-guide/scaling.html
>   was used as a guide for this setup.
> - Ubuntu/Mac/Windows-based clients are using a Java application plugin to
>   do artifact uploads. Clients are dispersed across multiple geographical
>   sites, including the same site where the Pulp server resides.
> - Artifacts are company-proprietary (configured as a Pulp plugin), but are
>   essentially a single ZIP file with attached metadata for tracking and
>   management purposes.
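One more thought on the “30 attempts every 2 seconds” loop described above: rather than counting attempts from the moment of dispatch, it may help to start the timeout clock only once the task leaves the waiting state, so time spent queued behind the repo reservation isn't charged against the import itself. A minimal sketch, assuming the 2.x task report's state field (hostname and credentials are placeholders):

    # wait_for_import.py - state-aware wait instead of a fixed retry count
    import time
    import requests

    FINAL_STATES = {"finished", "error", "canceled"}

    def wait_for_import(base_url, task_id, run_timeout=60, poll=2,
                        auth=("admin", "admin")):
        running_since = None
        while True:
            r = requests.get("%s/pulp/api/v2/tasks/%s/" % (base_url, task_id),
                             auth=auth, verify=False)
            r.raise_for_status()
            state = r.json()["state"]
            if state in FINAL_STATES:
                return state
            if state == "running":
                running_since = running_since or time.time()
                if time.time() - running_since > run_timeout:
                    raise RuntimeError("import ran > %ss" % run_timeout)
            # "waiting" means queued behind the reservation: keep polling
            # without counting it toward the timeout
            time.sleep(poll)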
_______________________________________________
Pulp-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-list
