[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Antoine "hashar" Musso  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[worked around] Zuul is     |Zuul is ultra slow post
                   |ultra slow post Jenkins     |Jenkins upgrade
                   |upgrade                     |

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l


[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-09 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #10 from Antoine "hashar" Musso  ---
I have pinged wikitech-l about the two crashes that happened today and are
related to this issue:
http://lists.wikimedia.org/pipermail/wikitech-l/2013-May/069261.html



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #11 from Antoine "hashar" Musso  ---
This happened again overnight. Will attach some debugging output.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #12 from Antoine "hashar" Musso  ---
Created attachment 12286
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=12286&action=edit
jstack -F -m -l <pid>, using the PID of the stalled thread



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #13 from Antoine "hashar" Musso  ---
Created attachment 12287
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=12287&action=edit
jmap using the PID of the stalled thread



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #14 from Antoine "hashar" Musso  ---
Created attachment 12288
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=12288&action=edit
Jenkins threaddump 1



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #15 from Antoine "hashar" Musso  ---
Created attachment 12289
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=12289&action=edit
Jenkins threaddump 2

Jenkins has an internal thread dumper, which is available via the web interface
at https://integration.wikimedia.org/ci/threadDump and whose output is also
written to the /var/log/jenkins/jenkins.log file. The stalled thread is still
24524.

The two attached thread dumps were taken 10 minutes apart.
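
JVM thread dumps identify native threads by a hexadecimal nid, while tools
such as top -H or ps -L report thread ids in decimal. A minimal sketch of the
conversion follows; nid_for is a hypothetical helper name, not something used
on the server.

```shell
# Convert a decimal native thread id into the "nid=0x..." notation
# that appears in JVM thread dumps.
# nid_for is a hypothetical helper name chosen for this sketch.
nid_for() {
    printf 'nid=0x%x\n' "$1"
}

nid_for 24524   # prints nid=0x5fcc
```

The result can then be searched for in a saved dump, for example with
grep "$(nid_for 24524)" on whatever file the dump was saved to.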



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #16 from Antoine "hashar" Musso  ---
The stalled thread has the id 24524 (0x5fcc in hexadecimal). It shows up in
both thread dumps as:

"VM Thread" prio=10 tid=0x7ff398079000 nid=0x5fcc runnable



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #17 from Antoine "hashar" Musso  ---
Last week Timo had the Java Melody plugin installed on Jenkins, which lets us
monitor the Java process. The entry page is
https://integration.wikimedia.org/ci/monitoring

We can get the heap histogram via
https://integration.wikimedia.org/ci/monitoring?part=heaphisto



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #18 from Antoine "hashar" Musso  ---
After a bit more investigation: our build.xml files contain references to
org.jvnet.hudson.plugins.DownstreamBuildViewAction, which comes from a plugin
that displayed the dependencies between builds. When upgrading Jenkins I
removed that plugin, but the build files still contain references to it, such
as:



  


When Jenkins parses the build history, it detects that this data is no longer
used and logs a message, which is stored in memory. The data can be discarded
manually via the 'Manage old data' page:
https://integration.wikimedia.org/ci/administrativeMonitor/OldData/manage

With thousands of builds in the history, the old data store ends up holding too
many items for the Java heap size, and that most probably causes the memory
exhaustion.


The slow way to solve that is to ask Jenkins to parse the history files for a
job, then manually discard all the data. The faster but harder way would be to
go through all the build.xml files and remove the XML snippet.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #19 from Antoine "hashar" Musso  ---
And the super long filtering command:


find . -name build.xml \
 -exec fgrep -q 'DownstreamBuild' {} \; \
 -exec sed -i-back \
 '/   
/d'
{} \;
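
A range delete in sed is one way to drop a multi-line XML element. The sketch
below is not the command that was run. It assumes the element is named after
the plugin class and that its opening and closing tags each sit on their own
line; strip_element is a hypothetical helper name.

```shell
# Delete every line from the opening tag to the closing tag of one element.
# ASSUMPTIONS: the element is named after the plugin class, it has no
# attributes, and its opening and closing tags are on separate lines.
# Files that do not match this shape are passed through unchanged.
strip_element() {
    tag='org.jvnet.hudson.plugins.DownstreamBuildViewAction'
    # Escape the dots so that sed matches them literally.
    pat=$(printf '%s' "$tag" | sed 's/\./\\./g')
    # Reads the named files, or standard input when none are given,
    # and writes the filtered text to standard output.
    sed "/<$pat>/,/<\/$pat>/d" "$@"
}
```

Running it on a copy of a single build.xml and comparing the output with the
original is a cheap check before rewriting anything in place.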



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #20 from Antoine "hashar" Musso  ---
Running the above command on mediawiki-core-lint.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #21 from Antoine "hashar" Musso  ---
Launched for all jobs using find . -wholename '*/builds/*/build.xml'

That is running in a tmux window on gallium.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #22 from Antoine "hashar" Musso  ---
All build.xml files now have the
org.jvnet.hudson.plugins.DownstreamBuildViewAction snippet removed.
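
One way to double-check such a cleanup is to count the build.xml files that
still mention the plugin class. This is a sketch only; count_refs is a
hypothetical helper name and the directory to scan is passed as an argument.

```shell
# Count the build.xml files under a directory that still mention the
# removed plugin class. count_refs is a hypothetical helper name.
count_refs() {
    # grep -l prints one matching file name per line; -F matches the
    # class name as a fixed string rather than a regular expression.
    find "$1" -name build.xml \
        -exec grep -lF 'DownstreamBuildViewAction' {} + | wc -l
}
```

A result of 0 for the jobs directory would mean no build file references the
plugin any more.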



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-10 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #23 from Antoine "hashar" Musso  ---
Sent a summary on wikitech-l
http://lists.wikimedia.org/pipermail/wikitech-l/2013-May/069303.html

I will monitor this bug through Monday, then probably close it.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-13 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Antoine "hashar" Musso  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #24 from Antoine "hashar" Musso  ---
That issue is solved for now. The build.xml rewrite fixed it.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Antoine "hashar" Musso  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|Unprioritized               |Immediate
             Status|NEW                         |ASSIGNED
           Assignee|wikibugs-l@lists.wikimedia. |has...@free.fr
                   |org                         |

--- Comment #1 from Antoine "hashar" Musso  ---
yeah for working over night :-]



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #2 from Antoine "hashar" Musso  ---
The workaround did get rid of the calls to the slow API.

Looking at the Apache logs in /var/log/apache2/integration_access.log, I
noticed it took up to 8 seconds to update a job description. Zuul does that
very often, and it prevents Zuul from doing anything else in the meantime.

Jenkins had at least one thread at 100% CPU and the main process (java) was
reporting 100% CPU too. I have restarted Jenkins.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-05-02 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #3 from Antoine "hashar" Musso  ---
Tips to investigate:

tail -f /var/log/apache2/integration_access.log | grep Python

That will show the Zuul requests to Jenkins. The logged time is when the query
started, so any timestamp that goes backward indicates a query that took a long
time. On completion, the query returns a 302, which in turn triggers a GET
against Jenkins.

Example:

 02/May/2013:23:17:53
 "POST /ci/job/parsoid-parsertests-run-harder/65/submitDescription 302"
 02/May/2013:23:17:53
 "GET /ci/job/parsoid-parsertests-run-harder/65/ 200"

That one has been fast.
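
Spotting backward timestamps by eye gets tedious, so the check can be sketched
with awk. This is only an illustration: it assumes the Apache "combined" log
layout, where the timestamp is the fourth whitespace-separated field, and it
assumes all lines fall on the same day. flag_backward is a hypothetical helper
name.

```shell
# Print the log lines whose start time is earlier than the latest start
# time seen so far, along with how many seconds earlier they started.
# ASSUMPTIONS: Apache "combined" layout, so field 4 looks like
# [02/May/2013:23:17:53, and every line is from the same day.
flag_backward() {
    awk '{
        split($4, t, ":")                       # t[2]=HH, t[3]=MM, t[4]=SS
        now = t[2] * 3600 + t[3] * 60 + t[4]    # seconds since midnight
        if (now < max) {
            printf "%ds behind: %s\n", max - now, $0
        } else {
            max = now
        }
    }'
}
```

It could be fed with something like
grep Python /var/log/apache2/integration_access.log | flag_backward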


Before triggering a job, Zuul attempts to look up whether the job exists. It
does that through the python-jenkins module, which hits an API call that is
very slow on our setup. I have overridden the existence check with Gerrit
Change #62095.


For now it seems Jenkins is more or less working after the last restart.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-12-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #29 from Gerrit Notification Bot  ---
Change 102465 merged by Hashar:
WMF: less build description updates

https://gerrit.wikimedia.org/r/102465



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-12-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Gerrit Notification Bot  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |PATCH_TO_REVIEW
         Resolution|FIXED                       |---



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-12-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #28 from Gerrit Notification Bot  ---
Change 102465 had a related patch set uploaded by Hashar:
WMF: less build description updates

https://gerrit.wikimedia.org/r/102465



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-12-18 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Antoine "hashar" Musso  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|PATCH_TO_REVIEW             |RESOLVED
         Resolution|---                         |FIXED

--- Comment #30 from Antoine "hashar" Musso  ---
Gerrit change 102465 has been deployed in production and appears to have solved
the issue for good.

Summary sent on wikitech-l
http://lists.wikimedia.org/pipermail/wikitech-l/2013-December/073703.html



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-09-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Gerrit Notification Bot  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |PATCH_TO_REVIEW
         Resolution|FIXED                       |---



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-09-30 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #25 from Gerrit Notification Bot  ---
Change 86812 had a related patch set uploaded by Matanya:
Removed pstack package since bug 48025 is resolved.

https://gerrit.wikimedia.org/r/86812



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-10-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

Antoine "hashar" Musso  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|PATCH_TO_REVIEW             |RESOLVED
         Resolution|---                         |FIXED

--- Comment #26 from Antoine "hashar" Musso  ---
Unrelated change.



[Bug 48025] Zuul is ultra slow post Jenkins upgrade

2013-10-01 Thread bugzilla-daemon
https://bugzilla.wikimedia.org/show_bug.cgi?id=48025

--- Comment #27 from Gerrit Notification Bot  ---
Change 86812 merged by Dzahn:
Removed pstack package since bug 48025 is resolved.

https://gerrit.wikimedia.org/r/86812
