[JIRA] [core] (JENKINS-1948) Intermittent slave disconnections with secondary symptoms
Jesse Glick assigned an issue to Oleg Nenashev
Jenkins / JENKINS-1948 Intermittent slave disconnections with secondary symptoms
Change By: Jesse Glick
Assignee: Oleg Nenashev
This message was sent by Atlassian JIRA (v6.4.2#64017-sha1:e244265)
--
You received this message because you are subscribed to the Google Groups "Jenkins Issues" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-issues+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Daniel Beck updated JENKINS-1948 Intermittent slave disconnections with secondary symptoms
Change By: Daniel Beck (15/Sep/14 1:08 AM)
Labels: remoting robustness slave
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
Fatih Degirmenci commented on JENKINS-1948 Intermittent slave disconnections with secondary symptoms
It would be good if the error message printed to the console gave a bit more information about what the "real" problem is, rather than just saying it was unable to delete the script file. What underlying problem prevents Jenkins from deleting the file? Is it a connectivity issue, or something else?
Fatih Degirmenci edited a comment on JENKINS-1948 Intermittent slave disconnections with secondary symptoms
There are different ways of working around this issue; in our case, rebooting the slaves helps. Other people have solved it by changing SSH settings on the slaves, changing the Java version used for connecting to the slaves, etc. (as explained in JENKINS-12235 and other tickets on this issue). So it is crucial for everyone that this issue is solved. Of course, there are some things that Jenkins cannot solve, such as issues with the slaves themselves. If that is the case, it would be good if Jenkins told us a bit more. The error message printed to the console could tell us what the "real" problem is, rather than just saying it was unable to delete the script file. The current message misleads us, and we start troubleshooting unrelated things rather than just checking the slave and rebooting it if necessary. So an indication of the underlying problem that prevents Jenkins from deleting the file (a connectivity issue, for example) could improve things.
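To illustrate the kind of information that is already present when such a failure happens, here is a minimal Python sketch that pulls the innermost "Caused by:" entry out of a Java stack trace. The helper names cause_chain and root_cause are hypothetical and not part of Jenkins, and the sample trace is abridged from the one reported elsewhere in this issue. It only sketches the idea; it is not how Jenkins builds its console messages.

```python
import re

# Matches lines such as "Caused by: java.io.IOException: some message".
# The message part is optional (e.g. "Caused by: java.io.EOFException").
CAUSED_BY = re.compile(r"Caused by: ([\w.$]+)(?:: (.*))?")

# Abridged sample, taken from the trace reported in this issue.
SAMPLE_TRACE = """\
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Request.call(Request.java:174)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.Request.abort(Request.java:299)
Caused by: java.io.IOException: Unexpected termination of the channel
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2595)
"""


def cause_chain(trace):
    """Return (exception class, message) pairs for each 'Caused by:' line, outermost first."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            chain.append((match.group(1), match.group(2) or ""))
    return chain


def root_cause(trace):
    """Return the innermost (class, message) pair, or None if the trace has no 'Caused by:' lines."""
    chain = cause_chain(trace)
    return chain[-1] if chain else None


if __name__ == "__main__":
    print(root_cause(SAMPLE_TRACE))
```

For the sample trace, root_cause returns the java.io.EOFException entry. That suggests the channel's input stream simply ended, so the underlying problem may be on the connection or slave side rather than with the script file itself.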
Guy Rozendorn commented on JENKINS-1948 Intermittent slave disconnections with secondary symptoms
This week we changed all of our roughly 80 slaves from the SSHLauncher to the CommandLauncher, which launches strace -t -s 4096 ssh ..., with the following lines in .ssh/config:
    TCPKeepAlive yes
    ServerAliveInterval 10
    ServerAliveCountMax 10
    LogLevel DEBUG
The reason for using strace is to get a clue as to whether the connection is dropped first or the master decides it is dead.
In one of the job executions (which started at 00:00:56) we get this in the log:
    Started by timer
    Building remotely on host-ci66 in workspace /root/jenkins/workspace/mainline-bdist-develop
    Deleting project workspace...
    Checkout:mainline-bdist-develop / /root/jenkins/workspace/mainline-bdist-develop - hudson.remoting.Channel@27ecbe67:host-ci66
    Using strategy: Default
    Last Built Revision: Revision ff29df8b003dde47573ddfb0b463351baee6dea3 (origin/develop)
    FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
    hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.Request.call(Request.java:174)
        at hudson.remoting.Channel.call(Channel.java:722)
        at hudson.FilePath.act(FilePath.java:894)
        at hudson.FilePath.act(FilePath.java:878)
        at hudson.plugins.git.GitSCM.determineRevisionToBuild(GitSCM.java:942)
        at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1108)
        at hudson.model.AbstractProject.checkout(AbstractProject.java:1369)
        at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:676)
        at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:88)
        at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:581)
        at hudson.model.Run.execute(Run.java:1593)
        at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
        at hudson.model.ResourceController.execute(ResourceController.java:88)
        at hudson.model.Executor.run(Executor.java:242)
    Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.Request.abort(Request.java:299)
        at hudson.remoting.Channel.terminate(Channel.java:782)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
    Caused by: java.io.IOException: Unexpected termination of the channel
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
    Caused by: java.io.EOFException
        at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2595)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
        at hudson.remoting.Command.readFrom(Command.java:92)
        at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
        at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
and the slave.log ends with this:
    00:00:56 select(7, [3 4], [3 5], NULL, {10, 0}) = 4 (in [3 4], out [3 5], left {9, 97})
    00:00:56 read(4, 
"\10\0\245\231\255B,\304\304\7\23\3\0\\7\0\0d\0\0\0hudson/plugins/git/browser/FisheyeGitRepositoryBrowser$FisheyeGitRepositoryBrowserDescriptor$1.class\265U{O\23A\20\377m)\\\251\247\205\"\240@\364\264UK\21\17|\240\370\226\332\"Z5\341\225h\214\361\350-\355\312q[\357\256\5\376\364c\370-0\0214\232\370\1\374P\306\331k!\0264m\2567\2733\263\363\370\315\314\336\217\237_\277\3\230\304\223n\34\303\250ze\343H`L\303\3058\242\30\217\23\347R\f\246\242\23qR\274\34\303\25E\257j\270\246\230S\32\256k\270\301p\254n9\351U\341Z\316\262\345\3248C\262\370\326\252[\246c\271es!\360\204[\276\305\320\25T\204\237\236\320@\353s\5\341W\370\26\237\25\301\257J_\4\322\333\232\361\344\206\317\275\207\334/y\242J\34\6}\316u\271\227s,\337\347\303\353b\245f\373\3225\253N\255,\\\337,\213\300\\i\0343\0171\231\376'w*\306\333\302\25\301]\6?\323^W\177\0024\272\314\20\315I\233\340K\24\205\313\237\325\326W\270\267h\2558!\240\262\244\320\365\204\3327\231Q\5(\3\30\336\2645\330\364$!\323Y\252\360\322\32\303\251\314\350\236\263Z \34\263 \275u\252\272\260\255@H\227\24\31E\326\337HOHs\356y~\263\304\253MY|\177\343k\270M\373\5Y\363J\274
jminne commented on JENKINS-1948 Intermittent slave disconnections with secondary symptoms
I'm also seeing this, but only on an OS X slave. Windows and Linux slaves are fine.
Jesse Glick updated JENKINS-1948 Intermittent slave disconnections with secondary symptoms
Change By: Jesse Glick (14/Feb/14 2:45 PM)
Summary: intermittent: fails to remove temporary file on remote. -> Intermittent slave disconnections with secondary symptoms
Labels: robustness slave
Priority: Major -> Critical
Component/s: core
Component/s: other