[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734022#comment-13734022 ]

Kevin Kluge commented on CLOUDSTACK-105:

Kevin Kluge is no longer with Citrix. His responsibilities are currently shared by the following people:

- Ram Chinta, ram.chi...@citrix.com, for CloudPlatform in the India time zone.
- Will Chan, will.c...@citrix.com, for CloudPlatform in the U.S. time zone.
- Vijay Natarajan, vijay.natara...@citrix.com, for CloudPortal.

Please re-send your mail to one of these people if assistance is required.

> /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
> ---------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-105
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-105
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>          Components: Third-Party Bugs
>    Affects Versions: pre-4.0.0
>         Environment: Xenserver 6.0.2, Cloudstack 3.0.2
>            Reporter: Caleb Call
>            Assignee: Devdeep Singh
>             Fix For: 4.2.0
>         Attachments: messages
>
> We came across an interesting issue in one of our clusters. We ran out of inodes on all of our cluster members (since when does this happen in 2012?). When this happened, it in turn made the / filesystem read-only, which in turn made all the hosts go into emergency maintenance mode and, as a result, get marked down by Cloudstack. We found that it was caused by hundreds of thousands of stale socket files in /tmp named "stream-unix.####.######". To resolve the issue, we had to delete those stale socket files (find /tmp -name "*stream*" -mtime +7 -exec rm -v {} \;), then kill and restart xapi, then correct the emergency maintenance mode. These hosts had only been up for 45 days before this issue occurred.
>
> In our scouring of the interwebs, the only other instance we've been able to find of this (or similar) happening is in the same setup we are currently running: Xenserver 6.0.2 with CS 3.0.2. Do these stream-unix sockets have anything to do with Cloudstack? I would think if this were a Xenserver issue (bug), there would be a lot more on the internet about this happening. As a temporary workaround, we've added a cronjob to clean up these files, but we'd really like to address the actual issue that's causing these sockets to become stale and not get cleaned up.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734021#comment-13734021 ]

Stephen Turner commented on CLOUDSTACK-105:

Thank you for your email. I am on vacation from Friday 2nd to Friday 16th August. I shall read your email when I return on Monday 19th. In the meantime, please contact my manager, andrew.hal...@citrix.com. Thank you very much.

-- Stephen Turner, Sr Manager, XenServer, Citrix
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664171#comment-13664171 ]

Kevin Kluge commented on CLOUDSTACK-105:

Jason's comment from Dec 5 strongly suggests this is a XenServer 6.0.2 bug. I have raised the issue with the XenServer engineers. I do not have a public bug ticket to share. I don't think there is much that CloudStack can do about this issue, so I have moved it to the Third-Party Bugs component. It would be helpful if we were able to determine whether people running XS 6.1.0 did or did not see this. I never heard of this in XS 5.6.*, so I suspect it is new in XS 6.0.*.
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664265#comment-13664265 ]

Stephen Turner commented on CLOUDSTACK-105:

Could someone who can reproduce this attach a XenServer bugtool (aka Server Status Report from XenCenter) taken immediately after it occurs? That will enable us to track this down. Thank you.
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664327#comment-13664327 ]

Caleb Call commented on CLOUDSTACK-105:

I'll be happy to attach the dump, but this isn't something that just happens; it's constantly happening. To keep our servers from crashing, we have a cronjob that removes any of these files older than a couple of days. I also don't think this is necessarily a Xenserver bug; maybe a Xenserver-under-CloudStack bug, since without joining Xenserver to CloudStack this never happens. Once it's joined, it starts happening. I also have a suspicion it's being caused by the script /etc/xapi.d/plugins/vmops, and in particular this part of that script (JIRA munged the original formatting; indentation reconstructed here):

def setLinkLocalIP(session, args):
    brName = args['brName']
    try:
        cmd = ["ip", "route", "del", "169.254.0.0/16"]
        txt = util.pread2(cmd)
    except:
        txt = ''
    try:
        cmd = ["ifconfig", brName, "169.254.0.1", "netmask", "255.255.0.0"]
        txt = util.pread2(cmd)
    except:
        try:
            cmd = ['cat', '/etc/xensource/network.conf']
            result = util.pread2(cmd)
        except:
            return 'can not cat network.conf'
        if result.lower() == "bridge":
            try:
                cmd = ["brctl", "addbr", brName]
                txt = util.pread2(cmd)
            except:
                pass
        else:
            try:
                cmd = ["ovs-vsctl", "add-br", brName]
                txt = util.pread2(cmd)
            except:
                pass
        try:
            cmd = ["ifconfig", brName, "169.254.0.1", "netmask", "255.255.0.0"]
            txt = util.pread2(cmd)
        except:
            pass
    try:
        cmd = ["ip", "route", "add", "169.254.0.0/16", "dev", brName, "src", "169.254.0.1"]
        txt = util.pread2(cmd)
    except:
        txt = ''
    txt = 'success'
    return txt
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664354#comment-13664354 ]

Stephen Turner commented on CLOUDSTACK-105:

Thanks for that, Caleb. We'll look into that.
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664376#comment-13664376 ]

ASF subversion and git services commented on CLOUDSTACK-105:

Commit c342fa69a827a00aa3297a6cc9f6a15e7f711cd9 in branch refs/heads/4.1 from [~anthonyxu]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=c342fa6 ]

CLOUDSTACK-105: there is a trailing '\n' after 'bridge', need to remove it before checking if XS is in bridge mode or ovs mode
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664382#comment-13664382 ]

Anthony Xu commented on CLOUDSTACK-105:

Caleb, can you check whether the above check-in fixes this issue? You can manually patch /etc/xapi.d/plugins/vmops on your XS host:

--- a/scripts/vm/hypervisor/xenserver/vmops
+++ b/scripts/vm/hypervisor/xenserver/vmops
@@ -267,7 +267,7 @@ def setLinkLocalIP(session, args):
         except:
             return 'can not cat network.conf'

-        if result.lower() == "bridge":
+        if result.lower().strip() == "bridge":
             try:
                 cmd = ["brctl", "addbr", brName]
                 txt = util.pread2(cmd)

Anthony
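The patch is a one-character change, but the reason it works is worth spelling out: reading /etc/xensource/network.conf via util.pread2 returns the file content with its trailing newline, so the pre-patch comparison never matched on a bridge-mode host and the plugin fell through to the ovs branch. A minimal standalone illustration (plain Python strings; no XenServer code involved):

```python
# Simulated output of `cat /etc/xensource/network.conf`: the file
# content arrives with a trailing newline attached.
result = "bridge\n"

# Pre-patch check: always False on a bridge-mode host because of the
# '\n', so the plugin took the ovs branch instead of brctl.
buggy_match = result.lower() == "bridge"

# Post-patch check (commit c342fa6): strip surrounding whitespace first.
fixed_match = result.lower().strip() == "bridge"

print(buggy_match, fixed_match)  # False True
```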
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664399#comment-13664399 ]

Anthony Xu commented on CLOUDSTACK-105:

BTW, you need to switch to bridge mode if you want to use Security Groups. Execute the following on the XS host:

xe-switch-network-backend bridge
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664455#comment-13664455 ]

Caleb Call commented on CLOUDSTACK-105:

I've applied that and commented out our cron to remove these old files. I'll give it a couple of hours and check whether we have any new files. I've also verified we are already set up with bridge across all our nodes:

[root@cloud-hv03 ~]# cat /etc/xensource/network.conf
bridge
[root@cloud-hv03 ~]#
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13666121#comment-13666121 ]

Stephen Turner commented on CLOUDSTACK-105:

Thanks, Caleb. Any news from your experiment?
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672975#comment-13672975 ]

Stephen Turner commented on CLOUDSTACK-105:

Any news on this? I'm keen to hear if it's sorted your problem! Thanks.
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673217#comment-13673217 ]

Caleb Call commented on CLOUDSTACK-105:
---------------------------------------

Sorry for the delayed response. I checked this morning and don't have any of
these files hanging around, so I'd say this has likely fixed the issue.
[jira] [Commented] (CLOUDSTACK-105) /tmp/stream-unix.####.###### stale sockets causing inodes to run out on Xenserver
[ https://issues.apache.org/jira/browse/CLOUDSTACK-105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673287#comment-13673287 ]

Stephen Turner commented on CLOUDSTACK-105:
-------------------------------------------

Thanks, Caleb, that's good news.