Single host VM limit when using RBD

2013-01-17 Thread Matthew Anderson
I've run into a limit on the maximum number of RBD-backed VMs that I'm able to
run on a single host. I have 20 VMs (21 RBD volumes open) running on a single
host, and when booting the 21st machine I get the error below from libvirt/QEMU.
I'm able to shut down a VM and start another in its place, so there seems to be
a hard limit on the number of volumes I'm able to have open. I did some
googling, and error 11 from pthread_create seems to mean 'resource
unavailable', so I'm probably running into a thread limit of some sort. I did
try increasing the max_thread kernel option but nothing changed. I moved a few
VMs to a different, empty host and they start with no issues at all.
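
For reference, on Linux errno 11 is EAGAIN, which pthread_create returns when it runs out of resources or hits a system-imposed limit on the number of threads. The following is a minimal sketch, not part of the original report, that decodes the error number and prints the per-user process/thread limit (RLIMIT_NPROC) for the calling user. The helper names are hypothetical.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return the symbolic name and message for an errno value on this platform."""
    name = errno.errorcode.get(code, "unknown")
    return "{} ({}): {}".format(code, name, os.strerror(code))


def nproc_limit():
    """Return the (soft, hard) RLIMIT_NPROC pair for the calling process.

    On Linux this limit counts threads as well as processes, per real user ID.
    A value equal to resource.RLIM_INFINITY means no limit is set.
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


if __name__ == "__main__":
    # 11 is the value reported by pthread_create in the log below.
    print(describe_errno(11))
    soft, hard = nproc_limit()
    print("RLIMIT_NPROC soft={} hard={}".format(soft, hard))
```

Because the limit is per user, the numbers are only meaningful when this is run as the account that the QEMU processes run under.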

This machine has 4 OSDs running on it in addition to the 20 VMs. It runs kernel
3.7.1, Ceph 0.56.1 and QEMU 1.3.0. There is currently 65GB of 96GB RAM free and
no swap.

Can anyone suggest where the limit might be or anything I can do to narrow down 
the problem?

Thanks
-Matt
-

Error starting domain: internal error Process exited while reading console log
output: char device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f4eb5a65960 time 2013-01-17 02:32:58.096437
common/Thread.cc: 110: FAILED assert(ret == 0)
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1: (()+0x2aaa8f) [0x7f4eb2de8a8f]
2: (SafeTimer::init()+0x95) [0x7f4eb2cd2575]
3: (librados::RadosClient::connect()+0x72c) [0x7f4eb2c689dc]
4: (()+0xa0290) [0x7f4eb5b27290]
5: (()+0x879dd) [0x7f4eb5b0e9dd]
6: (()+0x87c1b) [0x7f4eb5b0ec1b]
7: (()+0x87ae1) [0x7f4eb5b0eae1]
8: (()+0x87d50) [0x7f4eb5b0ed50]
9: (()+0xb37b2) [0x7f4eb5b3a7b2]
10: (()+0x1e83eb) [0x7f4eb5c6f3eb]
11: (()+0x1ab54a) [0x7f4eb5c3254a]
12: (main()+0x9da) [0x7f4eb5c72a3a]
13: (__libc_start_main()+0xfd) [0x7f4eb1ab4cdd]
14: (()+0x710b9) [0x7f4eb5af80b9]
NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
interpret this.
terminate called after

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 96, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 117, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1090, in startup
    self._backend.create()
  File "/usr/lib/python2.7/dist-packages/libvirt.py", line 620, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: internal error Process exited while reading console log output:
char device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f4eb5a65960 time 2013-01-17 02:32:58.096437
common/Thread.cc: 110: FAILED assert(ret == 0)
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1: (()+0x2aaa8f) [0x7f4eb2de8a8f]
2: (SafeTimer::init()+0x95) [0x7f4eb2cd2575]
3: (librados::RadosClient::connect()+0x72c) [0x7f4eb2c689dc]
4: (()+0xa0290) [0x7f4eb5b27290]
5: (()+0x879dd) [0x7f4eb5b0e9dd]
6: (()+0x87c1b) [0x7f4eb5b0ec1b]
7: (()+0x87ae1) [0x7f4eb5b0eae1]
8: (()+0x87d50) [0x7f4eb5b0ed50]
9: (()+0xb37b2) [0x7f4eb5b3a7b2]
10: (()+0x1e83eb) [0x7f4eb5c6f3eb]
11: (()+0x1ab54a) [0x7f4eb5c3254a]
12: (main()+0x9da) [0x7f4eb5c72a3a]
13: (__libc_start_main()+0xfd) [0x7f4eb1ab4cdd]
14: (()+0x710b9) [0x7f4eb5af80b9]
NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
interpret this.
terminate called after

--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Single host VM limit when using RBD

2013-01-17 Thread Andrey Korolyov
Hi Matthew,

This looks like a low value in /proc/sys/kernel/threads-max.
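
To see whether threads-max is actually close to being exhausted, one could compare it with the number of tasks that currently exist on the host. The following is a rough sketch, assuming a Linux host where the fourth field of /proc/loadavg has the form `running/total` scheduling entities. The function names are made up for illustration.

```python
from pathlib import Path


def parse_threads_max(text):
    """Parse the contents of /proc/sys/kernel/threads-max into an int."""
    return int(text.strip())


def parse_total_tasks(loadavg_text):
    """Return the total number of tasks (processes plus threads) from /proc/loadavg.

    The fourth whitespace-separated field looks like '1/80': currently
    runnable tasks, a slash, then the total number of tasks that exist.
    """
    running_and_total = loadavg_text.split()[3]
    return int(running_and_total.split("/")[1])


if __name__ == "__main__":
    threads_max_path = Path("/proc/sys/kernel/threads-max")
    loadavg_path = Path("/proc/loadavg")
    # Only attempt the comparison where both procfs files are present.
    if threads_max_path.exists() and loadavg_path.exists():
        limit = parse_threads_max(threads_max_path.read_text())
        in_use = parse_total_tasks(loadavg_path.read_text())
        print("threads-max={} tasks={} headroom={}".format(limit, in_use, limit - in_use))
```

If the headroom is large, the system-wide ceiling is unlikely to be the limit that pthread_create is hitting.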

On Thu, Jan 17, 2013 at 12:37 PM, Matthew Anderson
matth...@base3.com.au wrote:
> I've run into a limit on the maximum number of RBD backed VM's that I'm able
> to run on a single host. [...]


RE: Single host VM limit when using RBD

2013-01-17 Thread Matthew Anderson
Hi Andrey,

I did try your suggestion beforehand and it doesn't appear to fix the issue. 

[root@KVM04 ~]# cat /proc/sys/kernel/threads-max
2549635
[root@KVM04 ~]# echo 5549635 > /proc/sys/kernel/threads-max
[root@KVM04 ~]# virsh start EX03
error: Failed to start domain EX03
error: internal error Process exited while reading console log output: char
device redirected to /dev/pts/23
Thread::try_create(): pthread_create failed with error 11
common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f5ec9706960 time 2013-01-17 16:46:50.935681
common/Thread.cc: 110: FAILED assert(ret == 0)
 ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
 1: (()+0x2aaa8f) [0x7f5ec6a89a8f]
 2: (SafeTimer::init()+0x95) [0x7f5ec6973575]
 3: (librados::RadosClient::connect()+0x72c) [0x7f5ec69099dc]
 4: (()+0xa0290) [0x7f5ec97c8290]
 5: (()+0x879dd) [0x7f5ec97af9dd]
 6: (()+0x87c1b) [0x7f5ec97afc1b]
 7: (()+0x87ae1) [0x7f5ec97afae1]
 8: (()+0x87d50) [0x7f5ec97afd50]
 9: (()+0xb37b2) [0x7f5ec97db7b2]
 10: (()+0x1e83eb) [0x7f5ec99103eb]
 11: (()+0x1ab54a) [0x7f5ec98d354a]
 12: (main()+0x9da) [0x7f5ec9913a3a]
 13: (__libc_start_main()+0xfd) [0x7f5ec5755cdd]
 14: (()+0x710b9) [0x7f5ec97990b9]
 NOTE: a copy of the executable, or `objdump -rdS executable` is needed to 
interpret this.
terminate called after
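
Since raising threads-max changes nothing here, the system-wide ceiling is probably not the one being hit. pthread_create can also return EAGAIN when the per-user RLIMIT_NPROC limit is reached, so another thing worth checking is the limit that applies to an already running QEMU process. The following is a hedged sketch that parses the `Max processes` row of /proc/<pid>/limits. The function name is illustrative only, and this is a diagnostic idea rather than a confirmed cause.

```python
import sys


def parse_max_processes(limits_text):
    """Return (soft, hard) from the 'Max processes' row of /proc/<pid>/limits.

    Each value is an int, or None when the kernel reports 'unlimited'.
    Raises ValueError if the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            fields = line.split()
            # fields: ['Max', 'processes', <soft>, <hard>, 'processes']
            soft, hard = fields[2], fields[3]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max processes' row found")


if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1].isdigit():
    # Pass the PID of a running QEMU process as the only argument.
    with open("/proc/{}/limits".format(sys.argv[1])) as handle:
        print(parse_max_processes(handle.read()))
```

On Linux this limit counts threads as well as processes for the user, so if all QEMU instances run under one account their threads add up against the same number.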
    


-----Original Message-----
From: Andrey Korolyov [mailto:and...@xdel.ru] 
Sent: Thursday, 17 January 2013 4:42 PM
To: Matthew Anderson
Cc: ceph-devel@vger.kernel.org
Subject: Re: Single host VM limit when using RBD

Hi Matthew,

This looks like a low value in /proc/sys/kernel/threads-max.
