Re: [Openstack] why did openstack choose ceph (and not glusterfs)
Haomai, Zippy, do you have any actual link to share about where your impression is from? I don't want to be the last to know that we _chose_ something.

On Thu, Jul 11, 2013 at 8:00 PM, Haomai Wang hao...@unitedstack.com wrote: Hi, I think the information that OpenStack chose Ceph comes from the Ceph community; we often hear related talks and blog posts from the Ceph community. There is an illusion that OpenStack decided to choose Ceph as a backend. Best regards, Haomai Wang, UnitedStack Inc.

On 2013-07-12, at 8:32 AM, John Griffith john.griff...@solidfire.com wrote: On Thu, Jul 11, 2013 at 6:27 PM, Tom Fifield t...@openstack.org wrote: Hi, Community Manager here - just confirming - OpenStack has not chosen Ceph. Not sure where that information is coming from - got a blog link so we can fix any confusion? :) Regards, Tom

On 12/07/13 10:23, Zippy Zeppoli wrote: Hello, I apologize if this email causes some kind of subjective ruckus, but frankly I don't care (sorry, etiquette) since it will resolve a reasonable question that isn't clearly answered on the web. Why did OpenStack choose Ceph and not GlusterFS? There doesn't seem to be a lot of (good) information on how/why to choose one over the other, and I'm sure most folks do a proof of concept to figure this out, but it doesn't seem like a lot of information has been shared on the matter. That being said, OpenStack is a large open source project that has decided to use this storage platform (a big decision). Why and how did the technical architects for OpenStack come to this decision (a blog post would be awesome; I wasn't able to find one by Googling)? CheerZ

Hi Zippy, to be clear, OpenStack doesn't really choose at all. In terms of Cinder you have a choice: it could be the base LVM implementation, Ceph RBD, Gluster, or any option from a long list of supported/integrated backend storage devices. John

-- Regards Huang Zhiteng
Re: [Openstack] [Swift] Cache pressure tuning
On Tue, Jun 18, 2013 at 10:42 AM, Jonathan Lu jojokur...@gmail.com wrote: On 2013/6/17 18:59, Robert van Leeuwen wrote: I'm facing an issue with performance degradation, and I noticed that changing the value in /proc/sys/vm/vfs_cache_pressure is said to help. Can anyone explain to me whether and why it is useful?

Hi, when this is set to a lower value the kernel will try to keep the inode/dentry cache in memory longer. Since the Swift replicator is scanning the filesystem continuously, it will eat up a lot of IOPS if those entries are not in memory. To see if a lot of cache misses are happening, for XFS, you can look at xs_dir_lookup and xs_ig_missed (see http://xfs.org/index.php/Runtime_Stats). We greatly benefited from setting this to a low value, but we have quite a lot of files on a node (30 million). Note that setting this to zero will result in the OOM killer killing the machine sooner or later (especially if files are moved around due to a cluster change ;) Cheers, Robert van Leeuwen

Hi, we set this to a low value (20) and the performance is better than before. It seems quite useful. According to your description, this issue is related to the object quantity in the storage. We deleted all the objects in the storage but it doesn't help at all. The only method to recover is to format and re-mount the storage node. We tried to install Swift in different environments but this degradation problem seems to be inevitable.

It's the inode cache for each file (object) that helps (it avoids extra disk I/Os). As long as your memory is big enough to hold the inode information of the frequently accessed objects, you are good, and there's no need (no point) to limit the number of objects per storage node IMO. You may manually load the inode information of objects into the VFS cache if you like (by simply running 'ls' on the files) to _restore_ performance. But memory size and object access pattern are still the key to this kind of performance tuning; if memory is too small, the inode cache will be invalidated sooner or later. Cheers, Jonathan Lu

-- Regards Huang Zhiteng
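To make Robert's suggestion concrete: below is a rough sketch (not from this thread) of checking those counters on a storage node. It assumes the /proc/fs/xfs/stat 'ig' line follows the field order documented on the XFS Runtime Stats page (attempts, found, frecycle, missed, ...), so verify against your kernel before trusting the numbers.

  #!/usr/bin/env python
  # Peek at the XFS inode-get counters to see whether the inode/dentry cache
  # is missing. The 'ig' field order is an assumption taken from the XFS
  # Runtime Stats documentation: attempts, found, frecycle, missed, ...
  def xfs_ig_stats(path='/proc/fs/xfs/stat'):
      with open(path) as f:
          for line in f:
              fields = line.split()
              if fields and fields[0] == 'ig':
                  return [int(v) for v in fields[1:]]
      return None

  if __name__ == '__main__':
      with open('/proc/sys/vm/vfs_cache_pressure') as f:
          print('vfs_cache_pressure =', f.read().strip())
      ig = xfs_ig_stats()
      if ig:
          found, missed = ig[1], ig[3]  # per the assumed field order above
          total = found + missed
          pct = (100.0 * missed / total) if total else 0.0
          print('inode lookups: %d found, %d missed (%.1f%% miss)' % (found, missed, pct))

If the miss percentage keeps climbing while swift-bench runs, lowering vfs_cache_pressure (or adding memory) is the kind of tuning Robert describes.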
Re: [Openstack] [Swift] Cache pressure tuning
On Tue, Jun 18, 2013 at 12:35 PM, Jonathan Lu jojokur...@gmail.com wrote: Hi Huang, thanks for your explanation. Does it mean that a storage cluster of a given processing ability will get slower and slower as more and more objects are added? Is there any test of the rate of decline, or is there a lower limit? For example, my environment is: Swift version: Grizzly, on Ubuntu 12.04; 3 storage nodes, each with 16GB RAM / 4*2 CPUs / 12 x 3TB disks. The expected throughput is more than 100/s with uploaded objects of 50KB. At the beginning it works quite well and then it drops. If this degradation is unstoppable, I'm afraid the performance will eventually not be able to meet our needs no matter how I tune other config.

It won't be hard to do a baseline performance (without inode cache) assessment of your system: populate it with a certain amount of objects of the desired size (say 50KB; 10 million objects, 1,000 objects per container across 10,000 containers), and *then drop VFS caches explicitly before testing*. Measure performance with your desired IO pattern and in the meantime drop the VFS cache every once in a while (say every 60s). That's roughly the performance you can get when your storage system reaches a 'steady' state (i.e. the object count has outgrown memory size). This will give you an idea of pretty much the worst case.

Jonathan Lu On 2013/6/18 11:05, Huang Zhiteng wrote: [...]
-- Regards Huang Zhiteng
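As a rough illustration of the 'drop VFS caches every 60 seconds while benchmarking' approach described above (a minimal sketch, assuming root access on the storage nodes; not part of Swift):

  #!/usr/bin/env python
  # Drop the page cache plus dentries/inodes every INTERVAL seconds while a
  # benchmark runs, so results reflect the cache-cold "steady" state.
  # Must run as root on each storage node; illustrative only.
  import os
  import time

  INTERVAL = 60

  while True:
      os.system('sync')  # flush dirty pages before dropping caches
      with open('/proc/sys/vm/drop_caches', 'w') as f:
          f.write('3\n')  # 3 = free pagecache + dentries and inodes
      time.sleep(INTERVAL)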
Re: [Openstack] Could not create volume(Sheepdog)
That's because latest Cinder now uses filter scheduler which requires back-end (in your case: sheepdog) driver to report capabilities/stats. Unfortunately, sheepdog driver hasn't been updated (yet). On Fri, Feb 22, 2013 at 1:36 AM, harryxiyou harryxi...@gmail.com wrote: Hi all, After i depolied openstack ENV. by Devstack, i could not create Volume successfully. The commands of creating a volume are like this 'cinder create 5'. At last, I caught following error logs by 'screen -r'. 2013-02-22 01:14:02.237 INFO cinder.api.openstack.wsgi [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] POST http://192.168.1.3:8776/v1/1c13d4432d5c486dbc0c54030d5ceb00/volumes 2013-02-22 01:14:02.239 AUDIT cinder.api.v1.volumes [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] Create volume of 5 GB 2013-02-22 01:14:02.778 AUDIT cinder.api.v1.volumes [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] vol={'volume_metadata': [], 'availability_zone': 'nova', 'terminated_at': None, 'updated_at': None, 'snapshot_id': None, 'ec2_id': None, 'mountpoint': None, 'deleted_at': None, 'id': 'f3ebeb9d-bc9a-4984-abe2-47bd5e0e773b', 'size': 5, 'user_id': u'd863ce5682954b268cd92ad8da440de7', 'attach_time': None, 'display_description': None, 'project_id': u'1c13d4432d5c486dbc0c54030d5ceb00', 'launched_at': None, 'scheduled_at': None, 'status': 'creating', 'volume_type_id': None, 'deleted': False, 'provider_location': None, 'host': None, 'source_volid': None, 'provider_auth': None, 'display_name': None, 'instance_uuid': None, 'created_at': datetime.datetime(2013, 2, 21, 17, 14, 2, 434462), 'attach_status': 'detached', 'volume_type': None, 'metadata': {}} 2013-02-22 01:14:02.780 INFO cinder.api.openstack.wsgi [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] http://192.168.1.3:8776/v1/1c13d4432d5c486dbc0c54030d5ceb00/volumes returned with HTTP 200 $ cd /opt/stack/cinder /opt/stack/cinder/bin/cinder-scheduler --config-file /etc/cinder/cinder.conf || touch /opt/stack/status/stack/c-sch.failure 2013-02-22 01:01:23.277 AUDIT cinder.service [-] Starting cinder-scheduler node (version 2013.1) 2013-02-22 01:01:25.319 INFO cinder.openstack.common.rpc.common [-] Connected to AMQP server on localhost:5672 2013-02-22 01:01:25.944 INFO cinder.openstack.common.rpc.common [-] Connected to AMQP server on localhost:5672 2013-02-22 01:05:02.445 WARNING cinder.scheduler.filters.capacity_filter [req-f5d7f718-c29f-4ce2-b213-47ec3314622e d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] Free capacity not set;volume node info collection broken. 2013-02-22 01:05:02.445 WARNING cinder.scheduler.manager [req-f5d7f718-c29f-4ce2-b213-47ec3314622e d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] Failed to schedule_create_volume: No valid host was found. 2013-02-22 01:14:02.782 WARNING cinder.scheduler.filters.capacity_filter [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] Free capacity not set;volume node info collection broken. 2013-02-22 01:14:02.783 WARNING cinder.scheduler.manager [req-56d30373-c93b-41f8-8fdd-1d9b123f5f40 d863ce5682954b268cd92ad8da440de7 1c13d4432d5c486dbc0c54030d5ceb00] Failed to schedule_create_volume: No valid host was found. Could anyone give me some suggestions? Thanks in advance. 
-- Thanks Harry Wei

-- Regards Huang Zhiteng
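For context on the "Free capacity not set; volume node info collection broken" warning: the filter scheduler expects each back-end driver to implement a get_volume_stats() hook and report capacity. The sketch below is illustrative only, not the actual sheepdog fix; a real patch would pull numbers from the cluster (e.g. via the sheepdog CLI) instead of reporting 'unknown'.

  # Illustrative sketch only (not the real sheepdog driver): roughly the
  # get_volume_stats() hook the filter scheduler expects every Cinder
  # back-end driver to provide.
  class SheepdogDriverStatsSketch(object):
      def __init__(self):
          self._stats = None

      def get_volume_stats(self, refresh=False):
          if refresh or not self._stats:
              self._stats = {
                  'volume_backend_name': 'sheepdog',
                  'vendor_name': 'Open Source',
                  'driver_version': '1.0',
                  'storage_protocol': 'sheepdog',
                  # 'unknown'/'infinite' pass the CapacityFilter; real numbers
                  # queried from the cluster are obviously better.
                  'total_capacity_gb': 'unknown',
                  'free_capacity_gb': 'unknown',
                  'reserved_percentage': 0,
              }
          return self._stats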
Re: [Openstack] [Nova] Question about multi-scheduler
Multi-scheduler was there for the days when Nova had to deal with requests for both instances and volumes. If you are using a version of Nova which still has nova-volume, I think multi-scheduler is a must, and there are two sub config options for Nova (see http://docs.openstack.org/folsom/openstack-compute/admin/content/ch_scheduling.html):

scheduler_driver=nova.scheduler.multi.MultiScheduler
volume_scheduler_driver=nova.scheduler.chance.ChanceScheduler
compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler

But if you are using the trunk version of Nova, scheduler_driver should be set to 'nova.scheduler.filter_scheduler.FilterScheduler' directly.

On Tue, Feb 19, 2013 at 10:06 AM, Wangpan hzwang...@corp.netease.com wrote: Hi all, has anyone used the multi-scheduler property? Are there any problems? Can we use this property in a production environment? Thanks! 2013-02-19 Wangpan

-- Regards Huang Zhiteng
Re: [Openstack] [Nova] Question about multi-scheduler
Oh, I got it. I remember that at the last conference, Ryan Richard's 'Considerations for Building an OpenStack Private Cloud' talk mentioned using more than one scheduler instance for HA when the cluster becomes large (this is not in his slides, which you can find on slideshare.net, but he mentioned it orally). He also mentioned that adding more scheduler instances didn't help reduce processing time; I suspect it won't reduce processing time for a single request, but it surely increases the concurrency of the scheduler, so it should be helpful for use cases with a large number of requests.

Are you referring to a certain new feature of Nova when you say 'multi-scheduler property', or just to any method of running multiple nova-scheduler instances? For the latter, I think this is surely doable and someone out there has been using this kind of technique (in production, I guess).

On Tue, Feb 19, 2013 at 10:34 AM, Wangpan hzwang...@corp.netease.com wrote: Hi Zhiteng, maybe what I said in the last mail was confusing. What I want to ask is: can we use more than one nova-scheduler service in one cluster? Is it stable enough to be used in a production environment? I want to make the availability of the scheduler service higher. Thanks! 2013-02-19 Wangpan

From: Huang Zhiteng
Sent: 2013-02-19 10:15
Subject: Re: [Openstack] [Nova] Question about multi-scheduler
To: Wangpan hzwang...@corp.netease.com
Cc: openstack community openstack@lists.launchpad.net

Multi-scheduler was there for the days when Nova had to deal with requests for both instances and volumes. [...]

-- Regards Huang Zhiteng
Re: [Openstack] [Openstack-dev][Sheepdog]Add a new driver for Openstack Cinder like Sheepdog volumes
Until the QEMU support is official, I don't think it's a good idea to have HLFS driver in Cinder. On Sat, Jan 19, 2013 at 1:14 PM, harryxiyou harryxi...@gmail.com wrote: On Sat, Jan 19, 2013 at 12:24 PM, MORITA Kazutaka morita.kazut...@gmail.com wrote: At Fri, 18 Jan 2013 22:56:38 +0800, [...] The answer depends on the protocol between QEMU and HLFS. What is used for accessing HLFS volumes from QEMU? Is it iSCSI, NFS, or something else? Actually, we just realize the block driver interfaces QEMU provided. You can see our patch from http://code.google.com/p/cloudxy/source/browse/trunk/hlfs/patches/hlfs_driver_for_qemu.patch And what about Sheepdog? What is used for accessing Sheepdog volumes from QEMU? Is it iSCSI, NFS, or something else? -- Thanks Harry Wei ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Openstack-dev][Sheepdog]Add a new driver for Openstack Cinder like Sheepdog volumes
It seems you also have a tgt patch for HLFS. Personally I'd prefer iSCSI support over QEMU support, since iSCSI is well supported by almost every hypervisor. On Jan 19, 2013 9:23 PM, harryxiyou harryxi...@gmail.com wrote: On Sat, Jan 19, 2013 at 7:00 PM, Huang Zhiteng winsto...@gmail.com wrote: Until the QEMU support is official, I don't think it's a good idea to have an HLFS driver in Cinder. It sounds reasonable; we will send our patches to the QEMU/libvirt communities. After the patches are merged, we will send a patch to OpenStack. Do you have any other suggestions? -- Thanks Harry Wei
Re: [Openstack] Volume driver in Cinder by ISCSI way
For development efforts, it is better to use the openstack-dev list instead of this general openstack list. You can also join the #openstack-cinder IRC channel on Freenode for online discussion with Cinder developers.

On Jan 18, 2013 9:27 PM, harryxiyou harryxi...@gmail.com wrote: On Fri, Jan 18, 2013 at 8:35 PM, yang, xing xing.y...@emc.com wrote: Hi Harry, if you have questions about the EMC volume driver, you can email me. Thanks for your help, that's very kind of you ;-) EMC is not an open-source project, right? And OpenStack supports EMC via iSCSI; that is to say, iSCSI is the bridge that connects OpenStack and EMC, right? Could you please point me to the docs for https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/emc.py ? I want to understand how EMC's iSCSI driver is implemented. By the way, if I want to add a driver (for a block storage system accessed via iSCSI) to OpenStack Cinder, should I just add my driver file under the directory https://github.com/openstack/cinder/blob/master/cinder/volume/drivers like emc.py? -- Thanks Harry Wei
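For what it's worth, a new back-end driver at that time was typically one Python module dropped under cinder/volume/drivers/, subclassing the generic iSCSI driver. The outline below is hypothetical (module and class names invented for illustration), not the real emc.py:

  # Hypothetical outline of a new iSCSI back-end driver module, e.g.
  # cinder/volume/drivers/mydriver.py -- names invented for illustration.
  from cinder.volume import driver


  class MyISCSIDriver(driver.ISCSIDriver):
      """Executes volume operations against a fictional iSCSI array."""

      def create_volume(self, volume):
          # talk to the array's management API here
          pass

      def delete_volume(self, volume):
          pass

      def create_export(self, context, volume):
          # publish an iSCSI target/LUN for the new volume
          pass

      def ensure_export(self, context, volume):
          pass

      def remove_export(self, context, volume):
          pass

      def initialize_connection(self, volume, connector):
          # hand back the iSCSI portal/IQN/LUN the hypervisor should attach
          pass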
Re: [Openstack] How can openstack support Sheepdog in details?
What kind of details do you need? How to setup a SheepDog cluster? Or how to configure Cinder to use SheepDog? On Tue, Jan 15, 2013 at 11:33 PM, harryxiyou harryxi...@gmail.com wrote: Hi all, I find openstack can support sheepdog(modify qemu and libvirt), but i can't find how openstack support sheepdog in details. Could anyone give me some suggestions? Thanks in advance. -- Thanks Harry Wei ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] How can openstack support Sheepdog in details?
What Razique means is currently there's no doc (yet) and the priority of the creating such doc is low. Razique, am I getting you right? On Tue, Jan 15, 2013 at 11:52 PM, harryxiyou harryxi...@gmail.com wrote: On Tue, Jan 15, 2013 at 11:46 PM, Razique Mahroua razique.mahr...@gmail.com wrote: Hi, Hi unfortunately, we don't have much feedback/ tests that have been done, so the doc is not top notch regarding Sheepdog. Could you please tell me where is the doc? Or send me this doc. It would be interesting if someone helps us with extra infos on that Maybe i would give some tests ;-) -- Thanks Harry Wei ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] Duplication of code in nova and cinder
On Wed, Dec 12, 2012 at 8:56 AM, Joshua Harlow harlo...@yahoo-inc.com wrote: Related to this, how do we in the future stop such code-copying from happening in the first place? Is it just that there needs to be a place for this (oslo?) that can be updated more quickly, or something similar? I'm always sorta 'weirded out' when people say that they copied some code in the name of 'it was quicker' or we are just 'borrowing it'.

But you have to admit 'it is quicker'. My experience is that getting common code into Oslo and then porting it to a project usually takes twice as long to merge, in the best case.

From: John Griffith john.griff...@solidfire.com Date: Tuesday, December 11, 2012 3:36 PM To: Sam Morrison sorri...@gmail.com Cc: OpenStack mailing list openstack@lists.launchpad.net Subject: Re: [Openstack] Duplication of code in nova and cinder

On Tue, Dec 11, 2012 at 4:24 PM, Sam Morrison sorri...@gmail.com wrote: I attempted to create a volume from an image in Cinder and was getting this strange error; it turns out it was because I had my glance servers specified as https://glanceserver:9292. In Cinder the version of images/glance.py is older than the one in Nova and is missing the SSL support additions. https://bugs.launchpad.net/cinder/+bug/1089147 My real question is why there is one version in nova and one version in cinder. I also think there is quite a bit more unnecessary duplication. Should it all go into oslo? Cheers, Sam

Hi Sam, the short answer is yes. Need to check scoping etc. and make sure that it does in fact fit within the parameters of OSLO. It's something I thought of a couple weeks ago, but to be honest it's been low on my list personally, and nobody else that I know of has shown an interest in picking it up. You'll notice another image-related item we're *borrowing* from Nova (cinder.image.image_utils). In both cases there are slight modifications to fit Cinder's use case that, given a bit of work, could easily be shared. John

-- Regards Huang Zhiteng
Re: [Openstack] Duplication of code in nova and cinder
On Wed, Dec 12, 2012 at 9:34 AM, Joshua Harlow harlo...@yahoo-inc.com wrote: Isn't that a lets fix the slowness instead of continue bad behavior. Fix the root problem and don't bypass it in the first place? Then said root problem is solved for everyone and isn't pushed into the future (as is typically done). I'm not saying copying is the right thing to do. I totally agree we should avoid doing this. Fixing the slowness is also important. Oslo core devs, please take a look at the review queue, I've patches there for you. :) -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Cinder] New volume status stuck at Creating after creation in Horizon
-- Regards Huang Zhiteng
Re: [Openstack] The list of the backend storage for openstack.
I think the short answer is: it depends. It really depends on what kind of user needs you want the storage system to fulfill. There is no single best solution, but in each specific case there can be one most suitable solution for the requirement. I know this may not be the answer you are seeking, but I'd say start by looking at the problem you are solving and figure out what specific needs that problem has for storage; then you can easily narrow down the number of candidates.

On Thu, Nov 29, 2012 at 3:36 PM, Lei Zhang zhang.lei@gmail.com wrote: Can anybody help? Could you give me some key terms or links where I can find articles/books about this? I am very confused about storage in OpenStack.

On Wed, Nov 28, 2012 at 2:48 PM, Lei Zhang zhang.lei@gmail.com wrote: Hi all, here is my understanding of storage in OpenStack. I would be very happy if someone could point out anything wrong or add comments. Right now there are two main kinds of storage needed by OpenStack: one for images, the other for volumes. Both support many kinds of backends:

Image (Glance service):
- File: NAS (CIFS, NFS, FTP, ...), SAN (iSCSI, AoE, ...), Local
- HTTP
- RBD (Ceph)
- S3
- Swift

Volume (Cinder/nova-volume):
- SAN: iSCSI, AoE
- Local: LVM
- RBD (Ceph)
- Sheepdog

Questions: 1. What is the position of Swift? Is it just the image container for the Glance server? 2. Are there any rules for making a choice? 3. Which solution would you tend to choose among the many available, and why?

-- Lei Zhang Blog: http://jeffrey4l.github.com twitter/weibo: @jeffrey4l

-- Regards Huang Zhiteng
Re: [Openstack] Scheduler issues in folsom
Hi Jonathan, if I understand correctly, that bug is about multiple scheduler instances (processes) doing scheduling at the same time. When a compute node finds itself unable to fulfil a create_instance request, it resends the request back to the scheduler (max_retry is there to avoid endless retries). From your description, I only see one scheduler. And you are right: even if memory accounting has some issue, cpu_allocation_ratio should have prevented the scheduler from putting instances with more vCPUs than available pCPUs on a node. What OpenStack package are you using?

On Wed, Oct 31, 2012 at 11:41 PM, Jonathan Proulx j...@jonproulx.com wrote: Hi All, while the RetryScheduler may not have been designed specifically to fix this issue, https://bugs.launchpad.net/nova/+bug/1011852 suggests that it is meant to fix it, well, if it is a scheduler race condition, which is my suspicion. This is my current scheduler config, which gives the failure mode I describe:

scheduler_available_filters=nova.scheduler.filters.standard_filters
scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter,RetryFilter
scheduler_max_attempts=30
least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
compute_fill_first_cost_fn_weight=1.0
cpu_allocation_ratio=1.0
ram_allocation_ratio=1.0

I'm running the scheduler and API server on a single controller host, and it's pretty consistent about scheduling a hundred instances per node at first and then iteratively rescheduling them elsewhere, when presented with either a single API request to start many instances (using euca2ools) or a shell loop around nova boot to generate one API request per server.

The cpu_allocation_ratio should limit the scheduler to 24 instances per compute node regardless of how it calculates memory, so while I talked a lot about memory allocation as a motivation, CPU is more frequently the actual limiting factor in my deployment, and it certainly should be. And yet, after attempting to launch 200 m1.tiny instances:

root@nimbus-0:~# nova-manage service describe_resource nova-23
2012-10-31 11:17:56 HOST     PROJECT                           cpu  mem(mb)  hdd
nova-23 (total)                                                 24    48295  882
nova-23 (used_now)                                             107    56832   30
nova-23 (used_max)                                             107    56320   30
nova-23 98333a1a28e746fa8c629c83a818ad57                       106    54272    0
nova-23 3008a142e9524f7295b06ea811908f93                         1     2048   30

Eventually those bleed off to other systems, though not entirely:

2012-10-31 11:29:41 HOST     PROJECT                           cpu  mem(mb)  hdd
nova-23 (total)                                                 24    48295  882
nova-23 (used_now)                                              43    24064   30
nova-23 (used_max)                                              43    23552   30
nova-23 98333a1a28e746fa8c629c83a818ad57                        42    21504    0
nova-23 3008a142e9524f7295b06ea811908f93                         1     2048   30

At this point, 12 minutes later, out of 200 instances 168 are active, 22 are errored and 10 are still building. Notably, only 23 actual VMs are running on nova-23:

root@nova-23:~# virsh list | grep instance | wc -l
23

So that's what I see; perhaps my assumptions about why I'm seeing it are incorrect. Thanks, -Jon

-- Regards Huang Zhiteng
Re: [Openstack] Scheduler issues in folsom
On Wed, Oct 31, 2012 at 6:55 AM, Vishvananda Ishaya vishvana...@gmail.com wrote: The retry scheduler is NOT meant to be a workaround for this. It sounds like the ram filter is not working properly somehow. Have you changed the setting for ram_allocation_ratio? It defaults to 1.5, allowing overallocation, but in your case you may want 1.0. I would be using the following two config options to achieve what you want:

compute_fill_first_cost_fn_weight=1.0
ram_allocation_ratio=1.0

I'd suggest the same ratio too. But besides memory overcommitment, I suspect this issue is also related to how KVM does memory allocation (it doesn't actually allocate the entire memory for a guest when booting). I've seen a compute node report more memory than it should have (e.g. a 4GB node with two 1GB instances running still reports 3GB free memory) because the libvirt driver calculates free memory simply from /proc/meminfo, which doesn't reflect how much memory the guests are intended to use.

If you are using the settings above, then the scheduler should be using up the resources on the node it schedules to until it fills up the available RAM and then moving on to the next node. If this is not occurring then you have uncovered some sort of bug. Vish

On Oct 30, 2012, at 9:21 AM, Jonathan Proulx j...@csail.mit.edu wrote: Hi All, I'm having what I consider serious issues with the scheduler in Folsom. It seems to relate to the introduction of threading in the scheduler. For a number of local reasons we prefer to have instances start on the compute node with the least amount of free RAM that is still enough to satisfy the request, which is the reverse of the default policy of scheduling on the system with the most free RAM. I'm fairly certain the same behavior would be seen with that policy as well, and with any other policy that results in a single best choice for scheduling the next instance. We have workloads that start hundreds of instances of the same image, and there are plans to scale this to thousands. What I'm seeing is something like this:

* user submits an API request for 300 instances
* scheduler puts them all on one node
* retry scheduling kicks in at some point for the 276 that don't fit
* those 276 are all scheduled on the next best node
* the retry cycle repeats with the 252 that don't fit there

I'm not clear exactly where the RetryScheduler inserts itself (I should probably read it), but the first compute node is very overloaded handling start-up requests, which results in a fair number of instances entering ERROR state rather than rescheduling (so not all 276 actually make it to the next round), and the whole process is painfully slow. In the end we are lucky to see 50% of the requested instances actually make it into Active state (and then only because we increased scheduler_max_attempts). Is that really how it's supposed to work? With the introduction of the RetryScheduler as a fix for the scheduling race condition I think it is, but it is a pretty bad solution for me, unless I'm missing something, am I? Wouldn't be the first time...

For now I'm working around this by using the ChanceScheduler (compute_scheduler_driver=nova.scheduler.chance.ChanceScheduler) so the scheduler threads don't pick a best node. This is orders of magnitude faster and consistently successful in my tests.
It is not ideal for us, as we have a small minority of compute nodes with twice the memory capacity of our standard nodes and would prefer to keep those available for some of our extra-large-memory flavors; we'd also like to minimize memory fragmentation on the standard-sized nodes for similar reasons. -Jon

-- Regards Huang Zhiteng
Re: [Openstack] Scheduler issues in folsom
On Wed, Oct 31, 2012 at 10:07 AM, Vishvananda Ishaya vishvana...@gmail.com wrote: On Oct 30, 2012, at 7:01 PM, Huang Zhiteng winsto...@gmail.com wrote: I'd suggest the same ratio too. But besides memory overcommitment, I suspect this issue is also related to how KVM do memory allocation (it doesn't do actual allocation of the entire memory for guest when booting). I've seen compute node reported more memory than it should have (e.g. 4G node has two 1GB instances running but still report 3GB free memory) because libvirt driver calculates free memory simply based on /proc/meminfo, which doesn't reflect how many memory guests are intended to use. Ah interesting, if this is true then this is a bug we should try to fix. I was under the impression that it allocated all of the memory unless you were using virtio_balloon, but I haven't verified. I'm pretty sure about this. Can anyone from RedHat confirm this is how KVM works? Vish -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
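A quick way to see the effect described above is to compare /proc/meminfo with what the running guests were given. This is an illustrative sketch (not OpenStack code), assuming the libvirt Python bindings and a reasonably recent libvirt on the compute node:

  #!/usr/bin/env python
  # Compare the kernel's idea of free memory with the memory promised to
  # running KVM guests. Illustrative only.
  import libvirt

  def meminfo_kib(key):
      with open('/proc/meminfo') as f:
          for line in f:
              if line.startswith(key + ':'):
                  return int(line.split()[1])  # values are in kB
      return 0

  conn = libvirt.openReadOnly(None)
  guest_kib = sum(d.maxMemory() for d in conn.listAllDomains() if d.isActive())

  print('MemFree reported by kernel: %d MiB' % (meminfo_kib('MemFree') // 1024))
  print('Memory promised to guests : %d MiB' % (guest_kib // 1024))
  # On KVM the first number can stay high even when the second exceeds it,
  # because guest RAM is only faulted in as the guests actually touch it.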
Re: [Openstack] Folsom nova-scheduler race condition?
On Wed, Oct 10, 2012 at 3:44 PM, Day, Phil philip@hp.com wrote: Per my understanding, this shouldn't happen no matter how fast you create instances, since the requests are queued and the scheduler updates resource information after it processes each request. The only possible cause of the problem you met that I can think of is that there is more than one scheduler doing scheduling.

I think the new retry logic is meant to be safe even if there is more than one scheduler, as the requests are effectively serialised when they get to the compute manager, which can then reject any that break its actual resource limits?

Yes, but it seems Jonathan's filter list doesn't include RetryFilter, so it's possible that he ran into the race condition that RetryFilter is targeted to solve.

-----Original Message----- From: openstack-bounces+philip.day=hp@lists.launchpad.net [mailto:openstack-bounces+philip.day=hp@lists.launchpad.net] On Behalf Of Huang Zhiteng Sent: 10 October 2012 04:28 To: Jonathan Proulx Cc: openstack@lists.launchpad.net Subject: Re: [Openstack] Folsom nova-scheduler race condition? [...]

-- Regards Huang Zhiteng
Re: [Openstack] Folsom nova-scheduler race condition?
On Tue, Oct 9, 2012 at 10:52 PM, Jonathan Proulx j...@jonproulx.com wrote: Hi All, looking for a sanity test before I file a bug. I very recently upgraded my install to Folsom (on top of Ubuntu 12.04/kvm). My scheduler settings in nova.conf are:

scheduler_available_filters=nova.scheduler.filters.standard_filters
scheduler_default_filters=AvailabilityZoneFilter,RamFilter,CoreFilter,ComputeFilter
least_cost_functions=nova.scheduler.least_cost.compute_fill_first_cost_fn
compute_fill_first_cost_fn_weight=1.0
cpu_allocation_ratio=1.0

This had been working to fill systems based on available RAM and to not exceed a 1:1 allocation ratio of CPU resources with Essex. With Folsom, if I specify a moderately large number of instances to boot, or spin up single instances in a tight shell loop, they all get scheduled on the same compute node, well in excess of the number of available vCPUs. If I start them one at a time (using --poll in a shell loop so each instance is started before the next launches) then I get the expected allocation behaviour.

Per my understanding, this shouldn't happen no matter how fast you create instances, since the requests are queued and the scheduler updates resource information after it processes each request. The only possible cause of the problem you met that I can think of is that there is more than one scheduler doing scheduling.

I see https://bugs.launchpad.net/nova/+bug/1011852 which seems to attempt to address this issue, but as I read it that fix is based on retrying failures. Since KVM is capable of over-committing both CPU and memory, I don't seem to get a retryable failure, just really bad performance. Am I missing something with this fix, perhaps there's a reported bug I didn't find in my search, or is this really a bug no one has reported? Thanks, -Jon

-- Regards Huang Zhiteng
Re: [Openstack] OpenStack Summit Tracks Topics
On Tue, Aug 14, 2012 at 11:29 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: Perhaps off topic, but ... One of the things I've noticed at the last couple of summits are the number of new attendees that could really use an OpenStack 101 session. Many of them are on fact-finding missions and their understanding of the architecture is 10,000'+. Usually when conf's get to this size there's a day beforehand for workshops/tutorials/getting-started stuff. I'm sure it's too late for this coming summit, but perhaps something to consider for later ones? Hands-on, code-level, devstack, configuration, debug. I'd be happy to help out with this. +1! Thoughts? -S From: openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net [openstack-bounces+sandy.walsh=rackspace@lists.launchpad.net] on behalf of Thierry Carrez [thie...@openstack.org] Sent: Tuesday, August 14, 2012 12:19 PM To: openstack@lists.launchpad.net Subject: Re: [Openstack] OpenStack Summit Tracks Topics Lauren Sell wrote: Speaking submissions for the conference-style content are live http://www.openstack.org/summit/san-diego-2012/call-for-speakers/ (basically everything except the Design Summit working sessions which will open for submissions in the next few weeks), and the deadline is August 30. A bit of explanation on the contents for the Design Summit track: The Design Summit track is for developers and contributors to the next release cycle of OpenStack (codenamed Grizzly). Each session is an open discussion on a given technical theme or specific feature to-be-developed in one of the OpenStack core projects. Compared to previous editions, we'll run parallel to rest of the OpenStack Summit and just be one of the tracks for the general event. We'll run over 4 days, but there will be no session scheduled during the general session of the OpenStack Summit (first hours in the morning on Tuesday/Wednesday). Finally, all sessions will be 40-min long, to align with the rest of the event. Within the Design Summit we also used to have classic presentations around Devops, ecosystem and related projects: those will now have their own tracks in the OpenStack Summit (Operations Summit, Related OSS Projects, Ecosystem, Security...), so they are no longer a subpart of the Design Summit track. The Design Summit will be entirely focused on the Grizzly cycle of official OpenStack projects, and entirely made of open discussions. We'll also have some breakout rooms available for extra workgroups and incubated projects. The sessions within the design summit are now organized around Topics. The topics for the Design Summit are the core projects, openstack-common, Documentation and a common Process track to cover the release cycle and infrastructure. Each topic content is coordinated by the corresponding team lead(s). Since most developers are focused on Folsom right now, we traditionally open our call for sessions a bit later (should be opened first week of September). Contributors will be invited to suggest a topic for design summit sessions. After the Folsom release, each topic lead will review the suggestions, merge some of them and come up with an agenda for his/her topic. You can already see the proposed topic layout on the Design Summit topics tab in the document linked in Lauren's email. Comments/Feedback welcome ! 
-- Thierry Carrez (ttx) Release Manager, OpenStack

-- Regards Huang Zhiteng
Re: [Openstack] KVM live block migration: stability, future, docs
But to the contrary: I tested live-migrate (without block migrate) last night using a guest with 8GB of RAM (almost fully committed) and lost any access/contact with the guest for over 4 minutes - it was paused for the duration. Not something I'd want to do to a user's web server on a regular basis...

4 minutes of pause (down time)? That's way too long. Even with a crazy memory-intensive workload inside the VM being migrated, the worst case is that KVM has to pause the VM and transmit all 8GB of memory (all memory dirty, which is very rare). If you have a 1GbE link between the two hosts, that worst-case pause period (down time) is less than 2 minutes. My previous experience is: the down time for migrating one idle (almost no memory access) 8GB VM via 1GbE is less than 1 second; the down time for migrating an 8GB VM whose pages get dirty really quickly is 60 seconds. FYI.

-- Regards Huang Zhiteng
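Back-of-the-envelope arithmetic behind the 'less than 2 minutes' worst case (illustrative only; the 94% usable-bandwidth figure is an assumption):

  # Worst case: pause the guest and push all of its RAM over the wire.
  ram_bytes = 8 * 1024**3             # 8 GiB guest, every page dirty
  link_bytes_per_s = 1e9 / 8 * 0.94   # 1GbE, assuming ~94% usable after overhead
  worst_case_s = ram_bytes / link_bytes_per_s
  print('worst-case transfer of all guest RAM: %.0f seconds' % worst_case_s)
  # => roughly 73 seconds, i.e. comfortably under 2 minutes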
Re: [Openstack] [nova] [cinder] Nova-volume vs. Cinder in Folsom
On Tue, Jul 17, 2012 at 6:12 PM, Howley, Tom tom.how...@hp.com wrote: Will we also have a separate Cinder API server? Yes, we have. Tom -Original Message- From: openstack-bounces+tom.howley=hp@lists.launchpad.net [mailto:openstack-bounces+tom.howley=hp@lists.launchpad.net] On Behalf Of Thomas, Duncan Sent: 17 July 2012 10:47 To: Jay Pipes; openstack@lists.launchpad.net Subject: Re: [Openstack] [nova] [cinder] Nova-volume vs. Cinder in Folsom Jay Pipes on 16 July 2012 18:31 wrote: On 07/16/2012 09:55 AM, David Kranz wrote: Sure, although in this *particular* case the Cinder project is a bit-for-bit copy of nova-volumes. In fact, the only thing really of cause for concern are: * Providing a migration script for the database tables currently in the Nova database to the Cinder database * Ensuring that Keystone's service catalog exposes the volume endpoint along with the compute endpoint so that volume API calls are routed to the right endpoint (and there's nothing preventing a simple URL rewrite redirect for the existing /volumes calls in the Compute API to be routed directly to the new Volumes endpoint which has the same API Plus stand up a new rabbit HA server. Plus stand up a new HA database server. Plus understand the new availability constraints of the nova-cinder interface point Plus whatever else I haven't scoped yet And there are bug fixes and correctness fixes slowly going into Cinder, so it is not a bit--for-bit copy any longer... ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] What is the most commonly used Hypervisor and toolset combination?
I'd suggest CentOS 6.3/RHEL 6.3 + KVM + OpenStack Essex; you may replace CentOS with Ubuntu/Fedora if you want. If you are a big fan of Xen or have huge legacy PV VMs, you might want to try XenServer + XenAPI + OpenStack Essex.

On Wed, Jul 18, 2012 at 12:20 PM, Wang Li fox...@gmail.com wrote: Hi all, my team is trying to deploy OpenStack in a production environment. We tried to get libvirt + Xen 3.4.3 + CentOS 5.4 + OpenStack 2012.2 working, but encountered lots of issues. We already have thousands of virtual machines running in production, and that's why we are trying Xen 3.4.3 and CentOS 5.4. After we solved one problem, more came up, which is very annoying. So my question is: in real production environments using OpenStack, what are the most commonly used hypervisor and toolset? We hope to deploy OpenStack quickly and stay in the mainstream. Thank you. Regards, Wang Li

-- Regards Huang Zhiteng
Re: [Openstack] [nova] [cinder] Nova-volume vs. Cinder in Folsom
+1 for Option 1. On Wed, Jul 11, 2012 at 11:26 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: Hello Everyone, Now that the PPB has decided to promote Cinder to core for the Folsom release, we need to decide what happens to the existing Nova Volume code. As far as I can see it there are two basic strategies. I'm going to give an overview of each here: Option 1 -- Remove Nova Volume == Process --- * Remove all nova-volume code from the nova project * Leave the existing nova-volume database upgrades and tables in place for Folsom to allow for migration * Provide a simple script in cinder to copy data from the nova database to the cinder database (The schema for the tables in cinder are equivalent to the current nova tables) * Work with package maintainers to provide a package based upgrade from nova-volume packages to cinder packages * Remove the db tables immediately after Folsom Disadvantages - * Forces deployments to go through the process of migrating to cinder if they want to use volumes in the Folsom release Option 2 -- Deprecate Nova Volume = Process --- * Mark the nova-volume code deprecated but leave it in the project for the folsom release * Provide a migration path at folsom * Backport bugfixes to nova-volume throughout the G-cycle * Provide a second migration path at G * Package maintainers can decide when to migrate to cinder Disadvantages - * Extra maintenance effort * More confusion about storage in openstack * More complicated upgrade paths need to be supported Personally I think Option 1 is a much more manageable strategy because the volume code doesn't get a whole lot of attention. I want to keep things simple and clean with one deployment strategy. My opinion is that if we choose option 2 we will be sacrificing significant feature development in G in order to continue to maintain nova-volume for another release. But we really need to know if this is going to cause major pain to existing deployments out there. If it causes a bad experience for deployers we need to take our medicine and go with option 2. Keep in mind that it shouldn't make any difference to end users whether cinder or nova-volume is being used. The current nova-client can use either one. Vish ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] Nova and asynchronous instance launching
On Fri, Jun 29, 2012 at 5:19 AM, Devin Carlen de...@openstack.org wrote: On Jun 28, 2012, at 9:01 AM, Jay Pipes wrote: On 06/27/2012 06:51 PM, Doug Davis wrote: Consider the creation of a Job type of entity that will be returned from the original call - probably a 202. Then the client can check the Job to see how things are going. BTW - this pattern can be used for any async op, not just the launching of multiple instances, since technically any op might be long-running (or queued) based on the current state of the system.

Note that much of the job of launching an instance is already asynchronous -- the initial call to create an instance really just creates an instance UUID and returns to the caller -- most of the actual work to create the instance is then done via messaging calls, and the caller can continue to ask for the status of her instance to check on it. In this particular case, I believe Devin is referring to when you indicate you want to spawn a whole bunch of instances, and in that case things happen synchronously instead of asynchronously? Devin, is that correct? If so, it seems like returning a packet immediately that contains a list of the instance UUIDs that can be used for checking status is the best option?

Yep, exactly. The client still waits synchronously for the underlying RPC to complete.

Sounds like a performance issue. I think this symptom could be much eased if we spent some time fixing whatever bottleneck is causing it (slow AMQP, scheduler, or network). Now that the Nova API has multiprocess enabled, we'd move on to the next bottleneck in the long path of 'launching an instance'. Devin, is it possible for you to provide more details about this issue so that someone else can reproduce it? Or am I missing something here?

-jay

-- Regards Huang Zhiteng
Re: [Openstack] HVM + Xen Hypervisor via libvirt possible?
Of course it is possible. What kind of issue did you run into? On Thu, Jun 21, 2012 at 5:52 PM, Wang Li fox...@gmail.com wrote: Hi all, I need to run virtual machines on the Xen hypervisor in HVM mode; is it possible when using libvirt? Regards ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
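A quick sanity check is to ask libvirt whether the host's Xen hypervisor advertises HVM guest support; this small sketch assumes the libvirt Python bindings are installed and that a local Xen hypervisor is reachable at the xen:/// URI:

    # Sketch: check whether the Xen hypervisor reachable via libvirt
    # advertises HVM (fully virtualized) guest support.
    import libvirt

    conn = libvirt.openReadOnly("xen:///")   # assumption: local Xen host
    caps = conn.getCapabilities()            # XML capabilities document
    print("HVM supported:", "hvm" in caps)
    conn.close()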
Re: [Openstack] [Swift][performance degradation] How many objects for PUT/GET in swift-bench is reasonable to you ? I got lots of failure after 50000
On Fri, Jun 22, 2012 at 12:28 AM, Kuo Hugo tonyt...@gmail.com wrote: Hi Folks, I'm in the middle of Swift QA, and I'm also interested in the number of PUT/GET operations in your swift-bench configurations. I thought that swift should handle as much as I set in bench.conf. However, the performance degradation came after 4+. Does my configuration look reasonable to you?
[bench]
auth = http://%swift_ip%:8082/auth/v1.0
user = admin:admin
key = admin
concurrency = 100
object_size = 4
num_objects = 10
num_gets = 10
delete = yes
The performance degradation of swift: PUT dropped from 1200/s to 400/s, GET dropped from 1800/s to 800/s (with around 400 failures), DELETE dropped from 800/s to 300/s (lots of failures).
1. Is my configuration reasonable in reality? You didn't give us the configuration of your Swift. :)
2. I saw that most of the failures are logged as object-server failed to connect to %storage_ip%:%port/%device% ... connection timeout (0.5). What could cause this kind of timeout? Also the load is very low during this period and almost 0 requests are sent to the storage nodes.
3. Another odd behavior: the swift proxy does not consistently send requests to the storage nodes. During my 10 PUT period, the storage nodes' load is not balanced. It might be at 70% load one second and 10% the next. It seems to be a periodic behavior. I'm really confused about this issue.
One generic suggestion is that you may try turning off the auditor/replicator services and re-doing your test to see if it performs any better. -- +Hugo Kuo+ tonyt...@gmail.com +886 935004793 ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
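One way to see whether the connect-timeout failures cluster on particular storage nodes or devices is to tally them from the proxy log; the log path and message format in this sketch are assumptions and will vary with the syslog setup:

    # Sketch: count which storage node:port/device the proxy's connect or
    # timeout errors point at, to see whether a few nodes dominate.
    import re
    from collections import Counter

    PATTERN = re.compile(r"(\d+\.\d+\.\d+\.\d+:\d+/\w+)")
    counts = Counter()
    with open("/var/log/swift/proxy-server.log") as f:   # assumed log location
        for line in f:
            if "Timeout" in line or "connect" in line.lower():
                m = PATTERN.search(line)
                if m:
                    counts[m.group(1)] += 1

    for target, n in counts.most_common(10):
        print(target, n)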
Re: [Openstack] Performance metrics
On Thu, Jun 21, 2012 at 12:36 AM, Rick Jones rick.jon...@hp.com wrote: I do not have numbers I can share, but do have an interest in discussing methodology for evaluating scaling, particularly as regards networking. My initial thoughts are simply starting with what I have done for network scaling on SMP systems (as vaguely instantiated in the likes of the runemomniaggdemo.sh script under http://www.netperf.org/svn/netperf2/trunk/doc/examples/ ) though expanding it by adding more and more VMs/hypervisors etc as one goes. By 'network scaling', do you mean the aggregated throughput (bandwidth, packets/sec) of the entire cloud (or part of it)? I think picking 'netperf' as a micro-benchmark is just the first step; there's more work that needs to be done. For OpenStack networking, there's 'intra-cloud' and 'cloud-to-external-world' throughput. If we care about the performance end users see, then reasonable numbers (for network scaling) should be captured inside VM instances. For example, spawn 1,000 VM instances across the cloud, then pair them up to do 'netperf' tests in order to measure 'intra-cloud' network throughput. While netperf (or its like) is simply a microbenchmark, and so somewhat removed from reality, it does have the benefit of not (directly at least :) leaking anything proprietary about what is going on in any one vendor's environment. And if something will scale well under the rigors of netperf workloads it will probably scale well under real workloads. Such scaling under netperf may not be necessary, but it should be sufficient. happy benchmarking, rick jones -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
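A very small sketch of the kind of harness that could drive pairwise netperf runs between instances and sum the results; the instance IPs, SSH access to them, and netperf being installed inside the guests are all assumptions:

    # Sketch: run netperf between VM pairs and aggregate TCP_STREAM throughput.
    import subprocess

    pairs = [("10.0.0.11", "10.0.0.12"), ("10.0.0.13", "10.0.0.14")]  # placeholders
    total = 0.0
    for src, dst in pairs:
        # netperf's TCP_STREAM summary ends with throughput in 10^6 bits/sec
        out = subprocess.check_output(
            ["ssh", src, "netperf", "-H", dst, "-l", "30", "-t", "TCP_STREAM"],
            universal_newlines=True)
        total += float(out.split()[-1])
    print("aggregate throughput: %.1f Mbit/s" % total)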
Re: [Openstack] could you give me some tips on the process of how to change the scheduler algorithm
I don't quite understand your question, but if what you want is a scheduler policy that always schedules instances onto the physical node with the least load (i.e., the fewest vCPUs in use), then SimpleScheduler is the answer. Or, since you talked about 'processing two commands at the same time', are you looking for a multiprocess Nova API service? Like this https://review.openstack.org/#/c/5762/ ? On Mon, Jun 4, 2012 at 3:09 PM, 吴联盟 wulianmeng4643...@163.com wrote: Hi all, I want to change the default scheduling algorithm of nova, as follows: when you use commands like this: euca-run-instances -k test -t m1.tiny ami-tiny -n1 euca-run-instances -k test -t m1.small ami-small -n2 I want to process the two commands at the same time (right now they are treated separately: after the first command, the instance it specified is running, and only then is the following command executed). I mean, to handle the information from both commands at the same time. That is, first find the most reasonable host for each instance, whichever command it comes from, and then create the instances on those hosts. More exactly, to find a way to place the instances specified by the above commands on the physical machines at the same time so that we use the fewest physical machines. But now I am confused about the flow of a command from api to amqp to scheduler; I wonder whether anybody can give me any help on that? Thanks a lot! ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
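To illustrate the "place several requests at once to use the fewest hosts" idea, here is a toy first-fit-decreasing sketch; it is purely illustrative and is not how the Nova scheduler is structured, and the request and host data are made up:

    # Sketch: pack instance requests onto hosts by vCPU count, largest first,
    # so that both commands are considered together rather than one by one.
    def pack(requests, hosts):
        """requests: list of (name, vcpus); hosts: dict host -> free vcpus."""
        placement = {}
        for name, vcpus in sorted(requests, key=lambda r: r[1], reverse=True):
            # try the host with the least remaining free capacity that still fits
            for host, free in sorted(hosts.items(), key=lambda h: h[1]):
                if free >= vcpus:
                    hosts[host] -= vcpus
                    placement[name] = host
                    break
        return placement

    reqs = [("m1.tiny-1", 1), ("m1.small-1", 2), ("m1.small-2", 2)]
    print(pack(reqs, {"node1": 4, "node2": 4}))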
[Openstack] [OpenStack][Nova]Reviewer needed for: blueprint multi-process-api-service
Hey guys, The implementation of Nova blueprint multi-process-api-service (https://review.openstack.org/#/c/5762/) has gone through several rounds of revision and got a '+1' from one Nova core (thanks Kevin for your review)! Could other Nova cores help review this change? I'd really appreciate that! Regards, HUANG, Zhiteng Intel SSG/SSD/SOTC/PRC Scalability Lab ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] [Ops] OpenStack and Operations: Input from the Wild
Thanks. Now I understand the performance metrics you guys were talking about. It'd be good if we can have some tool reporting numbers for a cloud just like 'mpstat', 'iostat' did for a system. On Mon, Apr 9, 2012 at 3:06 PM, Tim Bell tim.b...@cern.ch wrote: Availability metrics for me are ones that allow me to tell if the service is up, degraded or down. Each of us as we start production monitoring need to work out how many nova, glance and swift processes of which type should be running. Furthermore, we need to add basic 'ping' style probes to see that the services are responding as expected. Performance metrics are for cases where we want to record how well the system is running. Examples of number of REST calls/second, VMs created/second etc. These are the kind of metrics which feed into capacity planning, bottleneck identification, trending. Building up an open, standard and consistent set will avoid duplicate effort as sites deploy to production and allow us to keep the monitoring up to date when the internals of OpenStack change. Tim
From: Huang Zhiteng [mailto:winsto...@gmail.com] Sent: 09 April 2012 05:42 To: Tim Bell Cc: David Kranz; Andrew Clay Shafer; openstack-operat...@lists.openstack.org; Duncan McGreggor; openstack Subject: Re: [Openstack] [Ops] OpenStack and Operations: Input from the Wild
Hi Tim, Could you elaborate more on 'performance metrics'? Like what kind of metrics are considered as performance ones? Thanks. On Sat, Apr 7, 2012 at 2:13 AM, Tim Bell tim.b...@cern.ch wrote: Splitting monitoring into
1. Gathering of metrics (availability, performance) and reporting in a standard fashion should be part of OpenStack.
2. Best practice sensors should sample the metrics and provide alarms for issues which could cause service impacts. Posting of these alarms to a monitoring system should be based on plug ins
3. Reference implementations for standard monitoring systems such as Nagios should be available that queries the data above and feeds it into the package selected
Each site does not want to be involved in defining the best practice. Equally, each monitoring system should not have to have an intimate understanding of OpenStack to produce a red/green light. The components for 1 and 2 fall under the associated openstack component. Component 3 is the monitoring solution provider. Tim
From: openstack-bounces+tim.bell=cern...@lists.launchpad.net [mailto: openstack-bounces+tim.bell=cern...@lists.launchpad.net] On Behalf Of David Kranz Sent: 06 April 2012 16:44 To: Andrew Clay Shafer Cc: openstack-operat...@lists.openstack.org; openstack; Duncan McGreggor Subject: Re: [Openstack] [Ops] OpenStack and Operations: Input from the Wild
This is a really great list! With regard to cluster health and monitoring, I did a bunch of stuff with Swift before turning to nova and really appreciated the way each swift service has a healthcheck call that can be used by a monitoring system. While I don't think providing a production-ready monitoring system should be part of core OpenStack, it is the core architects who really know what needs to be checked to ensure that a system is healthy. There are various sets of poking at ports, process lists and so on that Crowbar, Zenoss, etc. set up but it would be a big improvement for deployers if each openstack service provided healthcheck apis based on expert knowledge of what is supposed to be happening inside. 
That would also insulate deployers from changes in the code that might impact what it means to be running properly. Looking forward to the discussion. -David On 4/6/2012 1:06 AM, Andrew Clay Shafer wrote: Interested in devops. Off the top of my head:
* live upgrades
* api queryable indications of cluster health
* api queryable cluster version and configuration info
* enabling monitoring as a first class concern in OpenStack (either as a cross cutting concern, or as its own project)
* a framework for gathering and sharing performance benchmarks with architecture and configuration
On Thu, Apr 5, 2012 at 1:52 PM, Duncan McGreggor dun...@dreamhost.com wrote: For anyone interested in DevOps, Ops, cloud hosting management, etc., there's a proposed session we could use your feedback on for topics of discussion: http://summit.openstack.org/sessions/view/57 Respond with your thoughts and ideas, and I'll be sure to add them to the list. Thanks! d ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
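Swift already ships the kind of healthcheck middleware David describes; a minimal red/green probe against it could look like the sketch below, where the host names, ports, and the assumption that every service pipeline includes the healthcheck middleware are placeholders rather than a statement about any particular deployment:

    # Sketch: red/green probe against Swift-style /healthcheck endpoints.
    import requests

    SERVICES = {                      # placeholder endpoints
        "proxy":  "http://proxy01:8080/healthcheck",
        "object": "http://object01:6000/healthcheck",
    }

    for name, url in SERVICES.items():
        try:
            ok = requests.get(url, timeout=2).status_code == 200
        except requests.RequestException:
            ok = False
        print("%s: %s" % (name, "UP" if ok else "DOWN"))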
Re: [Openstack] [Ops] OpenStack and Operations: Input from the Wild
Hi Tim, Could you elaborate more on 'performance metrics'? Like what kind of metrics are considered as performance ones? Thanks. On Sat, Apr 7, 2012 at 2:13 AM, Tim Bell tim.b...@cern.ch wrote: Splitting monitoring into
1. Gathering of metrics (availability, performance) and reporting in a standard fashion should be part of OpenStack.
2. Best practice sensors should sample the metrics and provide alarms for issues which could cause service impacts. Posting of these alarms to a monitoring system should be based on plug ins
3. Reference implementations for standard monitoring systems such as Nagios should be available that queries the data above and feeds it into the package selected
Each site does not want to be involved in defining the best practice. Equally, each monitoring system should not have to have an intimate understanding of OpenStack to produce a red/green light. The components for 1 and 2 fall under the associated openstack component. Component 3 is the monitoring solution provider. Tim
-- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
Re: [Openstack] Configure Rate limits on OS API
I have a relevant question about API throughput. I'm using the Diablo release and I found out that there's only one process of nova-api to handle all API requests, even from different users. That results in very low (10) API throughput (defined as API requests handled per second). A similar symptom was observed back in the Cactus release. Is there any way to configure the API service to run in multi-threaded mode so that it can utilize multi-core hardware? I'd really appreciate it if anyone could also shed some light on why the Nova API was designed to work this way. Thank you in advance. On Tue, Dec 20, 2011 at 3:28 AM, Day, Phil philip@hp.com wrote: Hi Folks, Is there a file that can be used to configure the API rate limits for the OS API on a per user basis? I can see where the default values are set in the code, but it looks as if there should be a less brutal configuration mechanism to go along with this? Thanks Phil ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
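To put a number on "API requests handled per second", a small concurrent probe is usually enough; the endpoint URL, token, request count, and concurrency level in this sketch are all assumptions:

    # Sketch: measure roughly how many API GETs per second an endpoint sustains.
    import time
    from concurrent.futures import ThreadPoolExecutor
    import requests

    URL = "http://controller:8774/v2/TENANT_ID/servers"   # placeholder endpoint
    HEADERS = {"X-Auth-Token": "TOKEN"}                    # placeholder token
    N, WORKERS = 200, 20

    def one(_):
        return requests.get(URL, headers=HEADERS).status_code

    start = time.time()
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        list(pool.map(one, range(N)))
    print("%.1f requests/sec" % (N / (time.time() - start)))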
Re: [Openstack] Writes are faster than reads in Swift
Can anyone explain why Swift doesn't want to utilize page cache _at all_? On Wed, Dec 14, 2011 at 12:32 PM, Michael Barton mike-launch...@weirdlooking.com wrote: I can't explain it off the top of my head. I don't have a swift installation to play with at the moment, but it's conceivable that posix_fadvise is slower than we expect (drop_cache is called more frequently during reads than writes, iirc). That could be tested by making drop_cache a no-op in the object server. Or running the object server under a profiler during both operations might shed some light on what is taking so much time. --Mike On Mon, Dec 12, 2011 at 8:44 AM, Zhenhua (Gerald) Guo jen...@gmail.com wrote: Hi, folks Recently, I have run some read/write tests for large files (400GB) in Swift. I found that writes were always faster than reads, which is kinda counter-intuitive. What may be the cause? Has anyone else seen the same problem? Gerald ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
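The drop_cache behaviour Mike mentions boils down to a posix_fadvise(DONTNEED) call after I/O, which is easy to reproduce (and to disable) outside Swift; this sketch needs Python 3.3+ on Linux, and the file path is a placeholder:

    # Sketch: read a file, then tell the kernel to drop its page cache,
    # roughly what a drop_cache()-style helper does after reads/writes.
    import os

    fd = os.open("/srv/node/sdb1/some-object", os.O_RDONLY)   # placeholder path
    try:
        while os.read(fd, 1 << 20):
            pass
        # comment this line out to keep the file's pages in the cache
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)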
Re: [Openstack] Glance client release?
That'll be very helpful. Thanks On Wed, Sep 28, 2011 at 11:26 PM, Jay Pipes jaypi...@gmail.com wrote: We should be able to do that, yes. I have to figure out how to do it, but I will create a bug for it in Launchpad and track progress. Cheers, jay On Wed, Sep 28, 2011 at 11:05 AM, Huang Zhiteng winsto...@gmail.com wrote: I am also looking forward to python-glance being available on PyPI. Will it be released with Diablo? On Sat, Sep 24, 2011 at 3:02 AM, Jay Pipes jaypi...@gmail.com wrote: On Fri, Sep 23, 2011 at 2:51 PM, Devin Carlen devin.car...@gmail.com wrote: Awesome, thanks! Any plans to have the client available on PyPI? Makes testing a lot easier. Yup! -jay ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp -- Regards Huang Zhiteng -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
[Openstack] How to limit the total virtual processors/memory for one compute node?
Hi all, In my setup of Cactus, I found the Nova scheduler would place a newly created instance on a compute node that is already fully occupied (in terms of memory or # of virtual processors), which leads to swapping and VP overcommitting. That can cause serious performance issues in a busy environment. So I was wondering if there's some kind of mechanism to limit the resources one compute node could use, something like the 'weight' in OpenNebula. I'm using Cactus (with GridDynamic's RHEL package), the default scheduler policy, and one zone only. Any suggestions? -- Regards Huang Zhiteng ___ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp
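For context, the kind of guard being asked about amounts to a per-host capacity check before placement; this is a toy illustration only, with made-up data structures and ratios, and it is not the Cactus scheduler code:

    # Sketch: skip hosts whose RAM or vCPUs would be overcommitted beyond a
    # configured ratio, instead of letting instances land on a full node.
    CPU_RATIO, RAM_RATIO = 1.0, 1.0      # 1.0 = no overcommit allowed

    def host_fits(host, flavor):
        """host/flavor are plain dicts: total/used vcpus and RAM in MB."""
        cpu_ok = host["vcpus_used"] + flavor["vcpus"] <= host["vcpus"] * CPU_RATIO
        ram_ok = host["ram_used"] + flavor["ram"] <= host["ram"] * RAM_RATIO
        return cpu_ok and ram_ok

    host = {"vcpus": 16, "vcpus_used": 15, "ram": 32768, "ram_used": 30000}
    print(host_fits(host, {"vcpus": 2, "ram": 4096}))   # False: would overcommit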