Re: [Openstack] [Cinder] HP MSA array as a secondary backend is only used when the user is an admin
just a bump. can anyone offer any advice on this cinder driver cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver? thanks! -- Jim On Thu, Oct 11, 2018 at 4:08 PM Jim Okken wrote: > hi All, > > not sure if I can find an answer here to this specific situation with the > cinder backend driver cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver. > If not how can I get in touch with someone more familiar with > cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver > > we have a HP MSA storage array connected to most of our compute nodes and > we are using the cinder driver > cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver as a second backend so > that openstack can, if directed by metadata, create volumes on it during > instance creation. Openstack creates volumes using this MSA backend if the > metadata of the image selected contains "cinder_image_volume_type=MSA". > This second MSA type of volume was added to cinder. > > We use a CentOS-6-x86_64-GenericCloud-1707.qcow2 image which has this > metadata added. Without this metadata RBD/CEPH images are made > > This works great for the admin user but not for a regular _ member_ user. > > With the admin user volumes created show Type=*MSA* and > Host=node-44.domain.com@*MSA#A*. (correct) > > With the _member_ user volumes created show Type=*MSA* but > Host=rbd:volumes@RBD-backend#*RBD-backend (this is CEPH, incorrect!)*. > > And I can confirm the volume is not on the MSA. Correct RBD/CEPH volumes > show Type=*volumes_ceph* and Host=rbd:volumes@RBD-backend#*RBD-backend*. > > This happens if the cinder volume type is created as a Private type or a > Public type. > > I have tried to set the properties on the cinder MSA volume type for the > specific project we want to use this volume type in, and to set the > project-domain for this volume type. nothing has helped. > > can anyone shed any light on this behavior or point out anything helpful > in the logs pls? > > Looking at the logs I do see the _ member_ user is a non-default-domain > user while admin is obviously the default domain. other than that I can't > make heads or tails of the logs. > > Here are logs if anyone wants to look at them: > a bad _ member_ volume creation was UUID > fb9047c3-1b6b-4d2b-bae8-5177e86eb1f2 https://pastebin.com/bmFAy6RR > > a good admin volume creation was UUID b49e33db-8ab8-489f-b7cb-092f421178c1 > https://pastebin.com/5SAecNJ2 > > We are using Newton, thanks!!! > > > -- Jim > ___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
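A quick way to confirm where the scheduler actually placed a volume, and what the MSA type advertises, is the cinder CLI available in Newton (a sketch; the UUID below is the failing volume already cited in this thread):

    cinder show fb9047c3-1b6b-4d2b-bae8-5177e86eb1f2 | grep -E 'volume_type|os-vol-host-attr:host'
    cinder type-show MSA    # inspect extra_specs and whether the type is public

If type-show reports no extra_specs at all, the scheduler has nothing tying the MSA type to the MSA backend, which would explain it falling back to the RBD host.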
Re: [Openstack] [Cinder] HP MSA array as a secondary backend is only used when the user is an admin
hi All, not sure if I can find an answer here to this specific situation with the cinder backend driver cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver. If not how can I get in touch with someone more familiar with cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver we have a HP MSA storage array connected to most of our compute nodes and we are using the cinder driver cinder.volume.drivers.san.hp.hpmsa_fc.HPMSAFCDriver as a second backend so that openstack can, if directed by metadata, create volumes on it during instance creation. Openstack creates volumes using this MSA backend if the metadata of the image selected contains "cinder_image_volume_type=MSA". This second MSA type of volume was added to cinder. We use a CentOS-6-x86_64-GenericCloud-1707.qcow2 image which has this metadata added. Without this metadata RBD/CEPH images are made This works great for the admin user but not for a regular _ member_ user. With the admin user volumes created show Type=*MSA* and Host=node-44.domain.com@*MSA#A*. (correct) With the _member_ user volumes created show Type=*MSA* but Host=rbd:volumes@RBD-backend#*RBD-backend (this is CEPH, incorrect!)*. And I can confirm the volume is not on the MSA. Correct RBD/CEPH volumes show Type=*volumes_ceph* and Host=rbd:volumes@RBD-backend#*RBD-backend*. This happens if the cinder volume type is created as a Private type or a Public type. I have tried to set the properties on the cinder MSA volume type for the specific project we want to use this volume type in, and to set the project-domain for this volume type. nothing has helped. can anyone shed any light on this behavior or point out anything helpful in the logs pls? Looking at the logs I do see the _ member_ user is a non-default-domain user while admin is obviously the default domain. other than that I can't make heads or tails of the logs. Here are logs if anyone wants to look at them: a bad _ member_ volume creation was UUID fb9047c3-1b6b-4d2b-bae8-5177e86eb1f2 https://pastebin.com/bmFAy6RR a good admin volume creation was UUID b49e33db-8ab8-489f-b7cb-092f421178c1 https://pastebin.com/5SAecNJ2 We are using Newton, thanks!!! -- Jim ___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
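For anyone debugging the same symptom: a minimal sketch of binding the type to the backend and granting the non-admin project access, assuming the MSA backend section in cinder.conf sets volume_backend_name=MSA and substituting the real tenant UUID:

    cinder type-key MSA set volume_backend_name=MSA
    cinder type-access-add --volume-type MSA --project-id <tenant-uuid>
    cinder type-access-list --volume-type MSA

Without the volume_backend_name extra spec, the capacity/capability filters are free to choose any enabled backend, which matches the behavior described above.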
Re: [Openstack] [Fuel] add custom settings to a fuel deploy
hi Jitendra, thanks very much for your reply! We deploy with the UI and right now the environment is set and we only add compute nodes. In the past we deployed one compute node at a time but now that we understand the process we deploy multiple, 3 or 5 compute nodes at a time. Honestly though going fwd it could be 1 or multiple nodes at a time. This is an growing internal-use environment, but we have 23 compute nodes right now, so we are going to be growing it slower going fwd. Right now we have some simple shell scripts which we run after a successful deploy, these set the settings in the config files and restart openstack services. But until those scripts are run the environment is missing those needed additions and not really usable. Not a huge problem for an internal-use environment, but we would like to have no downtime. Also it is HA so we have 3 controllers. thanks!! -- Jim On Tue, May 1, 2018 at 3:39 PM, Jitendra Kumar Bhaskar < jitendr...@pramati.com> wrote: > Hi Jim, > > I can help you one that, but before that wanted to understand how are you > deploying the additional computes: > 1. If CLI then share the command that you used to deploy. > 2. If UI then are you deploying only one node after selection ? > > > Regards > Jitendra Bhaskar > > Regards > Bhaskar > +1-469-514-7986 > > > > > > On Tue, May 1, 2018 at 12:21 PM, Jim Okken <j...@jokken.com> wrote: > >> Hi list, >> >> >> >> We’ve created a pretty large openstack Newton HA environment using fuel. >> After initial hiccups with deployment (not all fuel troubles) we can now >> add additional compute nodes to the environment with ease! >> >> Thank you for all who’ve worked on all the projects to make this product. >> >> >> >> My question has to do with something I think I should know already: How >> can we get fuel to stop overwriting custom settings in our environment? >> When we deploy new compute nodes, original openstack settings on all nodes >> are re-deployed/re-set. >> >> >> >> For example we have changes to settings in these files on the controller >> nodes. >> >> >> >> /etc/nova/nova.conf >> >> /etc/neutron/dhcp_agent.ini >> >> /etc/neutron/plugins/ml2/openvswitch_agent.ini >> >> /etc/openstack-dashboard/local_settings.py >> >> /etc/keystone/keystone.conf >> >> /etc/cinder/cinder.conf >> >> /etc/neutron/neutron.conf >> >> >> >> I’m guessing the method to resolve this is not to stop fuel from >> overwriting settings, but to add to fuel some tasks that sets these custom >> settings again near the end of each deploy. >> >> >> >> I’m sure this is something I am supposed to know already, but so far in >> my route thru Openstack land experience with this has escaped me. >> >> Can you send me some advice, pointers, places to start? >> >> >> >> Thanks! >> >> >> >> --jim >> >> >> ___ >> Mailing list: http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> Post to : openstack@lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> >> > > Disclaimer: > The contents of this email and any attachments are confidential. They are > intended for the named recipient(s) only. If you have received this email > by mistake, please notify the sender immediately and do not disclose the > contents to anyone or make copies thereof. > ___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
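One way to stop relying on hand-run scripts after each deploy is to package them as a custom deployment graph on the Fuel master and execute it after adding nodes. A sketch of the workflow; the exact flag names below are from memory of the Fuel 10 fuel2 CLI and should be checked against fuel2 graph --help:

    fuel2 graph upload --env 1 --type custom_post --file custom_post.yaml
    fuel2 graph execute --env 1 --type custom_post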
[Openstack] [Fuel] add custom settings to a fuel deploy
Hi list, We’ve created a pretty large openstack Newton HA environment using fuel. After initial hiccups with deployment (not all fuel troubles) we can now add additional compute nodes to the environment with ease! Thank you for all who’ve worked on all the projects to make this product. My question has to do with something I think I should know already: How can we get fuel to stop overwriting custom settings in our environment? When we deploy new compute nodes, original openstack settings on all nodes are re-deployed/re-set. For example we have changes to settings in these files on the controller nodes. /etc/nova/nova.conf /etc/neutron/dhcp_agent.ini /etc/neutron/plugins/ml2/openvswitch_agent.ini /etc/openstack-dashboard/local_settings.py /etc/keystone/keystone.conf /etc/cinder/cinder.conf /etc/neutron/neutron.conf I’m guessing the method to resolve this is not to stop fuel from overwriting settings, but to add to fuel some tasks that sets these custom settings again near the end of each deploy. I’m sure this is something I am supposed to know already, but so far in my route thru Openstack land experience with this has escaped me. Can you send me some advice, pointers, places to start? Thanks! --jim ___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
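For reference, a task that re-applies custom settings near the end of a deploy is usually a small YAML entry of type shell. The sketch below follows the Fuel plugin tasks.yaml convention and assumes a script already distributed to the controllers; ids, roles, and the script path are illustrative:

    - id: reapply_custom_settings
      type: shell
      version: 2.0.0
      role: ['primary-controller', 'controller']
      requires: [post_deployment_start]
      required_for: [post_deployment_end]
      parameters:
        cmd: /usr/local/bin/apply_custom_settings.sh
        timeout: 300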
Re: [Openstack] compute nodes down
I believe this issue turned out to be the shared storage device we are using for shared storage to each compute node. it had an access issue and one instance's vHD files had access attempts that hung forever and never timed out. this make sense for one node to be having nova issues. But could this cause all compute nodes to have nova services to stop after some time? (in a shared storage setup does each node access/query each vHD on the storage periodically?) thanks! -- Jim On Tue, Dec 19, 2017 at 3:45 AM, Tobias Urdin <tobias.ur...@crystone.com> wrote: > Enable debug in nova.conf and check conductor and compute logs. > > Check that your clock is in-sync with NTP or you might experience that the > alive checks in the database exceeds the service_down_time config value. > > On 12/19/2017 12:09 AM, Jim Okken wrote: > > hi list, > > hoping someone could shed some light on this issue I just started seeing > today > > all my compute nodes started showing as "Down" in the Horizon -> > Hypervisors -> Compute Nodes tab > > > root@node-1:~# nova service-list > +-+--+---+--+--- > --+---++-+ > | Id | Binary | Host | Zone | Status | State > | Updated_at | Disabled Reason | > +-+--+---+--+--- > --+---++-+ > | 325 | nova-compute | node-9.mydom.com | nova | enabled | down > | 2017-12-18T21:59:38.00 | - | > | 448 | nova-compute | node-14.mydom.com | nova | enabled | up > | 2017-12-18T22:41:42.00 | - | > | 451 | nova-compute | node-17.mydom.com | nova | enabled | up > | 2017-12-18T22:42:04.00 | - | > | 454 | nova-compute | node-11.mydom.com | nova | enabled | up > | 2017-12-18T22:42:02.00 | - | > | 457 | nova-compute | node-12.mydom.com | nova | enabled | up > | 2017-12-18T22:42:12.00 | - | > | 472 | nova-compute | node-16.mydom.com | nova | enabled | down > | 2017-12-18T00:16:01.00 | - | > | 475 | nova-compute | node-10.mydom.com | nova | enabled | down > | 2017-12-18T00:26:09.00 | - | > | 478 | nova-compute | node-13.mydom.com | nova | enabled | down > | 2017-12-17T23:54:06.00 | - | > | 481 | nova-compute | node-15.mydom.com | nova | enabled | up > | 2017-12-18T22:41:34.00 | - | > | 484 | nova-compute | node-8.mydom.com | nova | enabled | down > | 2017-12-17T23:55:50.00 | - | > > > if I stop and the start nova-compute on the down nodes the stop will take > several minutes and then the start will be quick and fine. but after about > 2 hours the nova-compute service will show down again. > > i am not seeing any ERRORS in nova logs. 
> > I get this for the status of a node that is showing as "UP" > > > > root@node-14:~# systemctl status nova-compute.service > â nova-compute.service - OpenStack Compute >Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled; > vendor preset: enabled) >Active: active (running) since Mon 2017-12-18 21:57:10 UTC; 35min ago > Docs: man:nova-compute(1) > Process: 32193 ExecStartPre=/bin/chown nova:adm /var/log/nova > (code=exited, status=0/SUCCESS) > Process: 32190 ExecStartPre=/bin/chown nova:nova /var/lock/nova > /var/lib/nova (code=exited, status=0/SUCCESS) > Process: 32187 ExecStartPre=/bin/mkdir -p /var/lock/nova /var/log/nova > /var/lib/nova (code=exited, status=0/SUCCESS) > Main PID: 32196 (nova-compute) >CGroup: /system.slice/nova-compute.service >ââ32196 /usr/bin/python /usr/bin/nova-compute > --config-file=/etc/nova/nova-compute.conf --config-file=/etc/nova/nova.conf > --log-file=/var/log/nova/nova-compute.log > > Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 > 22:31:47.570 32196 DEBUG oslo_messaging._drivers.amqpdriver > [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id: > 2877b9707da144f3a91e7b80e2705fb3 exchange 'nova' topic 'conductor' _send > /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448 > Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 > 22:31:47.604 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received > reply msg_id: 2877b9707da144f3a91e7b80e2705fb3 __call__ > /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296 > Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 > 22:31:47.605 32196 INFO nova.compute.resource_tracker > [req-f30b2
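If the shared-storage theory is right, the stuck accesses should show up as processes in uninterruptible sleep on the affected compute nodes. A quick read-only check for that, plus the clock-sync point raised above (Ubuntu 16.04 commands):

    ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /^D/'   # D state = blocked in the kernel, usually on I/O
    timedatectl status | grep -i ntp                 # should report "NTP synchronized: yes"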
[Openstack] compute nodes down
hi list, hoping someone could shed some light on this issue I just started seeing today all my compute nodes started showing as "Down" in the Horizon -> Hypervisors -> Compute Nodes tab root@node-1:~# nova service-list +-+--+---+--+-+---++-+ | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | +-+--+---+--+-+---++-+ | 325 | nova-compute | node-9.mydom.com | nova | enabled | down | 2017-12-18T21:59:38.00 | - | | 448 | nova-compute | node-14.mydom.com | nova | enabled | up| 2017-12-18T22:41:42.00 | - | | 451 | nova-compute | node-17.mydom.com | nova | enabled | up| 2017-12-18T22:42:04.00 | - | | 454 | nova-compute | node-11.mydom.com | nova | enabled | up| 2017-12-18T22:42:02.00 | - | | 457 | nova-compute | node-12.mydom.com | nova | enabled | up| 2017-12-18T22:42:12.00 | - | | 472 | nova-compute | node-16.mydom.com | nova | enabled | down | 2017-12-18T00:16:01.00 | - | | 475 | nova-compute | node-10.mydom.com | nova | enabled | down | 2017-12-18T00:26:09.00 | - | | 478 | nova-compute | node-13.mydom.com | nova | enabled | down | 2017-12-17T23:54:06.00 | - | | 481 | nova-compute | node-15.mydom.com | nova | enabled | up| 2017-12-18T22:41:34.00 | - | | 484 | nova-compute | node-8.mydom.com | nova | enabled | down | 2017-12-17T23:55:50.00 | - | if I stop and the start nova-compute on the down nodes the stop will take several minutes and then the start will be quick and fine. but after about 2 hours the nova-compute service will show down again. i am not seeing any ERRORS in nova logs. I get this for the status of a node that is showing as "UP" root@node-14:~# systemctl status nova-compute.service â nova-compute.service - OpenStack Compute Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled; vendor preset: enabled) Active: active (running) since Mon 2017-12-18 21:57:10 UTC; 35min ago Docs: man:nova-compute(1) Process: 32193 ExecStartPre=/bin/chown nova:adm /var/log/nova (code=exited, status=0/SUCCESS) Process: 32190 ExecStartPre=/bin/chown nova:nova /var/lock/nova /var/lib/nova (code=exited, status=0/SUCCESS) Process: 32187 ExecStartPre=/bin/mkdir -p /var/lock/nova /var/log/nova /var/lib/nova (code=exited, status=0/SUCCESS) Main PID: 32196 (nova-compute) CGroup: /system.slice/nova-compute.service ââ32196 /usr/bin/python /usr/bin/nova-compute --config-file=/etc/nova/nova-compute.conf --config-file=/etc/nova/nova.conf --log-file=/var/log/nova/nova-compute.log Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.570 32196 DEBUG oslo_messaging._drivers.amqpdriver [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id: 2877b9707da144f3a91e7b80e2705fb3 exchange 'nova' topic 'conductor' _send /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448 Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.604 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: 2877b9707da144f3a91e7b80e2705fb3 __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296 Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.605 32196 INFO nova.compute.resource_tracker [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Total usable vcpus: 40, total allocated vcpus: 0 Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.606 32196 INFO nova.compute.resource_tracker [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Final resource view: name=node-14.mydom.com phys_ram=128812MB used_ram=512MB phys_disk=6691GB used_disk=0GB 
total_vcpus=40 used_vcpus=0 pci_stats=[] Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.610 32196 DEBUG oslo_messaging._drivers.amqpdriver [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] CALL msg_id: ad32abe833f4440d86c15b911aa35c43 exchange 'nova' topic 'conductor' _send /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:448 Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.632 32196 DEBUG oslo_messaging._drivers.amqpdriver [-] received reply msg_id: ad32abe833f4440d86c15b911aa35c43 __call__ /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:296 Dec 18 22:31:47 node-14.mydom.com nova-compute[32196]: 2017-12-18 22:31:47.633 32196 WARNING nova.scheduler.client.report [req-f30b2331-2097-4981-89c8-acea4a81f7f2 - - - - -] Unable to refresh my resource provider record Dec
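For reference, the two nova.conf settings that govern the up/down decision are shown below with their defaults. Raising service_down_time can hide occasional missed heartbeats, but it will not fix a hung-I/O root cause:

    [DEFAULT]
    report_interval = 10      # seconds between nova-compute heartbeats to the DB
    service_down_time = 60    # heartbeat age after which the service is reported down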
Re: [Openstack] soft lockup on Newton compute nodes
just an update for this question. the issue is resolved with a kernel update upgrading multiple compute nodes from kernel 4.4.0.93 to 4.4.0.98 fixed the softlockup issue. Also this kernel change does not seem to have broken anything else in openstack -- Jim On Fri, Nov 10, 2017 at 9:50 AM, Jim Okken <j...@jokken.com> wrote: > = UPDATE 11/10 == > hi again, > > based on some advice from a member of this mailing list we've been looking > into kernel and driver versions of our compute nodes > > We also have plain non openstack "KVM on Ubuntu" servers for testing. > > I looked at driver and kernel differences between these Ubuntu 16 w/ KVM > systems and our openstack compute nodes. I found Ubuntu 16 w/ KVM was at > kernel version 4.4.0-87 and that the openstack compute nodes were at > 4.4.0-93. So I upgraded the Ubuntu 16 w/ KVM to 4.4.0-93 and was able to > reproduce this problem (but only on the exact HP hardware that is our > openstack compute nodes, and not on other hardware). > Next I updated these Ubuntu 16 w/ KVM to 4.4.0-98 and the problem no > longer occured! > > I need to upgrade a few openstack compute nodes to 4.4.0-98 and test. Do > anyone think this kernel change could break openstack? > > In the kernel change log I found a fix for a specific HP server in > 4.4.0-98 (not the same as our server but somewhat similar) > > thanks! > > -- Jim > > On Mon, Oct 23, 2017 at 10:25 PM, Jim Okken <j...@jokken.com> wrote: > >> = UPDATE 10/23 == >> >> we have been trying different things to get better debug we disabled >> rate-limiting in order to get better info in /var/log/message. for some >> reason (maybe unrelated) we didn't get the soft lockup during this test But >> this time we got openvswitch, br_netfilter, etc in the call trace in >> /var/log/messages >> >> Please advise in any way! thx!! >> >> basically we are running various types of SIP/RTP test traffic between 2 >> instances (on different compute nodes). This time instead of one hypervisor >> getting the errors both hypervisors did, but neither got the soft lockup. >> >> log snippetes below, full logs here: >> >> www.jokken.com/downloads/node-68.txt >> >> www.jokken.com/downloads/node-90.txt >> >> >> *node-68* >> >> 2017-10-20T17:48:37.031741+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: 40 messages lost due to rate-limiting >> >> 2017-10-20T17:58:36.281069+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: begin to drop messages due to rate-limiting >> >> 2017-10-20T17:58:37.548500+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: 41 messages lost due to rate-limiting >> >> 2017-10-20T18:08:36.180377+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: begin to drop messages due to rate-limiting >> >> 2017-10-20T18:08:37.058861+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: 40 messages lost due to rate-limiting >> >> 2017-10-20T18:18:36.175797+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: begin to drop messages due to rate-limiting >> >> 2017-10-20T18:18:37.583237+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: 41 messages lost due to rate-limiting >> >> 2017-10-20T18:28:36.172090+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: begin to drop messages due to rate-limiting >> >> 2017-10-20T18:28:37.125346+00:00 node-68 rsyslogd-2177: imuxsock[pid >> 5085]: 40 messages lost due to rate-limiting >> >> >> >> ps -aef | grep 5080 >> >> ceilome+ 5080 3502 0 Oct03 ? 
01:32:57 ceilometer-polling - AgentManager(0) >> >> >> >> 2017-10-20T18:35:10.759230+00:00 node-68 rsyslogd: [origin >> software="rsyslogd" swVersion="8.16.0" x-pid="3431" x-info=" >> http://www.rsyslog.com;] exiting on signal 15. >> >> 2017-10-20T18:35:10.790611+00:00 node-68 rsyslogd: [origin >> software="rsyslogd" swVersion="8.16.0" x-pid="23851" x-info=" >> http://www.rsyslog.com;] start >> >> 2017-10-20T18:35:10.790395+00:00 node-68 rsyslogd: rsyslogd's groupid >> changed to 108 >> >> 2017-10-20T18:35:10.790455+00:00 node-68 rsyslogd: rsyslogd's userid >> changed to 104 >> >> 2017-10-20T18:35:10.790491+00:00 node-68 rsyslogd-2357: queue "action 0 >> queue": high water mark is set quite low at 8000. You should only set it >> below 60% (60) if you have a good reason for this. [v8.16.0 try >> http://www.rsyslog.com/e/2357 ] >> >> >> >> Test starts: Fri Oct 20 18:52:48 2017 >&
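For anyone hitting the same soft lockup, the upgrade itself is just the stock Ubuntu 16.04 kernel packages; the package names below follow the real 16.04 naming scheme, but confirm they exist in your mirror first:

    apt-get update
    apt-get install linux-image-4.4.0-98-generic linux-image-extra-4.4.0-98-generic
    reboot
    uname -r    # should now report 4.4.0-98-generic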
Re: [Openstack] [Fuel] node name issue
update to an old question: I have gotten around this issue. I'm not sure how I got around this issue, but my theory is something I noticed quite by accident. The servers I was having these troubles on use 80GB hard drives, but they also have a flash drive in them, for small OS deployments. I stumbled across a /dev/sda6 partition on these flash drives. On this partition I found 2 files: meta-data and user-data. In those files was the old node name I mentioned in the original post. This partition must have been detected, and these stale files were used by fuel-agent when the newly provisioned node first booted, even though they were booting from the 80GB drive which had its own /dev/sda6 partition. I suspect the stale/incorrect /dev/sda6 is probably is from Fuel 8 when we tried to deploy on some of these flash drives... once I deleted the stale/incorrect /dev/sda6 then theprovisioning and deployment went perfectly thanks -- Jim On Thu, Sep 28, 2017 at 5:02 PM, Jim Okken <j...@jokken.com> wrote: > I ran "fuel2 node update -H blade13 20" just to get out of the node-* > naming convention, as someone suggested > > > > The deploy still names the node node-11 and provisioning fails. > > digging a little more, i see it might have to do with the fuel-agent > cloud-init scripts. > > in the cloud-init.log on the new node I see the node name being set to > node-11! > > > > this isnt node-11. this was node-20, but I renamed it to blade13 with the > command "fuel2 node update -H blade13 20" > > > > i also noted that after the cloud-init scripts ran at the end the first > boot of the new provisioned OS, that in the Fuel GUI, the FQDN field became > node-11.ourdomain.com (before it was bootstrap.ourdomain.com) > > (in the same window Hostname still show as blade13) > > > > But FQDN in the Fuel2 CLI output still shows node-20.ourdomain.com!!! > > > > [fuel2 node show 20 > > | id | 20 > | > > | name| Untitled (68:58) > | > > | status | ready > | > > | os_platform | ubuntu > | > > | roles | [u'compute'] > | > > | kernel_params | None > | > > | pending_roles | [] > | > > | hostname| node-20 > | > > | fqdn| node-20.dialogic.com > | > > | platform_name | ProLiant BL460c Gen9 > | > > > > > > > > > > > > where can i find the cloud init settings which are deploy to new nodes? > > i guess this has something to do with this file: > /usr/share/fuel-agent/cloud-init-templates/cloud_config_ubuntu.jinja2 > > in that file I see > > hostname: {{ common.hostname }} > > fqdn: {{ common.fqdn }} > > > > please help me with an info you might have or let me know that populates > those 2 parts of the template? > > > > Is there a database these values are all stored in on the fuel server? > > > > Thanks > > > > --Jim > > -- Jim > > On Tue, Sep 26, 2017 at 12:00 PM, Jim Okken <j...@jokken.com> wrote: > >> also I should add, I dont have the original hard drives in the system so >> it isn't because it is booting the old OS where these node names were set. >> this is definitely the newly installed OS being given the wroing hostname >> >> >> >> is there a database this is all kept in? maybe I could look around and >> find where these old node names are being saved? >> >> thanks! >> >> -- Jim >> >> On Mon, Sep 25, 2017 at 6:03 PM, Jim Okken <j...@jokken.com> wrote: >> >>> hi all, >>> >>> I am using Fuel 10. >>> >>> i have 2 nodes I am trying to deploy as compute nodes. at one time in >>> the past I was attempting to deploy them too. I assume back then their node >>> names were node-11 and node-20. 
>>> >>> they were never successfully deploy and now I've worked out their >>> hardware issues and are attempting to
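If others find a similar stale cloud-init partition on a leftover flash drive, a sketch of the inspect-then-clear sequence (destructive: wipefs erases the filesystem signature, so verify the device really is the stale one before running it):

    mount /dev/sda6 /mnt && cat /mnt/meta-data /mnt/user-data; umount /mnt
    wipefs -a /dev/sda6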
Re: [Openstack] soft lockup on Newton compute nodes
= UPDATE 11/10 == hi again, based on some advice from a member of this mailing list we've been looking into kernel and driver versions of our compute nodes We also have plain non openstack "KVM on Ubuntu" servers for testing. I looked at driver and kernel differences between these Ubuntu 16 w/ KVM systems and our openstack compute nodes. I found Ubuntu 16 w/ KVM was at kernel version 4.4.0-87 and that the openstack compute nodes were at 4.4.0-93. So I upgraded the Ubuntu 16 w/ KVM to 4.4.0-93 and was able to reproduce this problem (but only on the exact HP hardware that is our openstack compute nodes, and not on other hardware). Next I updated these Ubuntu 16 w/ KVM to 4.4.0-98 and the problem no longer occured! I need to upgrade a few openstack compute nodes to 4.4.0-98 and test. Do anyone think this kernel change could break openstack? In the kernel change log I found a fix for a specific HP server in 4.4.0-98 (not the same as our server but somewhat similar) thanks! -- Jim On Mon, Oct 23, 2017 at 10:25 PM, Jim Okken <j...@jokken.com> wrote: > = UPDATE 10/23 == > > we have been trying different things to get better debug we disabled > rate-limiting in order to get better info in /var/log/message. for some > reason (maybe unrelated) we didn't get the soft lockup during this test But > this time we got openvswitch, br_netfilter, etc in the call trace in > /var/log/messages > > Please advise in any way! thx!! > > basically we are running various types of SIP/RTP test traffic between 2 > instances (on different compute nodes). This time instead of one hypervisor > getting the errors both hypervisors did, but neither got the soft lockup. > > log snippetes below, full logs here: > > www.jokken.com/downloads/node-68.txt > > www.jokken.com/downloads/node-90.txt > > > *node-68* > > 2017-10-20T17:48:37.031741+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: 40 messages lost due to rate-limiting > > 2017-10-20T17:58:36.281069+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: begin to drop messages due to rate-limiting > > 2017-10-20T17:58:37.548500+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: 41 messages lost due to rate-limiting > > 2017-10-20T18:08:36.180377+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: begin to drop messages due to rate-limiting > > 2017-10-20T18:08:37.058861+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: 40 messages lost due to rate-limiting > > 2017-10-20T18:18:36.175797+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: begin to drop messages due to rate-limiting > > 2017-10-20T18:18:37.583237+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: 41 messages lost due to rate-limiting > > 2017-10-20T18:28:36.172090+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: begin to drop messages due to rate-limiting > > 2017-10-20T18:28:37.125346+00:00 node-68 rsyslogd-2177: imuxsock[pid > 5085]: 40 messages lost due to rate-limiting > > > > ps -aef | grep 5080 > > ceilome+ 5080 3502 0 Oct03 ? 01:32:57 ceilometer-polling - AgentManager(0) > > > > 2017-10-20T18:35:10.759230+00:00 node-68 rsyslogd: [origin > software="rsyslogd" swVersion="8.16.0" x-pid="3431" x-info=" > http://www.rsyslog.com;] exiting on signal 15. 
> > 2017-10-20T18:35:10.790611+00:00 node-68 rsyslogd: [origin > software="rsyslogd" swVersion="8.16.0" x-pid="23851" x-info=" > http://www.rsyslog.com;] start > > 2017-10-20T18:35:10.790395+00:00 node-68 rsyslogd: rsyslogd's groupid > changed to 108 > > 2017-10-20T18:35:10.790455+00:00 node-68 rsyslogd: rsyslogd's userid > changed to 104 > > 2017-10-20T18:35:10.790491+00:00 node-68 rsyslogd-2357: queue "action 0 > queue": high water mark is set quite low at 8000. You should only set it > below 60% (60) if you have a good reason for this. [v8.16.0 try > http://www.rsyslog.com/e/2357 ] > > > > Test starts: Fri Oct 20 18:52:48 2017 > > > > 2017-10-20T18:56:20.408532+00:00 node-68 kernel: [1458996.797708] > [ cut here ] > > 2017-10-20T18:56:20.408571+00:00 node-68 kernel: [1458996.797728] > WARNING: CPU: 27 PID: 0 at /build/linux-YyUNAI/linux-4.4.0/net/core/dev.c:2445 > skb_warn_bad_offload+0xd1/0x120() > > 2017-10-20T18:56:20.408574+00:00 node-68 kernel: [1458996.797732] > qvofd385f05-cb: caps=(0x0184075b59e9, 0x) len=2636 > data_len=2594 gso_size=1480 gso_type=6 ip_summed=0 > > 2017-10-20T18:56:20.408576+00:00 node-68 kernel: [1458996.797735] Modules > linked in: bonding binfmt_misc nf_conntrack_netlink vhost_net vhost macvtap > macvlan xt_mac xt_tcpudp xt_physdev br_netfilter xt_set ip_set_hash_net > ip_set nfnetli
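For reference, disabling the imuxsock rate-limiting mentioned above comes down to two legacy rsyslog directives in /etc/rsyslog.conf, followed by a rsyslog restart, so kernel call traces are not dropped mid-capture:

    $SystemLogRateLimitInterval 0
    $SystemLogRateLimitBurst 0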
Re: [Openstack] soft lockup on Newton compute nodes
7-10-20T19:00:19.698100+00:00 node-90 kernel: [97583.653163] [] ? generic_exec_single+0x85/0x120 2017-10-20T19:00:19.698101+00:00 node-90 kernel: [97583.653167] [] ? be_eq_notify+0x60/0x70 [be2net] 2017-10-20T19:00:19.698101+00:00 node-90 kernel: [97583.653168] [] __netif_receive_skb+0x18/0x60 2017-10-20T19:00:19.698102+00:00 node-90 kernel: [97583.653170] [] process_backlog+0xa8/0x150 2017-10-20T19:00:19.698104+00:00 node-90 kernel: [97583.653171] [] net_rx_action+0x21e/0x360 2017-10-20T19:00:19.698105+00:00 node-90 kernel: [97583.653173] [] __do_softirq+0x101/0x290 2017-10-20T19:00:19.698106+00:00 node-90 kernel: [97583.653175] [] do_softirq_own_stack+0x1c/0x30 2017-10-20T19:00:19.698107+00:00 node-90 kernel: [97583.653176] [] do_softirq.part.19+0x38/0x40 2017-10-20T19:00:19.698108+00:00 node-90 kernel: [97583.653179] [] do_softirq+0x1d/0x20 2017-10-20T19:00:19.698110+00:00 node-90 kernel: [97583.653181] [] netif_rx_ni+0x33/0x80 2017-10-20T19:00:19.698111+00:00 node-90 kernel: [97583.653184] [] tun_get_user+0x506/0x880 2017-10-20T19:00:19.698112+00:00 node-90 kernel: [97583.653185] [] tun_sendmsg+0x51/0x70 2017-10-20T19:00:19.698112+00:00 node-90 kernel: [97583.653188] [] handle_tx+0x306/0x4e0 [vhost_net] 2017-10-20T19:00:19.698113+00:00 node-90 kernel: [97583.653190] [] handle_tx_kick+0x15/0x20 [vhost_net] 2017-10-20T19:00:19.698113+00:00 node-90 kernel: [97583.653193] [] vhost_worker+0xf3/0x190 [vhost] 2017-10-20T19:00:19.698115+00:00 node-90 kernel: [97583.653195] [] ? vhost_poll_wakeup+0x30/0x30 [vhost] 2017-10-20T19:00:19.698116+00:00 node-90 kernel: [97583.653198] [] kthread+0xe5/0x100 2017-10-20T19:00:19.698117+00:00 node-90 kernel: [97583.653199] [] ? kthread_create_on_node+0x1e0/0x1e0 2017-10-20T19:00:19.698117+00:00 node-90 kernel: [97583.653203] [] ret_from_fork+0x3f/0x70 2017-10-20T19:00:19.698118+00:00 node-90 kernel: [97583.653204] [] ? kthread_create_on_node+0x1e0/0x1e0 2017-10-20T19:00:19.698123+00:00 node-90 kernel: [97583.653206] ---[ end trace d7e73079b38e57b4 ]--- -- Jim On Wed, Oct 18, 2017 at 11:37 PM, Jim Okken <j...@jokken.com> wrote: > hi all, > > please help us out with an issue we are seeing on multiple compute nodes > running Newton (Ubuntu 16.04.3 Kernel 4.4.0). After about 1 hour of running > our VOIP test application the instances become non-responsive and can't be > pinged as well do the compute nodes. > > messages appear on the compute node console screens. a screen shot of that > is hosted here: > > http://www.jokken.com/downloads/console.png > > i'll try to attach it also. > > The first compute node this was seen on was running 2 instances, the > second was running only 1 instance. They were using on a portion of the > total 40 vCPUs available, and the load was moderate. Cold boot these nodes > and all is well again, until we run our application for about 1 hour. > > please let us know what you think thanks! > > not a lot is shown in DEBUG logging of Nova and Neutron on the compute node > > these logs are here: > > http://www.jokken.com/downloads/logs.zip > > i'll try to attach them too. 
> > https://ask.openstack.org/en/question/110748/soft-lockup- > on-newton-compute-nodes/ > > /var/log/messages on the compute node shows many repeats of these messages: > > 2017-10-18T20:49:26.462309+00:00 node-58 kernel: [1297007.624935] Modules > linked in: binfmt_misc nf_conntrack_netlink vhost_net vhost macvtap macvlan > ip6table_raw xt_mac xt_tcpudp xt_physdev br_netfilter xt_set > ip_set_hash_net ip_set nfnetlink veth ebtable_filter ebtables openvswitch > ocfs2 quota_tree ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager > ocfs2_stackglue configfs ip6table_filter ip6_tables xt_multiport > xt_conntrack iptable_filter xt_comment xt_CT iptable_raw ip_tables x_tables > xfs ipmi_ssif 8021q garp mrp intel_rapl x86_pkg_temp_thermal > intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel > aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd > serio_raw bridge stp llc sb_edac edac_core hpilo ioatdma lpc_ich shpchp dca > ipmi_si 8250_fintek ipmi_msghandler acpi_power_meter mac_hid kvm_intel kvm > irqbypass ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr > iscsi_tcp libiscsi_tcp nf_conntrack_proto_gre nf_conntrack_ipv6 > nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack autofs4 raid10 > raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor > raid6_pq libcrc32c raid1 raid0 multipath linear dm_round_robin ses > enclosure uas usb_storage psmouse ahci lpfc be2iscsi libahci be2net > iscsi_boot_sysfs libiscsi vxlan scsi_transport_fc ip6_udp_tunnel > scsi_transport_iscsi udp_tunnel wmi fjes scsi_dh_emc scsi_dh_rdac > scsi_dh_alua dm_multipath > > 2017-10-18T20:49:26.462311+00:00 node-58 kernel: [1297007.625008]
Re: [Openstack] [Fuel] node name issue
I ran "fuel2 node update -H blade13 20" just to get out of the node-* naming convention, as someone suggested The deploy still names the node node-11 and provisioning fails. digging a little more, i see it might have to do with the fuel-agent cloud-init scripts. in the cloud-init.log on the new node I see the node name being set to node-11! this isnt node-11. this was node-20, but I renamed it to blade13 with the command "fuel2 node update -H blade13 20" i also noted that after the cloud-init scripts ran at the end the first boot of the new provisioned OS, that in the Fuel GUI, the FQDN field became node-11.ourdomain.com (before it was bootstrap.ourdomain.com) (in the same window Hostname still show as blade13) But FQDN in the Fuel2 CLI output still shows node-20.ourdomain.com!!! [fuel2 node show 20 | id | 20 | | name| Untitled (68:58) | | status | ready | | os_platform | ubuntu | | roles | [u'compute'] | | kernel_params | None | | pending_roles | [] | | hostname| node-20 | | fqdn| node-20.dialogic.com | | platform_name | ProLiant BL460c Gen9 | where can i find the cloud init settings which are deploy to new nodes? i guess this has something to do with this file: /usr/share/fuel-agent/cloud-init-templates/cloud_config_ubuntu.jinja2 in that file I see hostname: {{ common.hostname }} fqdn: {{ common.fqdn }} please help me with an info you might have or let me know that populates those 2 parts of the template? Is there a database these values are all stored in on the fuel server? Thanks --Jim -- Jim On Tue, Sep 26, 2017 at 12:00 PM, Jim Okken <j...@jokken.com> wrote: > also I should add, I dont have the original hard drives in the system so > it isn't because it is booting the old OS where these node names were set. > this is definitely the newly installed OS being given the wroing hostname > > > > is there a database this is all kept in? maybe I could look around and > find where these old node names are being saved? > > thanks! > > -- Jim > > On Mon, Sep 25, 2017 at 6:03 PM, Jim Okken <j...@jokken.com> wrote: > >> hi all, >> >> I am using Fuel 10. >> >> i have 2 nodes I am trying to deploy as compute nodes. at one time in the >> past I was attempting to deploy them too. I assume back then their node >> names were node-11 and node-20. >> >> they were never successfully deploy and now I've worked out their >> hardware issues and are attempting to deploy them again. now Fuel has given >> them the names node-80 and node-81. >> (i may be at 80 in my node names but I only have 17 nodes so far) >> >> the deploy of these 2 nodes does not get past installing Ubuntu. The >> nodes reboot after Ubuntu is installed and come up incorrectly as node-11 >> and node-20. After that Fuel sits for a long while and then gives an error >> (pasted at the end of email). I assume the nodes come up with the wrong >> name/ip/ssh-key and Fuel can't contact them. >> >> I'm a novice at using the fuel and fuel2 cli's but I've tried deleting >> these nodes and removing from database. Then re-PXE boot the nodes and >> start a fresh deploy just to have them named node11 and 20 again. Fuel cli >> does show the correct host name for these nodes, but I've tried anyway to >> (re)set the host name for these node with no affect. >> >> If I try to delete node-11 and node-20 I get this error >> 404 Client Error: Not Found for url: http://10.20.243.1:8000/api/v1 >> /nodes/?ids=11 (NodeCollection not found) >> >> what can I do to get past this please? 
>> >> >> >> Errors from the Fuel Astute log: >> 2017-09-25 21:06:28 ERROR [1565] Error running provisioning: >> # , >> trace: ["/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:178:in >> `rescue in initialize_mclient'", "/usr/share/gems/gems/astute-1 >> 0.0.0/lib/astute/mclient.rb:161:in `initialize_mclient'", >> "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:51:in >> `initialize'", "/usr/share/gems/gems/astute-1 >> 0.0.0/lib/astute/nailgun_hooks.rb:421:in `new'", >> "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:421:in >> `run_shell_without_check'", "/usr/share/gems/gems/astute-1 >> 0.0.0/lib/astute/nailgun_hooks.rb:449:in `update_node_status'", >> "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:313:in >> `reboot_hook'", "/usr/share/gems/gems/astute-1 >> 0.0.0/lib/astute/nailg
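On the database question: the Fuel master keeps node state in the nailgun PostgreSQL database, and a read-only peek is safe. The column names here are from memory, so check the schema with \d nodes first:

    sudo -u postgres psql nailgun -c "SELECT id, name, hostname, status FROM nodes;"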
Re: [Openstack] [Fuel] node name issue
also I should add, I dont have the original hard drives in the system so it isn't because it is booting the old OS where these node names were set. this is definitely the newly installed OS being given the wroing hostname is there a database this is all kept in? maybe I could look around and find where these old node names are being saved? thanks! -- Jim On Mon, Sep 25, 2017 at 6:03 PM, Jim Okken <j...@jokken.com> wrote: > hi all, > > I am using Fuel 10. > > i have 2 nodes I am trying to deploy as compute nodes. at one time in the > past I was attempting to deploy them too. I assume back then their node > names were node-11 and node-20. > > they were never successfully deploy and now I've worked out their hardware > issues and are attempting to deploy them again. now Fuel has given them the > names node-80 and node-81. > (i may be at 80 in my node names but I only have 17 nodes so far) > > the deploy of these 2 nodes does not get past installing Ubuntu. The nodes > reboot after Ubuntu is installed and come up incorrectly as node-11 and > node-20. After that Fuel sits for a long while and then gives an error > (pasted at the end of email). I assume the nodes come up with the wrong > name/ip/ssh-key and Fuel can't contact them. > > I'm a novice at using the fuel and fuel2 cli's but I've tried deleting > these nodes and removing from database. Then re-PXE boot the nodes and > start a fresh deploy just to have them named node11 and 20 again. Fuel cli > does show the correct host name for these nodes, but I've tried anyway to > (re)set the host name for these node with no affect. > > If I try to delete node-11 and node-20 I get this error > 404 Client Error: Not Found for url: http://10.20.243.1:8000/api/ > v1/nodes/?ids=11 (NodeCollection not found) > > what can I do to get past this please? 
> > > > Errors from the Fuel Astute log: > 2017-09-25 21:06:28 ERROR [1565] Error running provisioning: > # , > trace: ["/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:178:in > `rescue in initialize_mclient'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/mclient.rb:161:in `initialize_mclient'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:51:in > `initialize'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:421:in > `new'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:421:in > `run_shell_without_check'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/nailgun_hooks.rb:449:in `update_node_status'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:313:in > `reboot_hook'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:38:in > `block in process'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/nailgun_hooks.rb:26:in `each'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:26:in > `process'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/image_provision.rb:117:in > `reboot'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:273:in > `soft_reboot'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:240:in > `provision_piece'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:126:in `block (3 levels) in > provision_and_watch_progress'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:309:in `call'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:309:in > `sleep_not_greater_than'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:120:in `block (2 levels) in > provision_and_watch_progress'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:119:in `loop'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:119:in `block > in provision_and_watch_progress'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:118:in `catch'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:118:in > `provision_and_watch_progress'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/provision.rb:52:in `provision'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/orchestrator.rb:109:in > `provision'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:46:in > `provision'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:37:in > `image_provision'", > "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:172:in > `dispatch_message'", "/usr/share/gems/gems/astute- > 10.0.0/lib/astute/server/server.rb:131:in `block in dispatch'&
[Openstack] [Fuel] node name issue
hi all, I am using Fuel 10. i have 2 nodes I am trying to deploy as compute nodes. at one time in the past I was attempting to deploy them too. I assume back then their node names were node-11 and node-20. they were never successfully deploy and now I've worked out their hardware issues and are attempting to deploy them again. now Fuel has given them the names node-80 and node-81. (i may be at 80 in my node names but I only have 17 nodes so far) the deploy of these 2 nodes does not get past installing Ubuntu. The nodes reboot after Ubuntu is installed and come up incorrectly as node-11 and node-20. After that Fuel sits for a long while and then gives an error (pasted at the end of email). I assume the nodes come up with the wrong name/ip/ssh-key and Fuel can't contact them. I'm a novice at using the fuel and fuel2 cli's but I've tried deleting these nodes and removing from database. Then re-PXE boot the nodes and start a fresh deploy just to have them named node11 and 20 again. Fuel cli does show the correct host name for these nodes, but I've tried anyway to (re)set the host name for these node with no affect. If I try to delete node-11 and node-20 I get this error 404 Client Error: Not Found for url: http://10.20.243.1:8000/api/v1/nodes/?ids=11 (NodeCollection not found) what can I do to get past this please? Errors from the Fuel Astute log: 2017-09-25 21:06:28 ERROR [1565] Error running provisioning: # , trace: ["/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:178:in `rescue in initialize_mclient'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:161:in `initialize_mclient'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:51:in `initialize'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:421:in `new'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:421:in `run_shell_without_check'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:449:in `update_node_status'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:313:in `reboot_hook'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:38:in `block in process'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:26:in `each'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/nailgun_hooks.rb:26:in `process'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/image_provision.rb:117:in `reboot'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:273:in `soft_reboot'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:240:in `provision_piece'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:126:in `block (3 levels) in provision_and_watch_progress'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:309:in `call'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:309:in `sleep_not_greater_than'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:120:in `block (2 levels) in provision_and_watch_progress'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:119:in `loop'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:119:in `block in provision_and_watch_progress'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:118:in `catch'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:118:in `provision_and_watch_progress'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/provision.rb:52:in `provision'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/orchestrator.rb:109:in `provision'", 
"/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:46:in `provision'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/dispatcher.rb:37:in `image_provision'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:172:in `dispatch_message'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:131:in `block in dispatch'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:64:in `call'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:64:in `block in each'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:56:in `each'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/task_queue.rb:56:in `each'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:128:in `each_with_index'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:128:in `dispatch'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/server/server.rb:106:in `block in perform_main_job'"] 2017-09-25 21:06:26 ERROR [1565] Error occured while provisioning: # > 2017-09-25 21:06:26 ERROR [1565] No more retries for MCollective client instantiation after exception: ["/usr/share/gems/gems/mcollective-client-2.8.4/lib/mcollective/rpc/client.rb:507:in `discover'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:167:in `initialize_mclient'", "/usr/share/gems/gems/astute-10.0.0/lib/astute/mclient.rb:51:in `initialize'",
[Openstack] [Fuel] Danube Fuel 10 compute node base system partition size
Hi all,

In Danube disk provisioning for a compute node, the smallest disk/partition size for the base system is 54GB. After I deploy a compute node I see 44GB free of the 54GB, so it seems something smaller than 54GB could be used. Can I somehow change the setting for the smallest disk/partition size to something smaller so I can have Fuel deploy the base OS to a smaller drive? I have 14 HP blades with an internal 32GB disk which I would prefer to use for the base system. See /dev/mapper/os-root:

Filesystem                                       Size  Used Avail Use% Mounted on
udev                                              63G     0   63G   0% /dev
tmpfs                                             13G   49M   13G   1% /run
/dev/mapper/os-root                               50G  3.0G   44G   7% /
tmpfs                                             63G     0   63G   0% /dev/shm
tmpfs                                            5.0M     0  5.0M   0% /run/lock
tmpfs                                             63G     0   63G   0% /sys/fs/cgroup
/dev/sda3                                        196M   58M  129M  32% /boot
/dev/mapper/vm-nova                              318G   33M  318G   1% /var/lib/nova
cgmfs                                            100K     0  100K   0% /run/cgmanager/fs
/dev/mapper/3600c0ff0001ea00fa8a1b6590300-part1  280G  4.5G  275G   2% /mnt/MSA_FC_Vol1
tmpfs                                             13G     0   13G   0% /run/user/0

thanks!

___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
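One knob that does exist without patching Fuel is the per-node disk layout, which can be downloaded as YAML, edited, and uploaded before provisioning; this is the standard classic-CLI workflow, though the 54GB minimum is enforced server-side, so going below it may still be rejected. The node id here is illustrative:

    fuel node --node-id 42 --disk --download
    # edit node_42/disks.yaml, adjusting the size fields, then:
    fuel node --node-id 42 --disk --upload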
Re: [Openstack] [Fuel] storage question. (Fuel 10 Newton deploy with storage nodes)
thanks for the help once again Eddie! im sure you remember i have that fiber channel SAN configuration This system has a 460GB disk mapped to it from the fiber channel SAN. As far as I can tell this disk isn't much different to the OS than a local SATA drive. There is also a internal 32GB USB/Flash drive in this system which isn't even shown in the Fuel 10 GUI In the bootstrap OS I see: ls /dev/disk/by-path: pci-:00:14.0-usb-0:3.1:1.0-scsi-0:0:0:0 pci-:09:00.0-fc-0x247000c0ff25ce6d-lun-12 pci-:09:00.0-fc-0x207000c0ff25ce6d-lun-12 both those xxx-lun-12 devices are the same drive. I also see one /dev/dm-X device lsblk /dev/dm-0 NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT 3600c0ff0001ea00f5d1fa4590100 252:00 429.3G 0 mpath there are 3 /dev/sdX devices 1. (parted) select /dev/sda Using /dev/sda (parted) print Model: HP iLO Internal SD-CARD (scsi) Disk /dev/sdd: 32.1GB Sector size (logical/physical): 512B/512B Partition Table: msdos Disk Flags: Number Start End Size Type File system Flags 2. (parted) select /dev/sdb Using /dev/sdb (parted) print Error: /dev/sdb: unrecognised disk label Model: HP MSA 2040 SAN (scsi) Disk /dev/sdb: 461GB Sector size (logical/physical): 512B/512B Partition Table: unknown Disk Flags: 3. (parted) select /dev/sdc Using /dev/sdc (parted) print Error: /dev/sdc: unrecognised disk label Model: HP MSA 2040 SAN (scsi) Disk /dev/sdc: 461GB Sector size (logical/physical): 512B/512B Partition Table: unknown Disk Flags: dev sdb and sdc are the same disk. I see this bug, but wouldn't know how to even start applying a patch if it apply to my situation. https://bugs.launchpad.net/fuel/+bug/1652788 thanks! -- Jim On Mon, Sep 4, 2017 at 2:34 AM, Eddie Yen <missile0...@gmail.com> wrote: > Hi > > Can you describe your disk configuration and partitioning? > > 2017-09-02 4:57 GMT+08:00 Jim Okken <j...@jokken.com>: > >> Hi all, >> >> >> >> Can you offer and insight in this failure I get when deploying 2 compute >> nodes using Fuel 10, please? (contoller etc nodes are all deployed/working) >> >> >> >> fuel_agent.cmd.agent PartitionNotFoundError: Partition >> /dev/mapper/3600c0ff0001ea00f521fa4590100-part2 not found after >> creation fuel_agent.cmd.agent [-] Partition >> /dev/mapper/3600c0ff0001ea00f521fa4590100-part2 >> not found after creation >> >> >> >> >> >> ls -al /dev/mapper >> >> 600c0ff0001ea00f521fa4590100 -> ../dm-0 >> >> 600c0ff0001ea00f521fa4590100-part1 -> ../dm-1 >> >> 600c0ff0001ea00f521fa4590100p2 -> ../dm-2 >> >> >> >> Why the 2nd partition was created and actually named "...000p2" rather >> than "...000-part2" is beyond me. >> >> >> >> More logging if it helps, lots of failures: >> >> >> >> 2017-09-01 18:42:32ERRpuppet-user[3642]: /bin/bash >> "/etc/puppet/shell_manifests/provision_56_command.sh" returned 255 >> instead of one of [0] >> >> 2017-09-01 18:42:32NOTICE puppet-user[3642]: >> (/Stage[main]/Main/Exec[provision_56_shell]/returns) Partition >> /dev/mapper/3600c0ff0001ea00f5d1fa4590100-part2 not found after >> creation >> >> 2017-09-01 18:42:32NOTICE puppet-user[3642]: >> (/Stage[main]/Main/Exec[provision_56_shell]/returns) Unexpected error >> >> 2017-09-01 18:42:32NOTICE puppet-user[3642]: >> (/Stage[main]/Main/Exec[provision_56_shell]/returns) /bin/bash: warning: >> setlocale: LC_ALL: cannot change locale (en_US.UTF-8) >> >> 2017-09-01 18:42:31WARNING systemd-udevd[4982]: >> Process '/sbin/kpartx -u -p -part /dev/dm-0' failed with exit code 1. 
>> >> 2017-09-01 18:42:31INFO multipathd[1012]: dm-3: remove map >> (uevent) >> >> 2017-09-01 18:42:31WARNING systemd-udevd[4964]: >> Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. >> >> 2017-09-01 18:42:31WARNING systemd-udevd[4963]: >> Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. >> >> 2017-09-01 18:42:31ERRmultipath: /dev/sda: can't store >> path info >> >> 2017-09-01 18:42:30WARNING systemd-udevd[4889]: >> Process '/sbin/kpartx -u -p -part /dev/dm-0' failed with exit code 1. >> >> 2017-09-01 18:42:29INFO multipathd[1012]: dm-3: remove map >> (uevent) >> >> 2017-09-01 18:42:29WARNING systemd-udevd[4866]: >> Process '/usr/bin/partx -d --nr 1-1024 /dev/
Re: [Openstack] [Fuel] storage question. (Fuel 10 Newton deploy with storage nodes)
Hi all, Can you offer and insight in this failure I get when deploying 2 compute nodes using Fuel 10, please? (contoller etc nodes are all deployed/working) fuel_agent.cmd.agent PartitionNotFoundError: Partition /dev/mapper/3600c0ff0001ea00f521fa4590100-part2 not found after creation fuel_agent.cmd.agent [-] Partition /dev/mapper/3600c0ff0001ea00f521fa4590100-part2 not found after creation ls -al /dev/mapper 600c0ff0001ea00f521fa4590100 -> ../dm-0 600c0ff0001ea00f521fa4590100-part1 -> ../dm-1 600c0ff0001ea00f521fa4590100p2 -> ../dm-2 Why the 2nd partition was created and actually named "...000p2" rather than "...000-part2" is beyond me. More logging if it helps, lots of failures: 2017-09-01 18:42:32ERRpuppet-user[3642]: /bin/bash "/etc/puppet/shell_manifests/provision_56_command.sh" returned 255 instead of one of [0] 2017-09-01 18:42:32NOTICE puppet-user[3642]: (/Stage[main]/Main/Exec[provision_56_shell]/returns) Partition /dev/mapper/3600c0ff0001ea00f5d1fa4590100-part2 not found after creation 2017-09-01 18:42:32NOTICE puppet-user[3642]: (/Stage[main]/Main/Exec[provision_56_shell]/returns) Unexpected error 2017-09-01 18:42:32NOTICE puppet-user[3642]: (/Stage[main]/Main/Exec[provision_56_shell]/returns) /bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8) 2017-09-01 18:42:31WARNING systemd-udevd[4982]: Process '/sbin/kpartx -u -p -part /dev/dm-0' failed with exit code 1. 2017-09-01 18:42:31INFO multipathd[1012]: dm-3: remove map (uevent) 2017-09-01 18:42:31WARNING systemd-udevd[4964]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. 2017-09-01 18:42:31WARNING systemd-udevd[4963]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. 2017-09-01 18:42:31ERRmultipath: /dev/sda: can't store path info 2017-09-01 18:42:30WARNING systemd-udevd[4889]: Process '/sbin/kpartx -u -p -part /dev/dm-0' failed with exit code 1. 2017-09-01 18:42:29INFO multipathd[1012]: dm-3: remove map (uevent) 2017-09-01 18:42:29WARNING systemd-udevd[4866]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. 2017-09-01 18:42:29WARNING systemd-udevd[4867]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. 2017-09-01 18:42:29ERRmultipath: /dev/sda: can't store path info 2017-09-01 18:42:28WARNING systemd-udevd[4791]: Process '/sbin/kpartx -u -p -part /dev/dm-0' failed with exit code 1. 2017-09-01 18:42:28INFO multipathd[1012]: dm-3: remove map (uevent) 2017-09-01 18:42:28WARNING systemd-udevd[4773]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. 2017-09-01 18:42:28WARNING systemd-udevd[4774]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. 2017-09-01 18:42:28ERRmultipath: /dev/sda: can't store path info 2017-09-01 18:42:28INFO multipathd[1012]: dm-2: remove map (uevent) 2017-09-01 18:42:27WARNING systemd-udevd[4655]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. 2017-09-01 18:42:27WARNING systemd-udevd[4654]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. 2017-09-01 18:42:27ERRmultipath: /dev/sda: can't store path info 2017-09-01 18:42:26WARNING systemd-udevd[4576]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1. 2017-09-01 18:42:26WARNING systemd-udevd[4577]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1. 
2017-09-01 18:42:26 ERR     multipath: /dev/sda: can't store path info
2017-09-01 18:42:26 INFO    multipathd[1012]: dm-2: remove map (uevent)
2017-09-01 18:42:25 NOTICE  nailgun-agent: I, [2017-09-01T18:42:21.541001 #3601] INFO -- : Wrote data to file '/etc/nailgun_uid'. Data: 56
2017-09-01 18:42:24 WARNING systemd-udevd[4114]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdb' failed with exit code 1.
2017-09-01 18:42:24 WARNING systemd-udevd[4115]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed with exit code 1.
2017-09-01 18:42:24 ERR     multipath: /dev/sda: can't store path info
2017-09-01 18:42:24 INFO    multipathd[1012]: dm-2: remove map (uevent)
2017-09-01 18:42:24 NOTICE  nailgun-agent: I, [2017-09-01T18:42:20.153616 #3601] INFO -- : API URL is https://10.20.243.1:8443/api
2017-09-01 18:42:24 ERR     multipath: /dev/sda: can't store path info
2017-09-01 18:42:24 WARNING systemd-udevd[3965]: Process '/usr/bin/partx -d --nr 1-1024 /dev/sdc' failed
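One detail that stands out in the log above: fuel_agent waits for the kpartx-style name (the WWID plus "-part2", matching the '/sbin/kpartx -u -p -part /dev/dm-0' commands udev runs), but those kpartx runs keep failing with exit code 1, which may be why the map ends up with the default "p2" suffix instead. I have been poking at it on the failing node with something like this (just a diagnostic sketch; the WWID and partition number are the ones from my error above):

    # Multipath map that fuel_agent is partitioning (from the error above)
    DM=/dev/mapper/3600c0ff0001ea00f521fa4590100
    # Re-run what udev runs, verbosely, to see why the '-part' delimiter is lost
    /sbin/kpartx -u -v -p -part "$DM"
    udevadm settle
    # Check which naming style actually showed up
    for suffix in -part2 p2; do
        [ -e "${DM}${suffix}" ] && echo "found ${DM}${suffix}"
    done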
Re: [Openstack] [Fuel] storage question. (Fuel 10 Newton deploy with storage nodes)
Thanks Mike for the info.

Yes, I do want very fast VM provisioning and all the useful features that come with having all three (Glance/Cinder/ephemeral) in CEPH on the storage nodes. But I can't afford to have my vHD (either as a Cinder volume or as an ephemeral volume) sitting over the network on the storage node.

Do any Fuel experts know exactly what the "Ceph RBD for ephemeral volumes (Nova)" option in Fuel 10 does? Does it move the running instances' vHDs off the hypervisors and onto the storage node? (i.e. move ephemeral from local IO to network IO?) My guess at what that option amounts to under the hood is pasted at the bottom of this message.

thanks!

-- Jim

On Fri, Aug 25, 2017 at 12:08 AM, Mike Smith <mism...@overstock.com> wrote:

> Ceph is basically a ‘swiss army knife of storage’. It can play multiple
> roles in an Openstack deployment, which is one reason why it is so popular
> among this crowd. It can be used as storage for:
>
> - nova ephemeral disks (Ceph RBD)
> - replacement for swift (Ceph Object)
> - cinder volume backend (Ceph RBD)
> - glance image backend (Ceph RBD)
> - gnocchi metrics storage (Ceph Object)
> - generic filesystem (CephFS)
>
> …and probably a few more that I’m missing.
>
> The combination of Ceph as backend for glance and nova ephemeral and/or
> cinder volumes is gorgeous because it’s an ‘instant clone’ of the glance
> image into the disk/volume, which means very fast VM provisioning. Some
> people boot instances off of nova ephemeral storage, some prefer to boot
> off of cinder volumes. It depends on whether you want features like QoS
> (I/O limiting), snapshots, backup, and whether you want the data to be
> able to ‘persist’ as a volume after the VM that uses it is removed, or to
> disappear when the VM is deleted (i.e. ‘ephemeral’).
>
> I’m not a ‘fuel/mirantis guy’ so I can’t tell you specifically what those
> options in their installer do, but generally Ceph storage is often housed
> on separate servers dedicated to Ceph regardless of how you want to use
> it. Some people colocate Ceph onto their compute nodes and have them
> perform double duty (i.e. ‘hyperconverged’).
>
> Hopefully this gives you a little bit of information regarding how Ceph is
> used.
>
> Mike Smith
> Lead Cloud System Architect
> Overstock.com
>
> On Aug 24, 2017, at 9:22 PM, Jim Okken <j...@jokken.com> wrote:
> [snip]
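PS: for the Fuel experts -- my guess is that the "Ceph RBD for ephemeral volumes (Nova)" option just flips Nova's libvirt image backend to RBD, i.e. something like the following in nova.conf on each compute (a sketch only; the pool and user names here are my assumption, not necessarily what Fuel writes):

    [libvirt]
    # Ephemeral instance disks go into a Ceph pool instead of
    # local /var/lib/nova/instances on the hypervisor
    images_type = rbd
    images_rbd_pool = compute
    images_rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = compute
    rbd_secret_uuid = <libvirt secret uuid>

If that is all it does, then the instance vHDs would indeed become network IO against the Ceph OSD nodes rather than local disk IO on the hypervisor -- which is exactly what I am trying to confirm.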
[Openstack] [Fuel] storage question. (Fuel 10 Newton deploy with storage nodes)
I've been learning a bit more about storage. Let me share what I think I know and ask a more specific question. Please correct me if I am off on what I think I know.

Glance images and Cinder volumes are traditionally stored on the storage node. Ephemeral volumes (Nova-managed, traditionally on the compute node) are the copy of the Glance image that has been copied to the compute node and booted as an instance's vHD. Cinder volumes can (among other things) be added to an instance as additional storage besides this Glance image.

In Fuel I set the "Ceph RBD for volumes (Cinder)" and "Ceph RBD for images (Glance)" settings, which will set up Glance and Cinder on the CEPH OSD storage nodes. My rough understanding of what those two settings boil down to is sketched at the end of this message.

But I am not sure about what the setting "Ceph RBD for ephemeral volumes (Nova)" will do.

Would selecting it move the running instances' vHDs off the hypervisors and onto the storage node? (i.e. move ephemeral from local to over the network?)

Thanks

--jim

On Thu, Aug 24, 2017 at 12:14 PM, Jim Okken <j...@jokken.com> wrote:
> [snip]
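PS: here is my rough understanding of what the two settings I did enable amount to -- pointing Cinder and Glance at Ceph pools, roughly like this (a sketch only; the backend, pool, and user names are my assumption):

    # cinder.conf, backend section
    [RBD-backend]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = RBD-backend
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = volumes

    # glance-api.conf
    [glance_store]
    stores = rbd
    default_store = rbd
    rbd_store_pool = images
    rbd_store_ceph_conf = /etc/ceph/ceph.conf
    rbd_store_user = images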
[Openstack] storage questions (Fuel 10 Newton deploy)
Hi all,

We have a pretty complicated storage setup and I am not sure how to configure Fuel for deployment of the storage nodes. I'm using Fuel 10/Newton. Plus, I'm a bit confused on some of the storage aspects (image/Glance, volume/Cinder, ephemeral/?).

We have 3 nodes dedicated to be storage nodes, for HA. We’re using fiber channel extents and need to use the CEPH filesystem.

I’ll try to simplify the storage situation at first, to ask my initial question without too many details.

We have a fast and a slow storage location. Management tells me they want the slow location for the Glance images and the fast location for the place where the instances actually run. (Assume compute nodes with slow hard drives but access to a fast fiber channel volume.)

Where is “the place where the instances actually run”? It isn’t Glance or Cinder, is it?

When I configure the storage for a CEPH OSD node I see volume settings for Base System, CEPH and CEPH journal. (I see my slow storage and my fast storage disks.)

When I configure the storage for a Compute node I see volume settings for Base System and Virtual Storage. Is this ephemeral storage? How does a Virtual Storage volume here compare to the CEPH volume on the CEPH OSD?

I have seen an openstack instance whose .xml file on the compute node shows the vHD as a CEPH path (i.e. rbd:compute/f63e4d30-7706-40be-8eda-b74e91b9dac1_disk). Is this CEPH local to the compute node, or CEPH on the storage node? (Is this ephemeral storage?)

Thanks for any help you might have, I’m a bit confused.

thanks

-- Jim
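PS: in case it helps anyone answer, this is how I have been trying to check where an instance's disk actually lives (a sketch; the libvirt domain name is made up, while the pool and UUID are from my example above):

    # On the compute node: does the disk show as a local file path
    # or as an rbd:pool/uuid_disk path?
    virsh domblklist instance-0000004a
    # On a node with Ceph client access: is the disk an object in the 'compute' pool?
    rbd -p compute ls | grep f63e4d30-7706-40be-8eda-b74e91b9dac1

If the disk shows up in the 'compute' pool, I assume it is RBD-backed and lives on the CEPH OSD nodes, not on the compute node's local disk.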