[Yahoo-eng-team] [Bug 1880828] Re: New instance is always in "spawning" status

2022-05-13 Thread Billy Olsen
Marking the charm tasks as Invalid on this bug, since the problem is not
related to the charms and was traced to other components.

** Changed in: charm-nova-compute
   Status: New => Invalid

** Changed in: openstack-bundles
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1880828

Title:
  New instance is always in "spawning" status

Status in OpenStack Nova Compute Charm:
  Invalid
Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Bundles:
  Invalid

Bug description:
  bundle: openstack-base-bionic-train 
https://github.com/openstack-charmers/openstack-bundles/blob/master/development/openstack-base-bionic-train/bundle.yaml
  hardware: 2 d05 and 2 d06 machines (the compute-node log below is from one of 
the d06 machines; note they are arm64.)

  When creating new instances on the deployed OpenStack, the instance
  stays in the "spawning" status indefinitely.

  [Steps to Reproduce]
  1. Deploy the above bundle on the hardware by following the instructions at 
https://jaas.ai/openstack-base/bundle/67
  2. Wait about 1.5 hours until the deployment is ready; here "ready" means every 
unit shows its status message as "ready", e.g. https://paste.ubuntu.com/p/k48YVnPyVZ/
  3. Follow the instructions at https://jaas.ai/openstack-base/bundle/67 up to 
the "openstack server create" step to create a new instance. This step is also 
summarized in detail in this gist: 
https://gist.github.com/tai271828/b0c00a611e703046dd52da12a66226b0#file-02-basic-test-just-deployed-sh

  [Expected Behavior]
  An instance is created a few seconds later

  [Actual Behavior]
  The instance stays in the "spawning" status indefinitely (> 20 minutes)

  [Additional Information]

  1. [workaround] Use `ps aux | grep qemu-img` to check whether a qemu-img
  image-conversion process exists. The process should complete within
  ~20 seconds. If it has been running for more than 1 minute, terminate it
  with `pkill -f qemu-img` and then create the instance again.

  The image-conversion process looks like this:

  ```
  qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.part /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.converted
  ```
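
  The workaround kills the conversion with SIGTERM, which is also why the
  nova log below reports `Exit code: -15` after the process is killed: a
  negative exit code in Python's subprocess handling means the child was
  terminated by that signal number. A minimal sketch (using a harmless
  `sleep` as a stand-in for a hung `qemu-img`):

  ```python
  import signal
  import subprocess
  import time

  # Stand-in for a hung `qemu-img convert` child process.
  proc = subprocess.Popen(["sleep", "60"])
  time.sleep(0.2)  # pretend it has run longer than the ~20 s a convert should take

  proc.terminate()  # sends SIGTERM, just like `pkill -f qemu-img`
  rc = proc.wait()
  print(rc)  # -15: killed by SIGTERM (negative return codes encode the signal)
  ```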

  2. Investigating further shows this is actually two coupled issues: 1)
  nova should time out the image-conversion process (comment #21), and 2)
  qemu does not complete the image-conversion process successfully
  (comment #20)
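
  The first point above is about bounding how long the conversion may run.
  A minimal sketch (not nova's actual implementation) of wrapping a child
  process in a timeout, so a stuck `qemu-img` would be reaped automatically
  instead of requiring a manual `pkill`:

  ```python
  import subprocess

  def run_with_timeout(cmd, timeout):
      """Run cmd; return its exit code, or None if it had to be killed.

      subprocess.run() kills the child itself before raising TimeoutExpired.
      """
      try:
          return subprocess.run(cmd, timeout=timeout).returncode
      except subprocess.TimeoutExpired:
          return None

  # A well-behaved command finishes normally ...
  ok = run_with_timeout(["true"], timeout=5)
  # ... while a hung one (stand-in for a stuck qemu-img) is killed.
  hung = run_with_timeout(["sleep", "60"], timeout=0.5)
  ```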

To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-nova-compute/+bug/1880828/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1880828] Re: New instance is always in "spawning" status

2020-05-27 Thread Taihsiang Ho
** Also affects: charm-nova-compute
   Importance: Undecided
   Status: New

** Also affects: nova
   Importance: Undecided
   Status: New


Title:
  New instance is always in "spawning" status

Status in OpenStack nova-compute charm:
  New
Status in OpenStack Compute (nova):
  New
Status in OpenStack Bundles:
  New

Bug description:
  bundle: openstack-base-bionic-train 
https://github.com/openstack-charmers/openstack-bundles/blob/master/development/openstack-base-bionic-train/bundle.yaml
  hardware: 2 d05 and 2 d06 machines (the compute-node log below is from one of 
the d06 machines)

  When creating new instances on the deployed OpenStack, the instance
  stays in the "spawning" status indefinitely.

  [Steps to Reproduce]
  1. Deploy the above bundle on the hardware by following the instructions at 
https://jaas.ai/openstack-base/bundle/67
  2. Wait about 1.5 hours until the deployment is ready; here "ready" means every 
unit shows its status message as "ready", e.g. https://paste.ubuntu.com/p/k48YVnPyVZ/
  3. Follow the instructions at https://jaas.ai/openstack-base/bundle/67 up to 
the "openstack server create" step to create a new instance. This step is also 
summarized in detail in this gist: 
https://gist.github.com/tai271828/b0c00a611e703046dd52da12a66226b0#file-02-basic-test-just-deployed-sh

  [Expected Behavior]
  An instance is created a few seconds later

  [Actual Behavior]
  The instance stays in the "spawning" status indefinitely (> 20 minutes)

  [Additional Information]

  1. Waiting 0.5–1 hour and then restarting nova-compute with "sudo
  service nova-compute restart" may bring instance creation back to
  normal. This workaround is shown in this snippet:
  https://gist.github.com/tai271828/b0c00a611e703046dd52da12a66226b0#file-03-nova-compute-debugging-sh

  2. After applying the above workaround, the log on the corresponding
  nova-compute node contains lines like:

  ```
  2020-05-26 02:33:17.845 417168 INFO nova.compute.claims [req-9b570186-d136-4df5-8b90-6dd3e8b67f3e 91f35a24c0b1465fbd147d70e439646e e16f574f6448430aac827c968e5145be - e61235e9fe7f4d0284f035c7588dd2b9 e61235e9fe7f4d0284f035c7588dd2b9] [instance: 54e49495-c1c2-4567-8493-2c412d676e5c] Claim successful on node kreiken.maas
  2020-05-26 02:33:18.657 417168 INFO nova.virt.libvirt.driver [req-9b570186-d136-4df5-8b90-6dd3e8b67f3e 91f35a24c0b1465fbd147d70e439646e e16f574f6448430aac827c968e5145be - e61235e9fe7f4d0284f035c7588dd2b9 e61235e9fe7f4d0284f035c7588dd2b9] [instance: 54e49495-c1c2-4567-8493-2c412d676e5c] Creating image
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [req-318355e9-213c-466b-bb9a-f8170ee7ff95 91f35a24c0b1465fbd147d70e439646e e16f574f6448430aac827c968e5145be - e61235e9fe7f4d0284f035c7588dd2b9 e61235e9fe7f4d0284f035c7588dd2b9] [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf] Instance failed to spawn: nova.exception.ImageUnacceptable: Image e9149aae-246a-4eb9-963e-d7f25286ba7a is unacceptable: Unable to convert image to raw: Image /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.part is unacceptable: Unable to convert image to raw: Unexpected error while running command.
  Command: qemu-img convert -t none -O raw -f qcow2 /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.part /var/lib/nova/instances/_base/9b8156fbecaa194804a637226c8ffded93a57489.converted
  Exit code: -15
  Stdout: ''
  Stderr: ''
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf] Traceback (most recent call last):
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]   File "/usr/lib/python3/dist-packages/nova/virt/images.py", line 128, in _convert_image
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]     compress)
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]   File "/usr/lib/python3/dist-packages/nova/privsep/qemu.py", line 73, in unprivileged_convert_image
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]     processutils.execute(*cmd)
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]   File "/usr/lib/python3/dist-packages/oslo_concurrency/processutils.py", line 424, in execute
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance: 1f9d4cb9-f540-4e33-b36e-23e71268abcf]     cmd=sanitized_cmd)
  2020-05-26 02:33:30.871 417168 ERROR nova.compute.manager [instance:
  ```