[Yahoo-eng-team] [Bug 1936972] Re: MAAS deploys fail if host has NIC w/ random MAC

2021-10-20 Thread Björn Tillenius
I don't think this is a feature request. Ignoring the NIC in MAAS might
be reasonable, although it's odd that the NIC doesn't have a MAC of its
own. Is that a hardware feature, or is it the driver that doesn't
surface the physical MAC?

Also, could you please provide the current output from the
machine-resources binary for that machine?

** Changed in: maas
   Status: Invalid => Incomplete

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1936972

Title:
  MAAS deploys fail if host has NIC w/ random MAC

Status in cloud-init:
  Incomplete
Status in curtin:
  New
Status in MAAS:
  Incomplete

Bug description:
  The Nvidia DGX A100 server includes a USB Redfish Host Interface NIC.
  This NIC apparently provides no MAC address of its own, so the driver
  generates a random MAC for it:

  ./drivers/net/usb/cdc_ether.c:

  static int usbnet_cdc_zte_bind(struct usbnet *dev, struct usb_interface *intf)
  {
      int status = usbnet_cdc_bind(dev, intf);

      if (!status && (dev->net->dev_addr[0] & 0x02))
          eth_hw_addr_random(dev->net);

      return status;
  }

  This causes a problem with MAAS because, during deployment, MAAS sees
  this as a normal NIC and records the MAC. The post-install reboot then
  fails:

  [   43.652573] cloud-init[3761]: init.apply_network_config(bring_up=not args.local)
  [   43.700516] cloud-init[3761]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 735, in apply_network_config
  [   43.724496] cloud-init[3761]: self.distro.networking.wait_for_physdevs(netcfg)
  [   43.740509] cloud-init[3761]:   File "/usr/lib/python3/dist-packages/cloudinit/distros/networking.py", line 177, in wait_for_physdevs
  [   43.764523] cloud-init[3761]: raise RuntimeError(msg)
  [   43.780511] cloud-init[3761]: RuntimeError: Not all expected physical devices present: {'fe:b8:63:69:9f:71'}
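
  For readers unfamiliar with the failing check, the traceback boils down
  to a comparison like the one below. This is only a minimal sketch, not
  cloud-init's actual code: check_physdevs_present and its arguments are
  hypothetical names, and the second MAC is a made-up value. Only the
  error text and the first MAC come from the log.

```python
def check_physdevs_present(expected_macs, present_macs):
    """Raise RuntimeError if any expected MAC is missing (hypothetical helper)."""
    missing = {m.lower() for m in expected_macs} - {m.lower() for m in present_macs}
    if missing:
        raise RuntimeError(
            "Not all expected physical devices present: %s" % missing
        )


# MAC recorded by MAAS during deployment (taken from the log).
recorded = {"fe:b8:63:69:9f:71"}
# MAC the driver generated on the next boot (made-up value for illustration).
present = {"3a:1c:5e:00:11:22"}

try:
    check_physdevs_present(recorded, present)
    result = "ok"
except RuntimeError as exc:
    result = str(exc)
print(result)
```

  Because the driver picks a new address on every boot, the recorded MAC
  can never match, so the check fails regardless of how long it waits.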

  I'm not sure what the best answer for MAAS is here, but here are some
  thoughts:

  1) Ignore all Redfish system interfaces. These are a connection between
  the host and the BMC, so they don't really have a use case in the MAAS
  model AFAICT. These devices can be identified using the SMBIOS as
  described in the Redfish Host Interface Specification, section 8:
  https://www.dmtf.org/sites/default/files/standards/documents/DSP0270_1.3.0.pdf
  The SMBIOS can be read from within Linux using dmidecode.

  2) Ignore (or specially handle) all NICs with randomly generated MAC
  addresses. While this is the only time I've seen a random MAC on
  production server hardware, it is something I've seen on e.g. ARM
  development boards. The problem is, I don't know how to detect a
  generated MAC. I'd hoped the permanent MAC (ethtool -P) would be NULL,
  but it seems to also be set to the generated MAC :(
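
  On the detection question in 2): the kernel exposes how an address was
  assigned in /sys/class/net/<iface>/addr_assign_type (0 = permanent,
  1 = randomly generated, 2 = stolen from another device, 3 = set by
  userspace), and eth_hw_addr_random() marks the device as type 1. Below
  is a minimal sketch of reading it, assuming sysfs is mounted at /sys.
  The function names are hypothetical and this has not been tried on the
  DGX A100.

```python
from pathlib import Path

NET_ADDR_RANDOM = 1  # kernel value for a randomly generated address


def is_locally_administered(mac):
    """True if the locally-administered bit (0x02 of the first octet) is set."""
    return bool(int(mac.split(":")[0], 16) & 0x02)


def has_random_mac(iface, sysfs_root="/sys"):
    """True if the kernel reports the interface's MAC as randomly generated."""
    path = Path(sysfs_root) / "class" / "net" / iface / "addr_assign_type"
    try:
        return int(path.read_text().strip()) == NET_ADDR_RANDOM
    except (OSError, ValueError):
        return False  # attribute missing or unreadable: treat as not random


if __name__ == "__main__":
    net_dir = Path("/sys/class/net")
    if net_dir.is_dir():
        for entry in sorted(net_dir.iterdir()):
            print(entry.name, has_random_mac(entry.name))
```

  The locally-administered bit alone is only a weak hint, since virtual
  and manually configured interfaces set it too. addr_assign_type is the
  more specific signal.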

  FYI, two workarounds for this that seem to work:
   1) Delete the NIC from the MAAS model in the MAAS UI after every
   commissioning.
   2) Use a tag's kernel_opts field to modprobe.blacklist the driver used
   for the Redfish NIC.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1936972/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1839491] Re: Manually performed partitioning changes get reverted on reboot

2019-08-14 Thread Björn Tillenius
Sounds like this is indeed an issue in MAAS then. MAAS should turn off
growpart, since we know how big the disks are already and can set up the
right partition size during installation.
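
For reference, cloud-init's growpart module can be switched off through
cloud-config. Below is a minimal sketch of generating such user data. How
MAAS would actually inject it is not shown, and build_cloud_config is a
hypothetical helper. JSON is emitted because it is also valid YAML.

```python
import json


def build_cloud_config():
    """Return cloud-config text that disables automatic partition growth."""
    config = {
        "growpart": {"mode": "off"},  # do not grow the root partition on boot
    }
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


print(build_cloud_config())
```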

** Changed in: maas
   Status: Invalid => Triaged

** Changed in: maas
   Importance: Undecided => High

** Changed in: maas
   Milestone: None => 2.7.0alpha1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1839491

Title:
  Manually performed partitioning changes get reverted on reboot

Status in cloud-init:
  Incomplete
Status in MAAS:
  Triaged

Bug description:
  Hello,

  I am facing an issue where I need to make changes to the initially
  deployed partition layout, but upon making those changes and
  rebooting, the partition layout gets reverted.

  My env:
  MAAS version: 2.6.0 (7802-g59416a869-0ubuntu1~18.04.1)
  System vendor: HP
  System product: ProLiant DL360 Gen9 (780021-S01)
  System version: Unknown
  Mainboard product: ProLiant DL360 Gen9
  Mainboard firmware version: P89
  Mainboard firmware date: 12/27/2015
  CPU model: Intel(R) Xeon(R) CPU E5-2690 v3
  Deployed (16.04 LTS "Xenial Xerus")
  Kernel: xenial (ga-16.04)
  Power type: ipmi
  Power driver: LAN_2_0 [IPMI 2.0]
  Power boot type: EFI boot
  Architecture amd64/generic
  Minimum Kernel: no minimum kernel
  Interfaces: eno1, eno2, eno3, eno4, eno49, eno50. Only eno49 is used.
  Storage: sda Physical 1TB, sdb Physical 1TB.

  
  Steps to reproduce:

  1. Deploy a node from MAAS with the following partition configuration:
  sda-part1 536.9 MB Partition fat32 formatted filesystem mounted at /boot/efi
  sda-part2 100.0 GB Partition ext4 formatted filesystem mounted at /

  2. Check the partitions on the node:

  $ lsblk

  NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
  sda      8:0    0 931.5G  0 disk
  |-sda1   8:1    0   512M  0 part /boot/efi
  `-sda2   8:2    0   931G  0 part /
  sdb      8:16   0 931.5G  0 disk

  
  Here we notice that the initial partitioning scheme is not respected:
  sda2 is 931G rather than the configured 100 GB. This could be related to
  the main issue of partitioning changes being reverted, but could also be
  a separate issue.

  3. Boot an Ubuntu ISO and go into rescue mode. I used
  ubuntu-16.04.6-server-amd64.iso.

  4. Choose "Do not use a root filesystem" and "Execute a shell in the
  installer environment".

  5. Run the following commands:

  $ e2fsck -f /dev/sda2

  $ resize2fs /dev/sda2 150G

  $ e2fsck -f /dev/sda2

  $ sudo parted /dev/sda

  (parted) unit GiB print

  (parted) resizepart

  Partition number? 2

  End? 200GiB

  (parted) print

  You should see partition 2 resized.

  (parted) quit

  $ e2fsck -f /dev/sda2

  6. Confirm

  $ fdisk -l

  7. Sync writes

  $ sync

  8. Reboot the node. Remove ISO image.

  9. Let system boot, check partitions again:

  $ lsblk

  NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
  sda      8:0    0 931.5G  0 disk
  |-sda1   8:1    0   512M  0 part /boot/efi
  `-sda2   8:2    0   931G  0 part /
  sdb      8:16   0 931.5G  0 disk

  We can see that the changes were reverted.

  If I remove cloud-init, I can successfully re-partition and reboot,
  without the changes being reverted.
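
  A quick way to check for this after each boot is to compare the
  partition size reported by lsblk against the size that was set by hand.
  Below is a minimal sketch, assuming an lsblk that supports --json and
  --bytes. partition_size and read_lsblk are hypothetical helpers, and the
  sample data is illustrative rather than taken from the attached logs.

```python
import json
import subprocess


def partition_size(lsblk_json, name):
    """Return the size in bytes of the named device from lsblk JSON output."""
    def walk(devices):
        for dev in devices:
            if dev.get("name") == name:
                return int(dev["size"])  # some lsblk versions report strings
            found = walk(dev.get("children", []))
            if found is not None:
                return found
        return None

    return walk(json.loads(lsblk_json).get("blockdevices", []))


def read_lsblk():
    """Run lsblk and return its JSON output as text."""
    return subprocess.run(
        ["lsblk", "--json", "--bytes", "--output", "NAME,SIZE"],
        check=True, capture_output=True, text=True,
    ).stdout


# Illustrative sample: sda2 has grown back to almost the whole disk.
SAMPLE = json.dumps({"blockdevices": [{
    "name": "sda", "size": "1000204886016",
    "children": [
        {"name": "sda1", "size": "536870912"},
        {"name": "sda2", "size": "999666966528"},
    ],
}]})
expected_max = 200 * 1024**3  # roughly the 200GiB end set in parted
regrown = partition_size(SAMPLE, "sda2") > expected_max
print(regrown)
```

  On a live system, pass read_lsblk() instead of SAMPLE.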

  Attached logs before and after repartition.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1839491/+subscriptions



[Yahoo-eng-team] [Bug 1380965] [NEW] Floating IPs don't have instance ids in Juno

2014-10-14 Thread Björn Tillenius
Public bug reported:

In Icehouse, when I associate a floating IP with an instance, the Nova
API for listing floating IPs (/os-floating-ips) gives you the instance
ID of the associated instance:

  {"floating_ips": [{"instance_id": "82c2aff3-511b-4e9e-8353-79da86281dfd",
                     "ip": "10.1.151.1", "fixed_ip": "10.10.0.4",
                     "id": "8113e71b-7194-447a-ad37-98182f7be80a",
                     "pool": "ext_net"}]}


With the latest RC for Juno, the instance_id always seems to be null:

  {"floating_ips": [{"instance_id": null, "ip": "10.96.201.0",
                     "fixed_ip": "10.10.0.8",
                     "id": "00ffd9a0-5afe-4221-8913-7e275da7f82a",
                     "pool": "ext_net"}]}
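
A minimal sketch of how a client could spot the difference, using the two
responses above as sample data. inconsistent_entries is a hypothetical
helper, and the responses are assumed to be JSON with the keys shown.

```python
import json


def inconsistent_entries(response_text):
    """Return floating IPs that have a fixed_ip but a null instance_id."""
    body = json.loads(response_text)
    return [
        fip["ip"]
        for fip in body["floating_ips"]
        if fip["fixed_ip"] is not None and fip["instance_id"] is None
    ]


ICEHOUSE = json.dumps({"floating_ips": [{
    "instance_id": "82c2aff3-511b-4e9e-8353-79da86281dfd",
    "ip": "10.1.151.1", "fixed_ip": "10.10.0.4",
    "id": "8113e71b-7194-447a-ad37-98182f7be80a", "pool": "ext_net"}]})
JUNO = json.dumps({"floating_ips": [{
    "instance_id": None,
    "ip": "10.96.201.0", "fixed_ip": "10.10.0.8",
    "id": "00ffd9a0-5afe-4221-8913-7e275da7f82a", "pool": "ext_net"}]})

print(inconsistent_entries(ICEHOUSE))
print(inconsistent_entries(JUNO))
```

An associated floating IP has a fixed_ip, so an entry with a fixed_ip but
no instance_id is what distinguishes the Juno behaviour from a floating
IP that is simply unassociated.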

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1380965

Title:
  Floating IPs don't have instance ids in Juno

Status in OpenStack Compute (Nova):
  New

Bug description:
  In Icehouse, when I associate a floating IP with an instance, the Nova
  API for listing floating IPs (/os-floating-ips) gives you the instance
  ID of the associated instance:

    {"floating_ips": [{"instance_id": "82c2aff3-511b-4e9e-8353-79da86281dfd",
                       "ip": "10.1.151.1", "fixed_ip": "10.10.0.4",
                       "id": "8113e71b-7194-447a-ad37-98182f7be80a",
                       "pool": "ext_net"}]}

  
  With the latest RC for Juno, the instance_id always seems to be null:

    {"floating_ips": [{"instance_id": null, "ip": "10.96.201.0",
                       "fixed_ip": "10.10.0.8",
                       "id": "00ffd9a0-5afe-4221-8913-7e275da7f82a",
                       "pool": "ext_net"}]}

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1380965/+subscriptions
