Hi Manfred, hi Packmans,

On Saturday, 15 August 2020, 09:33:36 CEST, Manfred Hollstein wrote:
> I don't know if this is caused by a planned downtime, but
> pmbs.links2linux.org cannot be reached at the moment.
>
> Can you please take a look?

That was not really planned, but ...

I had the long-standing issue with the defective network adapter in one of my hosts, which caused the frequent downtimes several months ago. Some of it was caused by the iSCSI connection to my storage devices, which are now connected via NFSv4.1 - and rock-solid so far.

For some time now I have had several issues that come with the territory of running uncertified hardware (Dell gen11 servers with VMware 6.7U3 - certified up to 6.0; Intel PRO/1000 ET, certified and supported until VMware 6.7U1). The problem surfaced when putting heavy load on the network interfaces of the Intel card, for instance during a live migration from one host to the other. The interface simply stopped and could only be revived by unplugging and replugging the cable, or by shutting down and re-enabling the port on the switch. Known problem with the card.

I bought 2 HP NC375T quad-port cards and replaced the Intel ET last Wednesday. It looked good at first glance, but since Wednesday everything was slow. I mean: S-L-O-W. I thought it might be due to over-committing the buildwk[1-4] workers with 12 vCPUs on 8-core physical CPUs (with HT), and to slow memory assignment in the NUMA architecture of the hosts (each CPU has its "private" RAM and can access RAM belonging to other CPUs only more slowly).

To check this I wanted to shut down all VMs, reboot the hosts and boot everything up again. That should not take more than 15 minutes. Unfortunately, I found out that there is still a problem with the additional network adapters. During boot the cards cannot be trained, and the system stops there. A warm reboot can fix this. There might still be a problem with the network cards, or a problem with the PCIe riser - the machines are about 8.5 years old.

After booting up the first machine I could not migrate the VMs. I reconfigured the vMotion interfaces to use the on-board network cards and could then migrate. I have now reconfigured both hosts to rely on just the 4 on-board cards.
I have only 3 cards connected to the switch, so there might be some congestion until Monday, when I can plug in the 4th card and reconfigure.

Meanwhile PMBS is up again, and I will add the workers buildwk3 and -4 in the next few minutes and check how the system behaves.

Additionally, I have ordered another 2 quad-port cards, Broadcom this time, and will try them sometime next week, or whenever they arrive.

Sorry for the unexpected downtime, it was indeed 3 hours.

Now enough with the chit-chat, go back to work :)

Greetings,
Stefan

-- 
Stefan Botter zu Hause
Bremen
_______________________________________________ Packman mailing list [email protected] http://lists.links2linux.de/cgi-bin/mailman/listinfo/packman
