unfortunately i don't think so. we're pretty good about assigning addresses, but we're still human. i don't see any evidence of a dup'd address, but i'll keep looking.
thanks

On Thu, Oct 30, 2025 at 8:10 PM Mohr, Rick <[email protected]> wrote:
>
> Michael,
>
> It might be a long shot, but is there any chance another machine has the same
> IP address as the one having problems?
>
> --Rick
>
>
> On 10/30/25, 3:09 PM, "lustre-discuss on behalf of Michael DiDomenico via
> lustre-discuss" wrote:
>
> our network is running 2.15.6 everywhere on rhel9.5, we recently built a new
> machine using 2.15.7 on rhel9.6 and i'm seeing a strange problem. the client
> is ethernet connected to ten lnet routers which bridge ethernet to
> infiniband. i can mount the client just fine, read/write data, but then
> several hours later, the client marks all the routers offline. the only
> recovery is to lazy unmount, lustre_rmmod, and then restart the lustre mount
>
> nothing unusual comes out in the journal/dmesg logs. to lustre it "looks"
> like someone pulled the network cable, but there's no evidence that this has
> happened physically or even at the switch/software layers
>
> we upgraded two other machines to see if the problem replicates, but so far
> it hasn't. the only significant difference between the three machines is the
> one with the problem has heavy container (podman) usage, the others have
> zero. i'm not sure if this is a cause or just a red herring
>
> any suggestions?
>
>
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
