Dear all,
Since last week we have been facing 'hanging kernel threads' that cause our
Lustre environment (Rocky 8.7 / Lustre 2.15.2) to hang.
The errors:
Dec 18 10:36:04 hb-oss01 kernel: LustreError: 137-5: scratch-OST0084_UUID: not available for connect from 172.23.15.246@tcp30 (no target). If you are running
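For what it's worth, a '137-5 ... (no target)' message usually means the client reached the node but no OST service is running there. A quick triage sketch (the NID below is just the one from the log above):

```shell
# On the OSS: list configured Lustre devices and their state
lctl dl

# Check whether the OST is actually mounted on this server
mount -t lustre

# From a client: verify basic LNet connectivity to the OSS NID
lctl ping 172.23.15.246@tcp30
```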
Hi all,
We are trying to get a secure Lustre client mount working with a distributed key:
[root@dh5-mds01 ger]# rpm -qa | grep lustre
lustre-2.15.3-1.el8.x86_64
kmod-lustre-2.15.3-1.el8.x86_64
lustre-osd-ldiskfs-mount-2.15.3-1.el8.x86_64
kmod-lustre-osd-ldiskfs-2.15.3-1.el8.x86_64
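For anyone following along, the shared-secret key (SSK) flow is roughly: generate a key with lgss_sk, distribute it to the clients, and mount with skpath. A minimal sketch, where the filesystem name 'scratch', the key path, and the lgss_sk flags are illustrative (verify against 'man lgss_sk' for your version):

```shell
# Generate a shared key for the filesystem (then distribute it to clients)
lgss_sk -t server,client -f scratch -w /etc/lustre/scratch.key

# On the client: mount pointing at the distributed key
mount -t lustre -o skpath=/etc/lustre/scratch.key \
    172.23.34.214@tcp:/scratch /mnt/scratch
```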
We followed the instructions:
Dear community,
We at the University of Groningen (the Netherlands) are looking into doing
QoS on our Lustre file system, to prevent users from suffocating our
filesystems. Lustre QoS using TBF is mentioned in a couple of
presentations/slides, but I have failed to find any useful documentation on
h
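From what I could piece together, TBF is an NRS policy enabled per service via lctl; a sketch of the basic incantation, where the rule name, NID range, and rate are made-up examples (the full rule grammar is in the Lustre Operations Manual):

```shell
# On each OSS: enable the TBF policy (keyed by client NID) for OST I/O
lctl set_param ost.OSS.ost_io.nrs_policies="tbf nid"

# Limit a range of client NIDs to 500 RPCs/s (all values illustrative)
lctl set_param ost.OSS.ost_io.nrs_tbf_rule="start limit_nodes nid={172.23.15.[1-254]@tcp} rate=500"
```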
Here at the University of Groningen we run a Lustre setup that has some
issues with client nodes being evicted by the metadata server:
Kernel: CentOS 7.5 3.10.0-862.2.3-lustre
Lustre: 2.10.4
Network: IB / 10 Gb Ethernet
Logs on the client:
Dec 19 06:45:28 dh-node03 kernel: [1952901.506173] LustreError: 11-0
Persistent mount opts: errors=remount-ro
Parameters: mgsnode=172.23.34.214@tcp:172.23.34.213@tcp
The volumes are now able to mount, with SELinux disabled!
On Thu, May 11, 2017 at 10:34 AM, Strikwerda, Ger
wrote:
> Hi Colin,
>
> [root@umcg-storage03 ~]# tunefs.lustre --print /dev/
> tunefs.lustre --print
>
> Do you have this option set within your mount options list?
>
> -cf
>
>
> On Wed, May 10, 2017 at 7:19 AM, Strikwerda, Ger
> wrote:
>
>> Hi all,
>>
>> On an OSS with SELinux disabled we get a strange, probably SELinux-related
Hi all,
On an OSS with SELinux disabled we get a strange, probably SELinux-related
error when we want to mount the OST:
[root@umcg-storage03 /]# cat /etc/selinux/config
SELINUX=disabled
# mount -t lustre /dev/dm-3 /mnt/umcgst08-01
dmesg log:
Lustre: Lustre: Build Version: 2.8.0-RC5--PRISTINE-2.6
The second option. We did not trust 'sminfo', so why not double-check on the
IB switch, or at least look at the logs of the IB switch to see what happens
over there.
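For reference, the host-side checks meant here come from the infiniband-diags package; a sketch:

```shell
# Show which subnet manager(s) the fabric reports as active
sminfo

# Local HCA/port state (ports should be Active with a valid LID)
ibstat

# Fabric-wide sweep; flags duplicate/missing SMs, bad links, etc.
ibdiagnet
```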
On Mon, May 1, 2017 at 3:15 PM, E.S. Rosenberg
wrote:
>
>
> On Mon, May 1, 2017 at 3:45 PM, Strikwerda, Ger
18 PM, E.S. Rosenberg
wrote:
>
>
> On Mon, May 1, 2017 at 11:46 AM, Strikwerda, Ger
> wrote:
>
>> Hi all,
>>
>> Our clients-failed-to-mount / lctl ping horror turned out to be a failing
>> subnet manager issue. We did not see an issue running '
>> > This means that the LND didn't connect at startup time, but I don't
>> know what the cause is.
>> > The error that generates this message is IB_CM_REJ_CONSUMER_DEFINED,
>> but I don't know enough about IB to tell you what that means. Some of the
>> later
> a lot of data.
>
> I guess you already tried removing the Lustre drivers and adding them
> again?
> lustre_rmmod
> modprobe -v lustre
>
> And check dmesg for any errors...
>
>
> On Mon, Apr 24, 2017 at 12:43 PM Strikwerda, Ger
> wrote:
>
>> Hi Raj,
e this is the case.
>>
>> You wouldn't want to put the MGS into capturing debug messages, as there
>> will be a lot of data.
>>
>> I guess you already tried removing the Lustre drivers and adding them
>> again?
>> lustre_rmmod
>> modprobe -v lustre
>>
, 2017 at 12:27 PM Strikwerda, Ger
> wrote:
>
>> Hi Eli,
>>
>> Nothing can be mounted on the Lustre filesystems so the output is:
>>
>> [root@pg-gpu01 ~]# lfs df /home/ger/
>> [root@pg-gpu01 ~]#
>>
>> Empty..
>>
>>
>>
>> On
Hi Eli,
Nothing can be mounted on the Lustre filesystems so the output is:
[root@pg-gpu01 ~]# lfs df /home/ger/
[root@pg-gpu01 ~]#
Empty..
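When 'lfs df' prints nothing like this, the client simply has no Lustre mounts; a few quick client-side checks (the MGS NID below is illustrative, taken from the mount parameters quoted earlier):

```shell
# Any Lustre filesystems mounted at all?
mount -t lustre

# Are the Lustre modules loaded and devices configured?
lctl dl

# Can we reach the MGS over LNet?
lctl ping 172.23.34.214@tcp
```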
On Mon, Apr 24, 2017 at 7:24 PM, E.S. Rosenberg wrote:
>
>
> On Mon, Apr 24, 2017 at 8:19 PM, Strikwerda, Ger
> wrote:
>
>> Hal
24, 2017 at 7:16 PM, E.S. Rosenberg
wrote:
>
>
> On Mon, Apr 24, 2017 at 8:13 PM, Strikwerda, Ger
> wrote:
>
>> Hi Raj (and others),
>>
>> In which file should I state the credits/peer_credits stuff?
>>
>> Perhaps relevant config-files:
>>
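In case it helps others: credits and peer_credits are ko2iblnd module parameters, normally set in a modprobe config file and kept identical on both ends of a connection; a sketch with illustrative values:

```shell
# /etc/modprobe.d/ko2iblnd.conf (values are examples, tune for your fabric)
options ko2iblnd peer_credits=128 credits=1024

# Reload the modules for the change to take effect:
lustre_rmmod
modprobe -v lustre
```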
n compare between working hosts and non-working hosts.
> Thanks
> _Raj
>
> On Mon, Apr 24, 2017 at 10:10 AM Strikwerda, Ger
> wrote:
>
>> Hi Rick,
>>
>> Even without iptables rules and loading the correct modules afterwards,
>> we get the same results:
les.
>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>
> > On Apr 24, 2017, at 10:19 AM, Strikwerda, Ger
> wrote:
> >
> > Hi Russell,
> >
> > Thanks
> > before doing so. In the past, they have sometimes told me that the
> > "latest FW version available for this device" reported by ibdiagnet is
> > incorrect and should be ignored. Of course, in other cases, newer
> > firmware versions were in fact available and they did
er suggestions.
>
> Best of luck,
> Rusty D.
>
> On Mon, Apr 24, 2017 at 10:19 AM, Strikwerda, Ger
> wrote:
> > Hi Russell,
> >
> > Thanks for the IB subnet clues:
> >
> > [root@pg-gpu01 ~]# ibv_devinfo
> > hca_id: mlx
Dekema
>
>
>
>
> On Mon, Apr 24, 2017 at 9:57 AM, Strikwerda, Ger
> wrote:
> > Hi everybody,
> >
> > Here at the University of Groningen we are now experiencing a strange
> > Lustre error. If a client reboots, it fails to mount the Lustr
Hi everybody,
Here at the University of Groningen we are now experiencing a strange
Lustre error. If a client reboots, it fails to mount the Lustre storage.
The client is not able to reach the MGS service. The storage and nodes are
communicating over IB, until now without any problems. It loo