Dear Fernando,
I'm not sure if those files contribute to the quota, but I would assume
that the ones on the OSTs consume disk quota and the ones on the MDT
consume inode quota.
As long as they are in the lost+found directory they are not visible to
the users, but they may contain data which
Are there a lot of inodes moved to lost+found by the fsck, which contribute to
the occupied quota now?
- Original Message -
From: Fernando Pérez
To: lustre-discuss@lists.lustre.org
Sent: Tue, 16 Apr 2019 16:24:13 +0200 (CEST)
Subject: Re: [lustre-discuss] lfsck repair quota
Thank
Hello Roland,
there is a nice collection of lustre monitoring tools on the lustre wiki:
http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide
which also contains a couple of references. One of them is lltop, which
has already been mentioned a couple of times and that's what came to my
On 11/7/18 9:44 PM, Riccardo Veraldi wrote:
> Anyway I was wondering if something different is needed for mlx5 and
> what are the suggested values in that case ?
>
> Anyone has experience with mlx5 LNET performance tunings ?
Hi Riccardo,
We have recently integrated mlx5 nodes into our fabric,
:55 PM, Martin Hecht wrote:
> Hi,
>
> I'm trying to build lustre 2.11 from source, with ldiskfs on CentOS 7.4.
>
> patching the kernel for ldiskfs worked fine, I have installed and booted
> the patched kernel as well as the devel-rpm, but when I run `make rpms`
> it exits with
Hi,
I'm trying to build lustre 2.11 from source, with ldiskfs on CentOS 7.4.
patching the kernel for ldiskfs worked fine, I have installed and booted
the patched kernel as well as the devel-rpm, but when I run `make rpms`
it exits with the following errors:
Processing files:
On 03/15/2018 04:48 PM, Steve Thompson wrote:
> If I go with one OST per system (one zpool comprising 8 x 6 RAIDZ2
> vdevs), I will have a lustre f/s comprised of two 60 TB OST's and two
> 192 TB OST's (minus RAIDZ2 overhead). This is obviously a big mismatch
> between OST sizes.
Depending on how
Hi Parag,
can you lctl ping 10.2.1.204@o2ib from the mgs node and from the mds
now? I have seen on the list that you were able to load the modules, but
well, if lnet is not working on the ib this might be the reason for
the errors you are seeing.
Regards,
Martin
On 11/08/2017 09:15 AM, Parag
Hi Parag,
please reply to the list or keep it in cc at least
On 10/30/2017 01:21 PM, Parag Khuraswar wrote:
> Hi Martin,
>
> The problem got resolved.
> But I am not able to see ib in 'lctl list_nids' output
> My lnet.conf file entry is 'options lnet networks=o2ib(ib0)' This file is
> not
Hi,
On 10/30/2017 09:56 AM, Parag Khuraswar wrote:
> Hi,
>
> I am installing lustre cloned from github.
Hmm... there are a few lustre related repositories on github.
I would prefer the upstream Lustre git repository managed by Intel
git://git.hpdd.intel.com unless you are interested in specific
Hello,
we use the flock mount option on all our lustre systems (currently some
2.5 versions) and are not aware of any issues due to that.
If your applications run on a single node (or require locks only
locally) you could also try localflock.
localflock has less performance impact than the
I have seen this, too, on SL6, build went smoothly, but installation
failed. A few months before 2.9 was tagged on master the build and
install went smoothly. I'm not using zfs by the way. Unfortunately, I
didn't find the time yet, to investigate this more deeply.
Cheers, Martin
On 12/20/2016
Hi Kevin,
I think your proposed lnet config line is correct and it would add tcp0.
If you add a new lnet on the servers you have to reload the lnet module,
which implies that you have to restart lustre (you don't have to reboot
if unloading the modules works smoothly, i.e. unmounting all targets,
Hi James,
I'm not aware of a ready-to-use tool, but if you have captured the
output of e2fsck you can use that as a basis for a script that puts the
files back to their original location.
e2fsck usually prints out the full path and the inode numbers of the
files/directories which it moves to
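
A minimal sketch of such a script, under stated assumptions: the mapping from inode number to original path has already been extracted from the captured e2fsck output (the exact output format is not shown here, so no parser is attempted), and the entries in lost+found carry the usual `#<inode>` names. The function name `plan_restore` and its parameters are hypothetical. It only builds a list of moves for review and does not touch the file system.

```python
from pathlib import PurePosixPath


def plan_restore(lost_found_entries, inode_to_path, lost_found_dir="lost+found"):
    """Pair each lost+found entry with its original path, if known.

    lost_found_entries: entry names as listed in lost+found ('#<inode>').
    inode_to_path: dict of inode number -> original path, assumed to have
                   been extracted from the captured e2fsck output.
    Returns (moves, unmatched). Nothing is moved here.
    """
    moves, unmatched = [], []
    for name in lost_found_entries:
        # Only names of the form '#<decimal inode>' can be looked up.
        if not name.startswith("#") or not name[1:].isdecimal():
            unmatched.append(name)
            continue
        target = inode_to_path.get(int(name[1:]))
        if target is None:
            unmatched.append(name)
        else:
            source = str(PurePosixPath(lost_found_dir) / name)
            moves.append((source, target))
    return moves, unmatched


if __name__ == "__main__":
    # Hypothetical inode numbers and paths, for illustration only.
    moves, unmatched = plan_restore(["#12", "#99"], {12: "/data/a.txt"})
    for source, target in moves:
        print(f"mv {source} {target}")
    print("no original path known for:", unmatched)
```

Printing the planned moves first, instead of renaming directly, leaves room to check the list by hand before anything is changed.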
Hi,
I think your client doesn't have the o2ib lnet (it should appear in the
output of the lctl ping, even if you ping on the tcp lnet).
In your /etc/modprobe.d/lustre.conf o2ib is associated with the ib0
interface, but your /var/log/messages talks about ib1.
If it is a dual port card where just
c file. While heartbeat is one option for HA on
> servers, it definitely should not be required. Could you please file a Jira
> ticket with details.
>
> Cheers, Andreas
>
>> On Jun 29, 2016, at 11:36, Martin Hecht <he...@hlrs.de> wrote:
>>
>> Hello,
>>
>
Hello,
I have just seen that you managed to mount with a different kernel, but
let me come back to this error when building your own rpms for a
specific kernel.
Regardless of whether you use it or not, I believe on lustre servers you need
to have heartbeat installed nowadays. This is not installed by
I think whether the apache uid and gid need to be known on the mds
depends on whether you have configured mdt.group_upcall or not.
If not, the group memberships are checked on the lustre client against
its /etc/group (or ldap if that's configured).
On 03/09/2016 06:59 AM, Philippe
Hi,
comments inline...
On 11/04/2015 01:34 PM, Patrick Farrell wrote:
> Our observation at the time was that lfsck did not add the fid to the ..
> dentry unless there was already space in the appropriate location.
Ok, I might have been wrong in this point and some manual mv by the
users was
On 11/04/2015 03:23 AM, Patrick Farrell wrote:
> PAF: Remember, the specific conditions are pretty tight. Created under 1.8,
> not empty (if it's empty, the .. dentry is not misplaced when moved) but also
> non-htree, then moved with dirdata enabled, and then grown to this larger
> size. How
Hi Chris and Patrick,
I was sick last week, so I only found this conversation today,
sorry
On 10/27/2015 05:06 PM, Patrick Farrell wrote:
> If you read LU-5626 carefully, there's an explanation of the exact nature of
> the damage, and having that should let you make partial recoveries
Hi,
you can use ll_recover_lost_found_objs to recover the files in lost+found to
their original location.
I think this should be the first step.
Also these messages look a bit scary to me:
Oct 7 13:02:04 OSS50 kernel: LustreError: 0-0: Trying to start OBD
Lustre-OST003b_UUID using the wrong
p 24, 2015 at 1:43 AM, Martin Hecht <he...@hlrs.de> wrote:
>
>> On 09/23/2015 02:38 AM, Exec Unerd wrote:
>>> I made a typo when setting failnode/servicenode parameters, but I can't
>>> figure out how to remove the failnode parameter entirely
>>>
2ib0:/testfs /mnt/testfs
> both: mount -v -t lustre 172.16.10.1@o2ib0,192.168.10.1@tcp0:/testfs
I think here it should be a colon between the two MGS nids:
mount -v -t lustre 172.16.10.1@o2ib0:192.168.10.1@tcp0:/testfs
> /mnt/testfs
>
> Everything should be happy?
>
> O
On 09/23/2015 02:39 AM, Exec Unerd wrote:
> My environment has both TCP and IB clients, so my Lustre config has to
> accommodate both, but I'm having a hard time figuring out the proper syntax
> for it. Theoretically, I should be able to use comma-separated interfaces
> in the mgsnode parameter
On 09/23/2015 02:38 AM, Exec Unerd wrote:
> I made a typo when setting failnode/servicenode parameters, but I can't
> figure out how to remove the failnode parameter entirely
>
> I can change the failnode NIDs, but I can't figure out how to completely
> remove "failnode" from the system.
>
> Does
On 09/24/2015 05:33 PM, Chris Hunter wrote:
> [...]
>>2. What's the best way to trace the TCP client interactions to see
>> where
>>it's breaking down?
> If lnet is running on the client, you can try "lctl ping"
> eg) lctl ping 172.16.10.1@o2ib
>
> I believe a lustre mount uses ipoib for
chris hunter
>
>> On 9/10/15 11:17 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>>> Lewis,
>>>
>>> I did an upgrade from Lustre 1.8.6 to 2.4.3 on our servers, and for the
>>> most part things went pretty good. I’ll chime in on a couple of Martin’s
>
tre 1.8.6 to 2.4.3 on our servers, and for
>> the most part things went pretty good. I’ll chime in on a couple of
>> Martin’s points and mention a few other things.
>>
>>> On Sep 10, 2015, at 9:30 AM, Martin Hecht <he...@hlrs.de> wrote:
>>>
>>> In any case th
On 09/11/2015 05:23 AM, Dilger, Andreas wrote:
> On 2015/09/10, 6:54 PM, "Chris Hunter" wrote:
>
>> We experienced file corruption on several OSTs. We proceeded through
>> recovery using e2fsck & ll_recover_lost_found_obj tools.
>> Following these steps, e2fsck came out
Hi Lewis,
it's difficult to tell how much data loss was actually related to the
lustre upgrade itself. We have upgraded 6 file systems and we had to do
it more or less in one shot, because at that time they were using a
common MGS server. All servers of one file system must be on the same
level
Hi Lewis,
Yes, for lustre 2.x you have to "upgrade" the OS, which basically means
a reinstall of a CentOS 6.x (because there is no upgrade path across
major releases), then install the lustre packages and the lustre-patched
kernel, and then the pain begins.
We had a lot of trouble when we upgraded
On 09/03/2015 07:22 AM, E.S. Rosenberg wrote:
> On Wed, Sep 2, 2015 at 8:47 PM, Wahl, Edward wrote:
>
>> That would be my guess here. Any chance this is across NFS? Seen that a
>> great deal with this error, it used to cause crashes.
>>
> Strictly speaking it is not, but it may
Maybe it's too late anyway, but I found this thread in my unread mail:
On 09/01/2015 06:38 PM, Colin Faber wrote:
> If you're just looking to reformat the drive, then just reformat the drive:
>
> http://linux.die.net/man/8/mkfs.ext4
It's still unclear what he actually did. Maybe he
Hi Chris,
On 09/02/2015 07:18 AM, Chris Hunter wrote:
> Hi Andreas
>
> On 09/01/2015 07:22 PM, Dilger, Andreas wrote:
>> On 2015/09/01, 7:59 AM, "lustre-discuss on behalf of Chris Hunter"
>> > chris.hun...@yale.edu> wrote:
>>
>>> Hi Andreas,
Hi,
it might help to disable quota using tune2fs and re-enable it again on
the ext2 level on all devices, see LU-3861.
(BTW you don't need the e2fsprogs mentioned in the bug, there was an
official release last year in September).
You have to stop lustre for the tune2fs run and it takes some
Hi John,
on the Parameters line the different nodes should not be separated by
:. Each node should be specified by a separate mgsnode=... or
failover.node=... statement. I'm not sure if separating the two
interfaces of each node by , is correct here, or if this should be
split again into two
it with the derp
option.
thanks,
Kurt
- Original Message -
From: Martin Hecht
To: Kurt Strosahl
Sent: Thursday, May 21, 2015 12:51:41 PM
Subject: Re: [lustre-discuss] Exporting a lustre mounted directory via nfs
Hi Kurt,
some time ago we had a client re-exporting a lustre 1.8.x
Hi,
a few more things which may play a role:
- as you are suspecting, the difference of used blocks vs. used bytes
might be the reason, especially if there are many very small files, but
there are more possible causes:
- some tools use 2^10 bytes and some others use 1000 bytes as kb which
might
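
To put a number on the unit mismatch mentioned above, here is a small sketch that expresses one byte count in both conventions. The 500 GiB figure and the function name `report_difference` are made up for illustration and are not taken from any real quota output.

```python
KIB = 2 ** 10  # 1024 bytes, the binary unit some tools label "kb"
KB = 1000      # 1000 bytes, the decimal unit other tools label "kb"


def report_difference(num_bytes):
    """Return the byte count in both units and the relative gap between them."""
    in_kib = num_bytes / KIB
    in_kb = num_bytes / KB
    return {
        "kib": in_kib,
        "kb": in_kb,
        # How much larger the decimal figure looks than the binary one.
        "relative_gap": (in_kb - in_kib) / in_kib,
    }


if __name__ == "__main__":
    usage = report_difference(500 * 2 ** 30)  # hypothetical usage of 500 GiB
    print(f"{usage['kib']:.0f} KiB vs {usage['kb']:.0f} kB "
          f"({usage['relative_gap']:.1%} apart)")
```

The gap is the same ratio (1024/1000) at the kilobyte level regardless of the size, and it grows with each larger prefix, so it is worth ruling out before looking for other causes.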
Hi bob,
just to make sure: You already followed:
http://wiki.lustre.org/index.php/Handling_File_System_Errors, especially
the steps for e2fsck linked there?
If you did *not yet* do any write operation to the damaged OST, you
might want to back up the whole OST first, using dd for instance (if