Very interesting, thanks for the info and history on this.  The reason for the 
different behavior makes sense after reading about the history and use cases.

From: Andreas Dilger <adil...@whamcloud.com>
Date: Monday, April 29, 2024 at 12:29 PM
To: Simon Guilbault <simon.guilba...@calculquebec.ca>
Cc: Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] 
<darby.vicke...@nasa.gov>, lustre-discuss@lists.lustre.org 
<lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] [EXTERNAL] [BULK] Files created in append mode 
don't obey directory default stripe count
CAUTION: This email originated from outside of NASA.  Please take care when 
clicking links or opening attachments.  Use the "Report Message" button to 
report suspicious messages to the NASA SOC.


Simon is exactly correct.  This is expected behavior for files opened with 
O_APPEND, at least until LU-12738 is implemented.  Since O_APPEND writes are 
(by definition) entirely serialized, having multiple stripes on such files is 
mostly useless and just adds overhead.

Feel free to read https://jira.whamcloud.com/browse/LU-9341 for the very 
lengthy saga on the history of this behavior.

Cheers, Andreas


On Apr 29, 2024, at 10:42, Simon Guilbault 
<simon.guilba...@calculquebec.ca> wrote:

This is the expected behaviour. In the original implementation of PFL, when a 
file was opened in append mode, the lock from 0 to EOF instantiated every 
component of the PFL layout. Our system has a PFL layout with 1 stripe up to 
1 GB, then 4 stripes, and then 32 stripes once the file gets very large. 
That was a problem for software writing 4 KB log files (like slurm.out): 
because of the append mode, those files were created with more than 32 stripes. 
This was patched a few releases ago. The behaviour can be changed, but I would 
recommend keeping 1 stripe for files that are opened in append mode.

From the manual:
O_APPEND mode. When files are opened for append, they instantiate all 
uninitialized components expressed in the layout. Typically, log files are 
opened for append, and complex layouts can be inefficient.
Note: The mdd.*.append_stripe_count and mdd.*.append_pool options can be used 
to specify special default striping for files created with O_APPEND.
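
A rough sketch of how those two tunables might be inspected or changed from an MDS node. The helper names and the LCTL variable are hypothetical and exist only for this example. The meaning of the value 0 (turn the special O_APPEND handling off so that files follow the directory default) is an assumption that should be checked against the manual for the release in use.

```shell
#!/bin/sh
# Sketch only: run on an MDS node as root.
# LCTL and both function names are hypothetical helpers for this example;
# the parameter names are the ones quoted from the manual above.
LCTL="${LCTL:-lctl}"

# Print the current O_APPEND striping defaults for every MDT on this node.
show_append_defaults() {
    "$LCTL" get_param 'mdd.*.append_stripe_count' 'mdd.*.append_pool'
}

# Set the stripe count used for files created with O_APPEND.
# Assumption: 0 turns the special handling off, so such files follow the
# directory default layout; verify against the manual for your release.
set_append_stripe_count() {
    "$LCTL" set_param "mdd.*.append_stripe_count=$1"
}
```

For example, `show_append_defaults` followed by `set_append_stripe_count 0`. A plain set_param is not expected to survive a remount, so a persistent setting would need to be made separately.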

On Mon, Apr 29, 2024 at 11:21 AM Vicker, Darby J. (JSC-EG111)[Jacobs 
Technology, Inc.] via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:
Wow, I would say that is definitely not expected.  I can recreate this on both 
of our Lustre file systems.  One is community Lustre 2.14, the other is a DDN 
EXAScaler.  Shown below is our community Lustre, but we also have a 3-segment 
PFL on our EXAScaler and the behavior is the same there.

$ echo > aaa
$ echo >> bbb
$ lfs getstripe aaa bbb
aaa
  lcm_layout_gen:    3
  lcm_mirror_count:  1
  lcm_entry_count:   3
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   33554432
      lmm_stripe_count:  1
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 6
      lmm_objects:
      - 0: { l_ost_idx: 6, l_fid: [0x100060000:0xace8112:0x0] }

    lcme_id:             2
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 33554432
    lcme_extent.e_end:   10737418240
      lmm_stripe_count:  4
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1

    lcme_id:             3
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 10737418240
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  8
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1

bbb
lmm_stripe_count:  1
lmm_stripe_size:   2097152
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 3
        obdidx           objid           objid           group
             3       179773949      0xab721fd                0


From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of 
Otto, Frank via lustre-discuss <lustre-discuss@lists.lustre.org>
Date: Monday, April 29, 2024 at 8:33 AM
To: lustre-discuss@lists.lustre.org <lustre-discuss@lists.lustre.org>
Subject: [EXTERNAL] [BULK] [lustre-discuss] Files created in append mode don't 
obey directory default stripe count

See subject. Is it a known issue? Is it expected? Easy to reproduce:


# lfs getstripe .
.
stripe_count:  4 stripe_size:   1048576 pattern:       raid0 stripe_offset: -1

# echo > aaa
# echo >> bbb
# lfs getstripe .
.
stripe_count:  4 stripe_size:   1048576 pattern:       raid0 stripe_offset: -1

./aaa
lmm_stripe_count:  4
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 0
        obdidx           objid           objid           group
             0            2830          0xb0e                0
             1            2894          0xb4e                0
             2            2831          0xb0f                0
             3            2895          0xb4f                0

./bbb
lmm_stripe_count:  1
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 4
        obdidx           objid           objid           group
             4            2831          0xb0f                0



As you see, file "bbb" is created with stripe count 1 instead of 4.
Observed in Lustre 2.12.x and Lustre 2.15.4.
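
A minimal sketch of the same reproduction wrapped in a shell function, so that only the stripe counts are printed. The function name is hypothetical, and the sketch assumes the target directory sits on a Lustre mount with the lfs tool installed; without lfs it only creates the two files.

```shell
#!/bin/sh
# Sketch only: compare_append_striping is a hypothetical helper name.
# Creates one file with ">" and one with ">>" in the given directory, then
# prints each file's stripe count if the lfs tool is installed.
compare_append_striping() {
    dir="${1:-.}"
    echo > "$dir/aaa"     # plain create: uses the directory default layout
    echo >> "$dir/bbb"    # O_APPEND create: uses the append striping defaults
    for f in "$dir/aaa" "$dir/bbb"; do
        if command -v lfs >/dev/null 2>&1; then
            printf '%s: stripe count %s\n' "$f" "$(lfs getstripe -c "$f")"
        else
            printf '%s: created, lfs not available\n' "$f"
        fi
    done
}
```

Calling `compare_append_striping .` in a directory with a default stripe count of 4 should, going by the output above, report 4 for aaa and 1 for bbb.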

Thanks,
Frank

--
Dr. Frank Otto
Senior Research Infrastructure Developer
UCL Centre for Advanced Research Computing
Tel: 020 7679 1506
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud