Hi Elena & Dana,

Shouldn't disabling file locking be a runtime mechanism rather than a configure-time one? What if you want to use the same binary on the same hardware with different file system configurations, or on the same hardware writing to different file systems, or if the sysadmin changes their mind daily about enabling or disabling file locking?

           Werner
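
As an illustration of the runtime mechanism asked about above, here is a minimal sketch, assuming a POSIX system with flock(). It is not HDF5 code: the environment variable name DEMO_USE_FILE_LOCKING and both function names are hypothetical, chosen only for this example. The idea is that the same binary consults the environment each time it is about to lock a file, so the decision can differ per run or per file system.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>

/* Hypothetical variable name, used only in this sketch. */
#define DEMO_LOCK_ENV "DEMO_USE_FILE_LOCKING"

/* Returns 1 if locking should be attempted, 0 if it was switched off.
 * Locking stays on unless the variable is set to "0" or "FALSE". */
static int demo_locking_enabled(void)
{
    const char *value = getenv(DEMO_LOCK_ENV);
    if (value == NULL)
        return 1;
    return !(strcmp(value, "0") == 0 || strcmp(value, "FALSE") == 0);
}

/* Takes an exclusive, non-blocking advisory lock unless locking is
 * disabled at runtime. Returns 0 on success or when locking is
 * disabled, and -1 (with errno set by flock) on failure. */
static int demo_lock_file(int fd)
{
    if (!demo_locking_enabled())
        return 0;
    return flock(fd, LOCK_EX | LOCK_NB);
}
```

With this shape, a user on a file system without lock support could export DEMO_USE_FILE_LOCKING=0 before starting the program, while everyone else keeps the default behaviour.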

On 16.05.2016 18:03, Dana Robinson wrote:
If a suitable way to lock files cannot be determined at configure time, a no-op 
function is substituted. This is currently the case on Windows. File locking is 
just advisory, so this isn't a big deal.

As for disabling file locking, we talked about this and will try to get a 
configure-time mechanism for disabling file locking implemented for HDF5 1.10.1.
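
The configure-time substitution described above can be pictured roughly as follows. This is only a sketch under stated assumptions, not the actual HDF5 source: the macro DEMO_HAVE_FLOCK stands in for whatever a configure script would define after testing for flock(), and demo_lock is a hypothetical wrapper name.

```c
/* Stand-in for a configure result. A real build system would define
 * this macro after checking for flock(); here it is guessed from the
 * platform. The macro name is an assumption of this sketch. */
#ifndef _WIN32
#define DEMO_HAVE_FLOCK 1
#endif

#ifdef DEMO_HAVE_FLOCK
#include <sys/file.h>

/* Real advisory lock: exclusive and non-blocking.
 * Returns 0 on success, -1 on failure with errno set. */
static int demo_lock(int fd)
{
    return flock(fd, LOCK_EX | LOCK_NB);
}
#else
/* No suitable locking call was found at build time, so the lock
 * request becomes a no-op that always reports success. */
static int demo_lock(int fd)
{
    (void)fd;
    return 0;
}
#endif
```

Because the choice is fixed when the library is built, a binary compiled with the real lock path will still fail on a file system that rejects flock() at run time.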

Dana Robinson
Software Engineer
The HDF Group

-----Original Message-----
From: Hdf-forum [mailto:[email protected]] On Behalf Of 
Elena Pourmal
Sent: Sunday, May 15, 2016 9:01 PM
To: HDF Users Discussion List <[email protected]>
Subject: Re: [Hdf-forum] HDF5-1.10.0 and flock()

Hi Tim,


On May 13, 2016, at 10:55 AM, Timothy Brown <[email protected]> 
wrote:

Hi all,

I was wondering if HDF5 was going to keep the 1.8.x branch going? Or is it 
recommended to move to 1.10.x?
Yes, we will keep 1.8 going until we are satisfied with the quality of 1.10.x. 
Transition from 1.8 to 1.10 should be seamless for our users :-)
I'm asking because, as we all know, SWMR needs flock(), and you cannot 
disable SWMR at compile time (I don't need it in my day-to-day use).
Hmm... HDF5 implements file locking in 1.10.x to prevent unauthorized access to 
an HDF5 file (for example, when a file is opened for writing (non-SWMR) and 
another process tries to write to it). File locking is enabled if flock (or 
similar) is available on the system. Configure checks whether file locking is 
available, but I think we failed to check whether it is disabled. We will look 
into this situation.

Thank you for reporting!

Elena


On one of the clusters I run on we've got a Lustre file system. However, the 
admins have deemed that file locking is too expensive and have disabled it. 
Here's the mount information:

mds01ib@o2ib1:mds02ib@o2ib1:/scratch on /lustre/janus_scratch type
lustre (rw,noauto,_netdev)

So when I run a very simple test to create an HDF5 file with version 1.10.0 on 
this file system, it fails:

janus-compile1 ~$ ./test /lustre/janus_scratch/tibr1099/foo.h5
HDF5-DIAG: Error detected in HDF5 (1.10.0) thread 0:
  #000: H5F.c line 491 in H5Fcreate(): unable to create file
    major: File accessibilty
    minor: Unable to open file
  #001: H5Fint.c line 1168 in H5F_open(): unable to lock the file or initialize file structure
    major: File accessibilty
    minor: Unable to open file
  #002: H5FD.c line 1821 in H5FD_lock(): driver lock request failed
    major: Virtual File Layer
    minor: Can't update object
  #003: H5FDsec2.c line 939 in H5FD_sec2_lock(): unable to flock file, errno = 38, error message = 'Function not implemented'
    major: File accessibilty
    minor: Bad file ID accessed
Unable to open: /lustre/janus_scratch/tibr1099/foo.h5:           -1
1

When I strace the program I see it's because flock() failed:

open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3)                                = 0
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
flock(3, LOCK_EX|LOCK_NB)               = -1 ENOSYS (Function not implemented)
close(3)                                = 0
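
The failing call in this trace can be reproduced without HDF5. The following is a minimal sketch, assuming a POSIX system: it opens a file, attempts the same flock(fd, LOCK_EX|LOCK_NB) call seen above, and classifies the outcome. The function name probe_flock and the result enum are hypothetical, and treating ENOLCK and EOPNOTSUPP the same way as ENOSYS is an assumption of this example, not something the trace shows.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

typedef enum {
    PROBE_LOCKED,      /* flock() succeeded; locking works here       */
    PROBE_BUSY,        /* another process already holds the lock      */
    PROBE_UNSUPPORTED, /* the file system rejects flock() altogether  */
    PROBE_ERROR        /* the file could not be opened, or other error */
} probe_result;

/* Opens (creating if needed) the file at `path`, tries an exclusive,
 * non-blocking advisory lock on it, and reports what happened. */
static probe_result probe_flock(const char *path)
{
    probe_result result;
    int fd = open(path, O_RDWR | O_CREAT, 0666);

    if (fd < 0)
        return PROBE_ERROR;

    if (flock(fd, LOCK_EX | LOCK_NB) == 0) {
        flock(fd, LOCK_UN); /* release the lock again right away */
        result = PROBE_LOCKED;
    } else if (errno == EWOULDBLOCK) {
        result = PROBE_BUSY;
    } else if (errno == ENOSYS || errno == ENOLCK || errno == EOPNOTSUPP) {
        result = PROBE_UNSUPPORTED;
    } else {
        result = PROBE_ERROR;
    }

    close(fd);
    return result;
}
```

Called on a path under a mount like the one shown earlier, the ENOSYS seen in the trace would map to PROBE_UNSUPPORTED, which an application could use to decide whether to skip locking.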

Versus if I trace the program with version 1.8.15:

open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3)                                = 0
open("/lustre/janus_scratch/tibr1099/foo.h5", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
brk(0x235a000)                          = 0x235a000
mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f17252b8000

So my long-winded example leads to three questions.
1) Do other HPC sites enable flock() on Lustre? If so, is it only localflock, 
to avoid the burden of a cluster-wide flock?
2) Is there a path forward for sites that don't enable flock?
3) Is there an opposite of H5Fstart_swmr_write?

Thanks!
Tim
<test.f90>
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

--
___________________________________________________________________________
Dr. Werner Benger                Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019  Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809                        Fax.: +1 225 578-5362

