On Mon, 24 Nov 2025 17:01:20 +0100
Corinna Vinschen wrote:
> On Nov 24 23:31, Takashi Yano wrote:
> > On Mon, 24 Nov 2025 22:59:33 +0900
> > Takashi Yano wrote:
> > > On Mon, 24 Nov 2025 22:35:46 +0900
> > > Takashi Yano wrote:
> > > > On Mon, 24 Nov 2025 14:09:03 +0100
> > > > Corinna Vinschen wrote:
> > > > > Version 2 of the patch.  Rather than calling get_all_locks_list()
> > > > > from lf_setlock() *and* lf_clearlock(), call it right from
> > > > > fhandler_base::lock() to avoid calling the function twice.
> > > > > Also, move comment.
> > > > > [...]
> > > > Thanks. Your patch looks better than mine, however, this
> > > > does not fix the second error in the report; i.e.,
> > > > 
> > > > tmp_dir: /tmp/flockRsYGNU
> > > > done[7]
> > > > done[3]
> > > > lock error: 14 - Bad address: 3
> > > > assertion "lock_res == 0" failed: file "main.c", line 40, function: 
> > > > thread_func
> > > >                                                                         
> > > >        Aborted
> > > > 
> > > > while mine does. I'm not sure why...
> > > 
> > > fhandler_base::lock() seems necessary to be reentrant for the original
> > > test case, because, your v2 patch + making i_all_lf local variable
> > > solves the issue. Please see my v2 patch.
> > > 
> > > What do you think?
> > 
> > I think I understand now.
> > 
> > The object of inode_t * is inside cygheap, and it is not thread-local.
> > So if i_all_lf in inode_t * is changed by another thread, the pointer
> > i_all_lf is destroied.
> 
> Ooooh, ok.  I was already puzzeling about the potential situation where
> fhandler_base::lock() should be reentrant.
> 
> I think I get it now.  While the node is in use, it's supposed to be
> LOCK()ed.  But if we're waiting for a blocking POSIX lock, the node
> gets temporarily UNLOCK()ed.  So we have a valid w_get() in i_all_lf
> potentially overwritten by another w_get() which gets released and
> potentially reused in a different context, while i_all_lf still holds
> the pointer.
> 
> Your usage of a local variable is perfectly fine, considering the above.
> 
> However, I wonder if we don't have a general thinko here, in that we're
> using a TLS buffer or a temporary buffer at all?  Thinking it through...
> 
> [Careful, idle musing ahead...]
> 
> I_all_lf is a member of inode_t because the data stored therein is not
> thread local, but node local.  It's content doesn't depend on the thread
> it's used in, just on the state the file locks are in.
> 
> The content of i_all_lf is refreshed evey time the lock list is
> required, and it's always refreshed under node LOCK() conditions.
> 
> A pointer to a member of the lock list in i_all_lf is only generated and
> returned to the caller by getblock().  getblock() is used under node
> LOCK() conditions.
> 
> But, this pointer is used once during node UNLOCK(): in the lf_setlock()
> loop scanning for blocking locks, when we find a lock that blocks us.
> Line 1301 in origin/main:
> 
>     proc = OpenProcess (SYNCHRONIZE, FALSE, block->lf_wid);
>                                             ^^^^^^^^^^^^^
> 
> This can be avoided easily by introducing a local variable getting the
> windows id before calling node->UNLOCK().
> 
> So we could avoid using w_get() in every call to fcntl/flock/lockf if we
> always create the inode_t with an allocated i_all_lf which never gets
> changed or deallocated.  But that also requires to re-malloc() i_all_lf
> after execve().

Good point. I was wondering why i_all_lf need to be temporary. However,
I'm not sure why re-allocation is necessary for execve() case?

cygheap is initialized to initial state after execve(), isn't it?

> Alternatively we convert i_all_lf to an array member, so it's allocated
> on the cygheap automatically. But then every node takes almost 100K on
> the cygheap rather than just 64 bytes.  No, scratch that, that sounds bad.
> 
> Or we just stick to your V2 patch using local TLS buffers, given the
> temporary nature of the data.
> 
> What do you think is the better approach here, single malloc per node or
> TLS per invocation?

I prefer single malloc(), if the cost of malloc() is not much larger than
tmp_pathbuf::w_get().

I have created v3 patch based on your idea. Please have a look.

-- 
Takashi Yano <[email protected]>

Reply via email to