On Mon, 15 Jul 2024 08:54:11 GMT, Maurizio Cimadamore <mcimadam...@openjdk.org> 
wrote:

> > I have one problem with the benchmark: I think it is not measuring the 
> > whole setup in a way that is our workload: The basic problem is that we 
> > don't want to deoptimize threads which are not related to MemorySegments. 
> > So basically, the throughput of those threads should not be affected. For 
> > threads currently in a memory-segment read it should have a bit of effect, 
> > but it should recover fast.
> 
> IMHO there is a bit of confusion in this discussion. When we say that a 
> shared arena close operation is slow, we might mean one of two things:
> 
> 1. calling the `close()` method itself is slow (this is what the benchmark 
> effectively measures)
> 2. throughput of unrelated threads is affected (I think this is what Lucene 
> is seeing)
> 
> Addressing (2) seems more important than (1) (in the sense that, if you sign 
> up for a shared arena close, you know it's going to be deterministic, but 
> expensive, as the javadoc itself admits).

I fully agree; we mixed up two different concerns. The problem is that the 
benchmark measures both (1) and (2) in every thread. To see the effect of this 
change, the benchmark should have three kinds of threads: one that only closes 
arenas, a set that consumes scoped memory, and a third group doing totally 
unrelated work.
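
A rough sketch of that three-group layout, to make the idea concrete. This is not the benchmark from the PR: the class name `ThreeGroupSketch`, the counters, and the fixed-duration loop are all hypothetical, and a real measurement would use JMH with proper warmup. It assumes JDK 22 or later, where `java.lang.foreign` is final.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: one closer, one segment consumer, one unrelated thread.
final class ThreeGroupSketch {
    static final LongAdder closes = new LongAdder();
    static final LongAdder segmentReads = new LongAdder();
    static final LongAdder unrelatedOps = new LongAdder();
    static volatile long sink; // keeps the JIT from discarding the read loops

    static void run(long millis) {
        AtomicBoolean stop = new AtomicBoolean();
        try (Arena longLived = Arena.ofShared()) {
            // 1024 longs of off-heap memory that stays open for the whole run.
            MemorySegment data = longLived.allocate(8 * 1024, 8);
            long[] heap = new long[1024];

            // Group 1: does nothing but open and close shared arenas.
            Thread closer = new Thread(() -> {
                while (!stop.get()) {
                    Arena arena = Arena.ofShared();
                    arena.allocate(64);
                    arena.close();
                    closes.increment();
                }
            });
            // Group 2: reads scoped memory from an arena that is never closed mid-run.
            Thread consumer = new Thread(() -> {
                long sum = 0, n = 0;
                while (!stop.get()) {
                    sum += data.getAtIndex(ValueLayout.JAVA_LONG, n & 1023);
                    n++;
                }
                sink = sum;
                segmentReads.add(n);
            });
            // Group 3: plain heap work that never touches a MemorySegment.
            Thread unrelated = new Thread(() -> {
                long sum = 0, n = 0;
                while (!stop.get()) {
                    sum += heap[(int) (n & 1023)];
                    n++;
                }
                sink = sum;
                unrelatedOps.add(n);
            });

            List<Thread> threads = List.of(closer, consumer, unrelated);
            threads.forEach(Thread::start);
            try {
                Thread.sleep(millis);
                stop.set(true);
                for (Thread t : threads) {
                    t.join(); // all readers finish before longLived is closed
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } finally {
                stop.set(true);
            }
        }
    }

    public static void main(String[] args) {
        run(1000);
        System.out.println("closes=" + closes.sum()
                + " segmentReads=" + segmentReads.sum()
                + " unrelatedOps=" + unrelatedOps.sum());
    }
}
```

Comparing `unrelatedOps` between a run with the closer thread and a run without it would show effect (2) in isolation, while `closes` alone reflects (1).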

> For this reason, I'm unsure about some of the "delaying tactics" I see 
> mentioned here: if we delay the underlying "free"/"unmap" operation, this is 
> only going to affect (1). You still need some global operation (e.g. 
> handshake) to make sure all threads agree on the segment state. Moving the 
> cost of the free/unmap from one place to another is not really going to do 
> much for (2).

This is indeed unrelated. It is just an idea I also thought of. In Apache 
Lucene we are mostly interested in closing the shared arena as soon as 
possible. We don't need the arena to be closed by the time the "close" call 
returns (we don't care), but we can't wait until GC closes the arena, possibly 
after hours or even days. The reason for the latter is that the Arena is a 
small, long-lived instance and GC has no reason to free it, as there is no 
memory pressure.

So basically for us it would be best to trigger the close and then do other 
stuff.

Of course we can do that in a separate thread (this is my idea for how to 
improve closes in Lucene). The only problem is that Lucene does not have its 
own thread pools, so it would be the caller's responsibility to close our 
indexes in a separate thread (and only a single one).
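
A minimal sketch of what such a caller-side helper could look like. The class `AsyncCloser` and its `closeAsync` method are hypothetical names, not Lucene or JDK API. It accepts any `AutoCloseable`, so a shared `Arena` (which implements `AutoCloseable`) can be passed in directly.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper: funnels all close() calls through one dedicated thread.
final class AsyncCloser implements AutoCloseable {
    // A single daemon thread, so expensive closes never run concurrently
    // and never block JVM shutdown.
    private final ExecutorService closer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "arena-closer");
        t.setDaemon(true);
        return t;
    });

    // Triggers the close and returns immediately; the future completes once
    // close() has actually run, or completes exceptionally if close() throws.
    CompletableFuture<Void> closeAsync(AutoCloseable resource) {
        return CompletableFuture.runAsync(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                throw new CompletionException(e);
            }
        }, closer);
    }

    // Stops accepting new work; closes that were already submitted still run.
    @Override
    public void close() {
        closer.shutdown();
    }
}
```

The caller would invoke something like `asyncCloser.closeAsync(arena)` and carry on with other work, ignoring the returned future unless it wants to know when the close has finished.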

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20158#issuecomment-2228018619
