On Mon, 15 Jul 2024 16:30:11 GMT, Maurizio Cimadamore
wrote:
> * there is some issue involving segment access with an `int` induction
> variable, which we should investigate separately
This issue is tracked here:
https://bugs.openjdk.org/browse/JDK-8336759
-
PR Comment: https://git.o
On Wed, 17 Jul 2024 15:19:18 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
On Mon, 15 Jul 2024 12:59:27 GMT, Maurizio Cimadamore
wrote:
> Effectively, once all the issues surrounding reachability fences are
> addressed, we should be able to achieve numbers similar to above even in the
> case of shared close.
Is there an issue where I can follow this? [ EDIT: o
On Tue, 16 Jul 2024 18:09:20 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Tue, 16 Jul 2024 15:12:15 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
On Tue, 16 Jul 2024 15:00:04 GMT, Doug Simon wrote:
>> Jorn Vernee has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> JVMCI support
>
> src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/HotSpotResolvedJavaMethod.java
> line 62:
>
>>
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
On Tue, 16 Jul 2024 14:46:13 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
On Mon, 15 Jul 2024 17:00:24 GMT, Rémi Forax wrote:
> Even if the int vs long issue is fixed for this case, I think we should
> recommend calling `withInvokeExactBehavior()` after creating any VarHandle so
> all the auto-conversions are treated as runtime errors.
>
> This is what I do with my
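The recommendation above can be illustrated with a minimal sketch. It uses an array element VarHandle over `long[]` rather than a memory segment handle, purely to keep the example independent of the FFM API version; the class and method names are hypothetical and not from the thread. `VarHandle.withInvokeExactBehavior()` is available since JDK 16.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;

// Hypothetical demo class, not part of the PR under discussion.
final class ExactVarHandleDemo {
    // Element access on a long[]; the coordinates are (long[], int).
    static final VarHandle LENIENT = MethodHandles.arrayElementVarHandle(long[].class);
    // Same handle, but call sites must match the access mode type exactly.
    static final VarHandle EXACT = LENIENT.withInvokeExactBehavior();

    /** Stores an int through the lenient handle and returns the widened element. */
    static long lenientWidens() {
        long[] data = new long[1];
        LENIENT.set(data, 0, 42); // int 42 is silently converted to long
        return data[0];
    }

    /** Returns true if the exact handle rejects an int where a long is expected. */
    static boolean exactRejectsIntValue() {
        long[] data = new long[1];
        try {
            EXACT.set(data, 0, 42); // call site type is (long[], int, int)void
            return false;
        } catch (WrongMethodTypeException e) {
            return true;            // the implicit conversion became a runtime error
        }
    }

    /** Returns the element stored through the exact handle with a matching long value. */
    static long exactAcceptsLongValue() {
        long[] data = new long[1];
        EXACT.set(data, 0, 42L);    // call site type is (long[], int, long)void
        return data[0];
    }

    public static void main(String[] args) {
        System.out.println("lenient stored: " + lenientWidens());
        System.out.println("exact rejects int: " + exactRejectsIntValue());
        System.out.println("exact stored: " + exactAcceptsLongValue());
    }
}
```

With the exact variant, a mismatched argument type fails fast instead of going through an implicit conversion, which is the behaviour the comment above asks for.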
On Mon, 15 Jul 2024 11:33:30 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Mon, 15 Jul 2024 16:40:06 GMT, Maurizio Cimadamore
wrote:
> > So +1 to merge this and hopefully backport it at least to 21?
>
> Backport to 21 is difficult, given the handshake code there is different
> (and, FFM is preview there). But, might be more possible for 22. I have
> notified Rola
On Mon, 15 Jul 2024 16:35:26 GMT, Uwe Schindler wrote:
> So +1 to merge this and hopefully backport it at least to 21?
Backport to 21 is difficult, given the handshake code there is different (and,
FFM is preview there). But, might be more possible for 22. I have notified
Roland re. the `int`
On Mon, 15 Jul 2024 16:30:11 GMT, Maurizio Cimadamore
wrote:
>>> > > Even with `arrayElementVarHandle` it's about the same
>>> >
>>> >
>>> > This is very odd, and I don't have a good explanation as to why that is
>>> > the case. What does the baseline (confined arena) look like for
>>> > `ar
On Mon, 15 Jul 2024 14:02:27 GMT, Maurizio Cimadamore
wrote:
> So, that means that `arrayElementVarHandle` is ~4x faster than memory
> segment? Isn't that a bit odd?
I did some more analysis of the benchmark. I first eliminated the closing
thread, and started with two simple benchmarks:
@Ben
On Mon, 15 Jul 2024 15:18:20 GMT, Jorn Vernee wrote:
> > > This is what I was thinking of as well. close() on a shared arena can be
> > > called by any thread, so it would be possible to have an executor service
> > > with 1-n threads that is dedicated to closing memory.
> >
> >
> > This dela
On Mon, 15 Jul 2024 11:29:49 GMT, Rémi Forax wrote:
> > This is what I was thinking of as well. close() on a shared arena can be
> > called by any thread, so it would be possible to have an executor service
> > with 1-n threads that is dedicated to closing memory.
>
> This delays both the clos
On Mon, 15 Jul 2024 11:33:30 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Mon, 15 Jul 2024 13:49:57 GMT, Jorn Vernee wrote:
> > > Even with `arrayElementVarHandle` it's about the same
> >
> >
> > This is very odd, and I don't have a good explanation as to why that is the
> > case. What does the baseline (confined arena) look like for
> > `arrayElementVarHandle`
On Mon, 15 Jul 2024 13:09:21 GMT, Maurizio Cimadamore
wrote:
> > Even with `arrayElementVarHandle` it's about the same
>
> This is very odd, and I don't have a good explanation as to why that is the
> case. What does the baseline (confined arena) look like for
> `arrayElementVarHandle` ?
Pre
On Mon, 15 Jul 2024 12:47:30 GMT, Jorn Vernee wrote:
> Even with `arrayElementVarHandle` it's about the same
This is very odd, and I don't have a good explanation as to why that is the
case. What does the baseline (confined arena) look like for
`arrayElementVarHandle` ?
-
PR Comm
On Mon, 15 Jul 2024 12:34:37 GMT, Jorn Vernee wrote:
> This is the baseline if I change `closing` to use a confined arena:
>
> ```
> Benchmark                    Mode  Cnt  Score    Error  Units
> ConcurrentClose.sharedClose avgt 10 8.089 ± 0.006
On Mon, 15 Jul 2024 12:14:52 GMT, Maurizio Cimadamore
wrote:
> Ah! I had `arrayElementVarHandle` in mind - maybe you can try that?
Even with `arrayElementVarHandle` it's about the same
-
PR Comment: https://git.openjdk.org/jdk/pull/20158#issuecomment-2228425705
On Mon, 15 Jul 2024 11:33:30 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Mon, 15 Jul 2024 12:13:23 GMT, Maurizio Cimadamore
wrote:
> > I also tried using `MethodHandles::arrayElementGetter` for the access, but
> > the numbers I got were pretty much the same:
>
> This is quite strange, as the code involved should be quite similar to those
> with memory segments
On Mon, 15 Jul 2024 12:10:02 GMT, Maurizio Cimadamore
wrote:
> I also tried using `MethodHandles::arrayElementGetter` for the access, but
> the numbers I got were pretty much the same:
This is quite strange, as the code involved should be quite similar to those
with memory segments (e.g. you
On Mon, 15 Jul 2024 12:00:31 GMT, Maurizio Cimadamore
wrote:
> When I remove the `has_scoped_access()` check before the deopt, I expect the
> `otherAccess` thread to be affected, but the effect isn't nearly as big as
> with the FFM thread. I think this is likely due to the `otherAccess`
> ben
On Mon, 15 Jul 2024 11:47:43 GMT, Jorn Vernee wrote:
> I've updated the benchmark to run with 3 separate threads: 1 thread that is
> just creating and closing shared arenas in a loop, 1 that is accessing memory
> using the FFM API, and 1 that is accessing a `byte[]`.
>
> Current:
>
> ```
> Ben
On Mon, 15 Jul 2024 11:33:30 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Mon, 15 Jul 2024 10:50:34 GMT, Jorn Vernee wrote:
> This is what I was thinking of as well. close() on a shared arena can be
> called by any thread, so it would be possible to have an executor service
> with 1-n threads that is dedicated to closing memory.
This delays both the closing of th
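The idea quoted above, handing `close()` to a small dedicated pool, might look roughly like the sketch below. It is only an illustration: `DeferredCloser` and its methods are hypothetical names, and the demo closes a stand-in `AutoCloseable` rather than a real shared `Arena` so that it does not depend on a particular JDK release. Since `Arena` is an `AutoCloseable`, a shared arena could be passed to `closeAsync` in the same way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of the PR: runs close() calls on dedicated threads.
final class DeferredCloser implements AutoCloseable {
    private final ExecutorService closerPool;

    DeferredCloser(int threads) {
        // Daemon threads so an idle closer pool never keeps the JVM alive.
        closerPool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "arena-closer");
            t.setDaemon(true);
            return t;
        });
    }

    /** Schedules resource.close() on the closer pool; the future completes once it has run. */
    CompletableFuture<Void> closeAsync(AutoCloseable resource) {
        return CompletableFuture.runAsync(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                throw new IllegalStateException("close failed", e);
            }
        }, closerPool);
    }

    /** Stops accepting new work; close tasks already submitted still run. */
    @Override
    public void close() {
        closerPool.shutdown();
    }

    /** Closes a stand-in resource through the pool and reports whether close() ran. */
    static boolean demo() {
        AtomicBoolean closed = new AtomicBoolean();
        AutoCloseable standIn = () -> closed.set(true); // stands in for a shared Arena
        try (DeferredCloser closer = new DeferredCloser(1)) {
            closer.closeAsync(standIn).join();
        }
        return closed.get();
    }

    public static void main(String[] args) {
        System.out.println("closed on closer thread: " + demo());
    }
}
```

The caller returns immediately and the cost of the close lands on the closer thread, at the price of the resource being released later than it would be with a direct `close()` call.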
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
On Mon, 15 Jul 2024 09:02:29 GMT, Uwe Schindler wrote:
> Of course we can do that in a separate thread (this is my idea how to improve
> the closes in lucene).
This is what I was thinking of as well. `close()` on a shared arena can be
called by any thread, so it would be possible to have an ex
On Mon, 15 Jul 2024 08:41:38 GMT, Doug Simon wrote:
>> Jorn Vernee has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> track has_scoped_access for compiled methods
>
> src/hotspot/share/jvmci/jvmciRuntime.cpp line 2186:
>
>> 2184: n
On Mon, 15 Jul 2024 09:17:31 GMT, Maurizio Cimadamore
wrote:
>>> One only closing arenas, another set that consumes scoped memory and a
>>> third group doing totally unrelated stuff.
>>
>> Exactly. My general feeling is that the cost of handshaking a thread
>> dominates everything else, so do
On Mon, 15 Jul 2024 09:11:53 GMT, Maurizio Cimadamore
wrote:
> avoiding unnecessary deoptimization (as in this PR) is not going to help much,
What would definitively help is to somehow reduce the number of threads to
handshake when calling close - e.g. have an arena that is shared but only to
On Mon, 15 Jul 2024 09:02:29 GMT, Uwe Schindler wrote:
> One only closing arenas, another set that consumes scoped memory and a third
> group doing totally unrelated stuff.
Exactly. My general feeling is that the cost of handshaking a thread dominates
everything else, so doing improvements aro
On Mon, 15 Jul 2024 08:54:11 GMT, Maurizio Cimadamore
wrote:
> > I have one problem with the benchmark: I think it is not measuring the
> > whole setup in a way that is our workload: The basic problem is that we
> > don't want to deoptimize threads which are not related to MemorySegments.
> >
On Mon, 15 Jul 2024 08:57:08 GMT, Maurizio Cimadamore
wrote:
>> Jorn Vernee has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> track has_scoped_access for compiled methods
>
> test/micro/org/openjdk/bench/java/lang/foreign/ConcurrentClose.
On Fri, 12 Jul 2024 20:59:26 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Sun, 14 Jul 2024 11:01:58 GMT, Uwe Schindler wrote:
> I have one problem with the benchmark: I think it is not measuring the whole
> setup in a way that is our workload: The basic problem is that we don't want
> to deoptimize threads which are not related to MemorySegments. So basically,
>
On Mon, 15 Jul 2024 08:41:01 GMT, Alan Bateman wrote:
>> This is the whole magic around the shared arena. It is not public API and
>> internal to Hotspot/VM:
>> -
>> https://github.com/openjdk/jdk/blob/a96de6d8d273d75a6500e10ed06faab9955f893b/src/java.base/share/classes/jdk/internal/misc/X-Scop
On Mon, 15 Jul 2024 08:38:59 GMT, Uwe Schindler wrote:
>> src/hotspot/share/prims/scopedMemoryAccess.cpp line 179:
>>
>>> 177: //
>>> 178: // The safepoint at which we're stopped may be in between the
>>> liveness check
>>> 179: // and actual memory access, but is itself
On Mon, 15 Jul 2024 08:28:16 GMT, Doug Simon wrote:
>> Jorn Vernee has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> track has_scoped_access for compiled methods
>
> src/hotspot/share/prims/scopedMemoryAccess.cpp line 179:
>
>> 177:
On Fri, 12 Jul 2024 20:59:26 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Sat, 13 Jul 2024 16:43:16 GMT, Rémi Forax wrote:
> Knowing that all the segments are freed during close() is something you may
> want. But having the execution time of close() be linear with the number of
> threads is also problematic. Maybe, it means that we need another kind of
> Arena th
On Fri, 12 Jul 2024 20:59:26 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Sat, 13 Jul 2024 15:28:57 GMT, Erik Österlund wrote:
> @dougxc might want to have a look at Graal support for this one.
Yes, I conservatively implemented `has_scoped_access()` for Graal (see
`jvmciRuntime.cpp` changes). It won't regress anything, but there's still an
opportunity for improve
On Fri, 12 Jul 2024 20:59:26 GMT, Jorn Vernee wrote:
>> This PR limits the number of cases in which we deoptimize frames when
>> closing a shared Arena. The initial intent of this was to improve the
>> performance of shared arena closure in cases where a lot of threads are
>> accessing and clo
On Fri, 12 Jul 2024 13:57:23 GMT, Jorn Vernee wrote:
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing
> This PR limits the number of cases in which we deoptimize frames when closing
> a shared Arena. The initial intent of this was to improve the performance of
> shared arena closure in cases where a lot of threads are accessing and
> closing shared arenas at the same time (see attached benchmark
This PR limits the number of cases in which we deoptimize frames when closing a
shared Arena. The initial intent of this was to improve the performance of
shared arena closure in cases where a lot of threads are accessing and closing
shared arenas at the same time (see attached benchmark), but u