Hi Tibor,

thanks for this very elaborate answer. I always appreciate your
feedback, but to me it misses the point a bit.

> may not necessarily have to do with concurrent access.
But it does in this special case. Please see the issue and the linked
explanations.

> The solution with ThreadLocal would eat too much memory.
Is that so? Are you sure about this? How much is "too much"?
Are there any predefined profiling tests I can run?

I mean: yes, it is a workaround and immutable core classes that are
_designed_ for concurrent access would be much better,
but who is going to do such a massive refactoring (without breaking
Maven extensions that are today mutating MavenProject etc.)?

TBH, this is one of the, IMHO, critical bugs that should have been fixed
before Maven 4.

Cheers,
Falko

On 21.01.2021 at 02:13, Tibor Digana wrote:
I commented on one issue regarding the NULL JAR file in Artifact a few days
ago.
The fact that some data is "missing" from large object structures in a
multi-threaded environment may not necessarily have to do with
concurrent access.
There may not be any writes to MavenProject or MavenSession causing the
"missed" data, and the answer to why this happens is the Java Memory
Model (JMM).

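Objects that are neither thread-safe nor immutable can appear to lose
references very easily when they are shared between threads!
This has everything to do with the JMM: without a happens-before
relationship between the writing and the reading thread, visibility of
the written data is not guaranteed.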

Suppose that thread T1 creates an ArrayList and adds elements to this
collection.
artifacts = new ArrayList();
artifacts.add(new DefaultArtifact(...));

Suppose thread T2 reads the artifacts from the collection right after
"artifacts.add()".
Artifact a = artifacts.get(0);

In practice the following can happen:
artifacts.size() returns 1
but artifacts.get(0) returns null

Let's explain why this happens.
The implementation of ArrayList is not native. It is a pure Java
implementation which has two variables inside:
+ count:int
+ array:Object[]
These two variables are always updated together, so the update should
form a critical section, but ArrayList does not guard them in any way.
Technically, things are complicated at the CPU level, more complicated
than happens-before theorems.
T1 may hold pointers and data in CPU registers or in the CPU cache. No
thread has direct access to the stack of another thread, and a thread
does not necessarily operate directly on main memory.
The CPU uses memory barriers (assembler instructions) and a cache to
keep RAM and the caches coherent.
In Java, these instructions are emitted for the keywords final, volatile
and synchronized.
T2 may not see all elements of the ArrayList because the implementation
of ArrayList has no safety mechanisms that would guarantee it.
Thus T2 may see the value in the Java variable "count" *but it may not
see the values in* "array", or vice versa.
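
To make the visibility argument above concrete, here is a minimal
sketch of safe publication. It is not Maven code: ArtifactHolder and
SafePublicationSketch are hypothetical names, and plain strings stand in
for Artifact instances. It assumes the writer can build the list
completely before handing it over; the volatile field is what makes the
finished list visible to the reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder (not a Maven class) that publishes a list safely.
class ArtifactHolder {
    // A volatile write happens-before every later volatile read of the field,
    // so a reader that sees the new reference also sees the list contents.
    private volatile List<String> artifacts = Collections.emptyList();

    // T1: build the list completely, then publish an unmodifiable copy.
    void publish(List<String> newArtifacts) {
        artifacts = Collections.unmodifiableList(new ArrayList<>(newArtifacts));
    }

    // T2: returns the currently published list, never a half-built one.
    List<String> snapshot() {
        return artifacts;
    }
}

class SafePublicationSketch {
    // Starts a reader and a writer thread and returns what the reader saw.
    static List<String> readAfterPublish() {
        ArtifactHolder holder = new ArtifactHolder();
        AtomicReference<List<String>> seen = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            List<String> current = holder.snapshot();
            while (current.isEmpty()) { // wait until T1 has published
                Thread.yield();
                current = holder.snapshot();
            }
            seen.set(current);
        });
        Thread writer = new Thread(
                () -> holder.publish(Arrays.asList("a.jar", "b.jar")));

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(readAfterPublish());
    }
}
```

If the volatile modifier were dropped, nothing in the JMM would force
the reader to ever observe the published list, which is the kind of
effect described above.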

The results are NPEs, missing JAR artifacts, or issues with Maven
Resolver, as we can see in https://github.com/apache/maven/pull/310

The solution with ThreadLocal would eat too much memory.
Reimplementing the POJO classes in Maven and making them thread safe would
solve many issues in the Core and Resolver.
Considering my examples with ArrayList, the thread safety should continue
deeper with the implementation of DefaultArtifact, etc.
In my experience, it's worth using the collections from the package
"java.util.concurrent".
For instance, I use ConcurrentLinkedDeque for simple iterators with small
amounts of elements. Alternatively use CopyOnWriteArrayList (COWAL) for
large data and reordering of elements by adding or removing them
somewhere inside.
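
As a rough illustration of the two collections mentioned above, the
following sketch lets two threads add to a CopyOnWriteArrayList and a
ConcurrentLinkedDeque without any external locking. The class name, the
method name and the ".jar" strings are made up for this example and are
not part of Maven.

```java
import java.util.Deque;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.CopyOnWriteArrayList;

class ConcurrentCollectionsSketch {

    // Two threads add to both collections; returns how many non-null
    // elements the list holds once both writers have finished.
    static int addFromTwoThreads(int perThread) {
        List<String> artifacts = new CopyOnWriteArrayList<>();
        Deque<String> pending = new ConcurrentLinkedDeque<>();

        Runnable writer = () -> {
            for (int i = 0; i < perThread; i++) {
                String name = Thread.currentThread().getName() + "-" + i + ".jar";
                artifacts.add(name);   // copies the backing array on every write
                pending.addLast(name); // lock-free append
            }
        };
        Thread t1 = new Thread(writer, "t1");
        Thread t2 = new Thread(writer, "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        if (pending.size() != artifacts.size()) {
            throw new IllegalStateException("collections disagree");
        }
        int nonNull = 0;
        for (String name : artifacts) { // iterates over a stable snapshot
            if (name != null) {
                nonNull++;
            }
        }
        return nonNull;
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(100)); // prints 200
    }
}
```

Note that CopyOnWriteArrayList copies its backing array on every
mutation, so whether it suits a given list depends on how often that
list is written compared to how often it is read.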

Cheers
Tibor17



On Sat, Jan 16, 2021 at 10:21 PM Dan Tran <dant...@gmail.com> wrote:

We are facing the same issue at work (300+ modules): the classpath is
randomly empty.

I'd love to see a resolution and will help to test it.

Thanks

-D

On Fri, Jan 15, 2021 at 1:51 PM Falko Modler <f.mod...@gmx.net> wrote:

Hi everyone,

I'd like to raise awareness for the MavenProject concurrency problem
that is causing MNG-6843 "Parallel build fails due to missing JAR
artifacts in compilePath" [1] and probably others [2] [3] [4].

Almost a month ago, I created a ThreadLocal-based fix for this [5]
(after another, older cloning-based approach had raised some concerns
from Robert Scholte [6]).

Michael Osipov was the only one so far having a look (thanks!) and he
suggested that more Maven team members should review this.

So, before I take a stab at the not so trivial integration test that
Michael proposed [7], I'd like to get approval for the general
approach (or a rejection in case someone has a better idea).

Thanks for your attention and feedback!

Cheers,

Falko


[1] https://issues.apache.org/jira/browse/MNG-6843

[2] https://issues.apache.org/jira/browse/MNG-4996

[3] https://issues.apache.org/jira/browse/MNG-5750

[4] https://issues.apache.org/jira/browse/MNG-5960

[5] https://github.com/apache/maven/pull/413

[6] https://github.com/apache/maven/pull/310#issuecomment-571317501

[7] https://github.com/apache/maven/pull/413#issuecomment-754661032


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@maven.apache.org
For additional commands, e-mail: dev-h...@maven.apache.org



