shichaoyuan opened a new issue, #13635: URL: https://github.com/apache/skywalking/issues/13635
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar issues.

### Apache SkyWalking Component

Java Agent (apache/skywalking-java)

### What happened

We encountered a deadlock during the application startup/preheat phase. The deadlock occurs between the `main` thread and a Netty worker thread (`Netty4DefaultChannel-Worker-thread-4`).

Based on the jstack dump, the conflict arises because:

1. **Thread "main"**: Holds the lock on `sun.nio.cs.StandardCharsets` (via `FastCharsetProvider`). While holding this lock, the SkyWalking Agent (`AgentBuilder$Default$ExecutingTransformer`) attempts to transform a class, which triggers `ClassFileLocator.locate` -> `URLClassLoader.getResourceAsStream`, eventually trying to acquire the lock on `sun.net.www.protocol.jar.JarFileFactory`.
2. **Thread "Netty4DefaultChannel-Worker-thread-4"**: Holds the lock on `sun.net.www.protocol.jar.JarFileFactory` (via `JarFileFactory.get`). While holding this lock, it attempts to perform String decoding (`StringCoding.decode`), which tries to acquire the lock on `sun.nio.cs.StandardCharsets`.

It seems the SkyWalking agent introduces an operation (class resource lookup) inside the `StandardCharsets` lock scope on the main thread, leading to the AB-BA deadlock.

**Key Stack Trace (from jstack):**

**Thread 1 (Main) - Waiting for JarFileFactory, Holding StandardCharsets:**

```text
"main":
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:80)
    - waiting to lock <0x00000006e050be28> (a sun.net.www.protocol.jar.JarFileFactory)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    ...
    at org.apache.skywalking.apm.dependencies.net.bytebuddy.dynamic.ClassFileLocator$ForClassLoader.locate(ClassFileLocator.java:453)
    ...
    at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12041)
    ...
    at sun.nio.cs.FastCharsetProvider.charsetForName(FastCharsetProvider.java:133)
    - locked <0x00000006e1cb0708> (a sun.nio.cs.StandardCharsets)
```

**Thread 2 (Netty) - Waiting for StandardCharsets, Holding JarFileFactory:**

```text
"Netty4DefaultChannel-Worker-thread-4":
    at sun.nio.cs.FastCharsetProvider.charsetForName(FastCharsetProvider.java:132)
    - waiting to lock <0x00000006e1cb0708> (a sun.nio.cs.StandardCharsets)
    ...
    at java.lang.StringCoding.decode(StringCoding.java:185)
    ...
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:81)
    - locked <0x00000006e050be28> (a sun.net.www.protocol.jar.JarFileFactory)
```

### What you expected to happen

The agent should handle class transformation without causing a deadlock, possibly by avoiding resource lookups that require `JarFileFactory` locking while inside a Charset provider initialization, or the application should be able to initialize safely.

### How to reproduce

It happens during application startup, when class loading and resource access occur with high concurrency.

### Anything else

* **Java Version:** 1.8.0
* **OS:** Linux

The stack trace shows the transformation is triggered inside `FastCharsetProvider`. It might be safer to exclude `sun.nio.cs.*` or related JDK internal packages from being instrumented or matched by the agent to break the transformation chain.

### Are you willing to submit a pull request to fix on your own?

- [x] Yes I am willing to submit a pull request on my own!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
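
For reference, the AB-BA lock ordering described under "What happened" can be sketched in isolation. This is only an illustration of the pattern, not a reproduction of the agent issue: the two `Object` monitors are hypothetical stand-ins for `sun.nio.cs.StandardCharsets` and `sun.net.www.protocol.jar.JarFileFactory`, the class and thread names are made up, and no SkyWalking or Byte Buddy code is involved. It uses the standard `ThreadMXBean.findMonitorDeadlockedThreads()` call to confirm the cycle, which is the same information jstack reports.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class AbBaDeadlockSketch {
    // Hypothetical stand-ins for the two JDK monitors named in the report.
    private static final Object CHARSETS_LOCK = new Object();    // plays StandardCharsets
    private static final Object JAR_FACTORY_LOCK = new Object(); // plays JarFileFactory

    // Builds a daemon thread that takes `first`, waits until the peer thread
    // also holds its own first lock, and then blocks trying to take `second`.
    static Thread lockInOrder(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Never reached: the peer thread owns `second`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the threads stay stuck
        return t;
    }

    // Starts two threads that acquire the locks in opposite order and polls
    // the JVM until it reports them as deadlocked (or roughly 5 seconds pass).
    static long[] provokeAndDetect() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(2);
        lockInOrder("main-like", CHARSETS_LOCK, JAR_FACTORY_LOCK, latch).start();
        lockInOrder("netty-like", JAR_FACTORY_LOCK, CHARSETS_LOCK, latch).start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = provokeAndDetect();
        System.out.println(ids == null
                ? "no deadlock detected"
                : ids.length + " threads deadlocked");
    }
}
```

In the real case the second acquisition on the `main` side is not written by the application at all: it is reached because the class transformation runs while the `StandardCharsets` monitor is held, which is why skipping instrumentation for that code path would remove one edge of the cycle.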
