shichaoyuan opened a new issue, #13635:
URL: https://github.com/apache/skywalking/issues/13635

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Apache SkyWalking Component
   
   Java Agent (apache/skywalking-java)
   
   ### What happened
   
   We encountered a deadlock during the application startup/preheat phase. The 
deadlock occurs between the `main` thread and a Netty worker thread 
(`Netty4DefaultChannel-Worker-thread-4`).
   
   Based on the jstack dump, the conflict arises because:
   1.  **Thread "main"**: Holds the lock on `sun.nio.cs.StandardCharsets` (via 
`FastCharsetProvider`). While holding this lock, the SkyWalking Agent 
(`AgentBuilder$Default$ExecutingTransformer`) attempts to transform a class, 
which triggers `ClassFileLocator.locate` -> 
`URLClassLoader.getResourceAsStream`, eventually trying to acquire the lock on 
`sun.net.www.protocol.jar.JarFileFactory`.
   2.  **Thread "Netty4DefaultChannel-Worker-thread-4"**: Holds the lock on 
`sun.net.www.protocol.jar.JarFileFactory` (via `JarFileFactory.get`). While 
holding this lock, it attempts to perform String decoding 
(`StringCoding.decode`), which tries to acquire the lock on 
`sun.nio.cs.StandardCharsets`.
   
   It seems the SkyWalking agent introduces an operation (class resource 
lookup) inside the `StandardCharsets` lock scope on the main thread, leading to 
the AB-BA deadlock.
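
   The lock ordering described above can be reproduced in isolation. The following is a minimal sketch, not the agent's code: the two plain `Object` locks are hypothetical stand-ins for the `StandardCharsets` and `JarFileFactory` monitors, and the thread names are made up. Each thread takes its first lock, waits until the other thread holds its own first lock, and then requests the second one, which is the AB-BA pattern seen in the jstack dump. The threads are daemons so the JVM can still exit.

   ```java
   import java.lang.management.ManagementFactory;
   import java.lang.management.ThreadMXBean;
   import java.util.concurrent.CountDownLatch;

   public class AbBaDeadlockSketch {

       /**
        * Starts two daemon threads that acquire two monitors in opposite order,
        * then polls the JVM for monitor deadlocks and returns how many threads
        * it reports (0 if none is detected within about five seconds).
        */
       public static int countDeadlockedThreads() throws InterruptedException {
           // Hypothetical stand-ins for the two JDK monitors in the dump.
           Object charsetsLock = new Object();
           Object jarFactoryLock = new Object();
           CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

           startDaemon("main-like", charsetsLock, jarFactoryLock, bothHoldFirstLock);
           startDaemon("netty-like", jarFactoryLock, charsetsLock, bothHoldFirstLock);

           ThreadMXBean threads = ManagementFactory.getThreadMXBean();
           for (int i = 0; i < 100; i++) {
               long[] ids = threads.findMonitorDeadlockedThreads();
               if (ids != null) {
                   return ids.length;
               }
               Thread.sleep(50);
           }
           return 0;
       }

       private static void startDaemon(String name, Object first, Object second,
                                       CountDownLatch bothHoldFirstLock) {
           Thread t = new Thread(() -> {
               synchronized (first) {
                   // Make sure both threads hold their first lock before
                   // either one asks for its second lock.
                   bothHoldFirstLock.countDown();
                   try {
                       bothHoldFirstLock.await();
                   } catch (InterruptedException e) {
                       Thread.currentThread().interrupt();
                       return;
                   }
                   synchronized (second) {
                       // Never reached: the other thread holds this monitor.
                   }
               }
           }, name);
           t.setDaemon(true);
           t.start();
       }

       public static void main(String[] args) throws InterruptedException {
           System.out.println("deadlocked threads: " + countDeadlockedThreads());
       }
   }
   ```

   In the real application the same check (`ThreadMXBean.findMonitorDeadlockedThreads()`) reports the cycle that jstack prints at the end of its dump.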
   
   **Key Stack Trace (from jstack):**
   
   **Thread 1 (Main) - Waiting for JarFileFactory, Holding StandardCharsets:**
   ```text
   "main":
        at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:80)
        - waiting to lock <0x00000006e050be28> (a 
sun.net.www.protocol.jar.JarFileFactory)
        at 
sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
       ...
        at 
org.apache.skywalking.apm.dependencies.net.bytebuddy.dynamic.ClassFileLocator$ForClassLoader.locate(ClassFileLocator.java:453)
       ...
        at 
org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12041)
       ...
        at 
sun.nio.cs.FastCharsetProvider.charsetForName(FastCharsetProvider.java:133)
        - locked <0x00000006e1cb0708> (a sun.nio.cs.StandardCharsets)
   ```
   
   **Thread 2 (Netty) - Waiting for StandardCharsets, Holding JarFileFactory:**
   ```text
   "Netty4DefaultChannel-Worker-thread-4":
        at 
sun.nio.cs.FastCharsetProvider.charsetForName(FastCharsetProvider.java:132)
        - waiting to lock <0x00000006e1cb0708> (a sun.nio.cs.StandardCharsets)
       ...
        at java.lang.StringCoding.decode(StringCoding.java:185)
       ...
        at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:81)
        - locked <0x00000006e050be28> (a 
sun.net.www.protocol.jar.JarFileFactory)
   ```
   
   ### What you expected to happen
   
   The agent should transform classes without causing a deadlock, for example 
by avoiding resource lookups that need the `JarFileFactory` lock while a 
charset lookup still holds the `StandardCharsets` lock, so that the application 
can initialize safely.
   
   ### How to reproduce
   
   It happens during application startup, when many threads load classes and 
access resources concurrently.
   
   ### Anything else
   
   *   **Java Version:** 1.8.0
   *   **OS:** Linux
   
   The stack trace shows the transformation is triggered inside 
`FastCharsetProvider`. It might be safer to exclude `sun.nio.cs.*` or related 
JDK internal packages from being instrumented or matched by the agent to break 
the transformation chain.
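
   A rough sketch of that exclusion idea, using only the JDK's `java.lang.instrument` API rather than the agent's real code: `ExcludingTransformer` and its prefix list are hypothetical names, and the `sun/nio/cs/` prefix is an assumption taken from the suggestion above, since the dump elides which class was actually being transformed. The wrapper returns `null` (meaning "leave the class unchanged") before the wrapped transformer runs, so no class file lookup happens for excluded classes.

   ```java
   import java.lang.instrument.ClassFileTransformer;
   import java.lang.instrument.IllegalClassFormatException;
   import java.security.ProtectionDomain;

   public class ExcludingTransformer implements ClassFileTransformer {

       // Assumed exclusion list, in JVM internal form (slashes, not dots).
       private static final String[] EXCLUDED_PREFIXES = { "sun/nio/cs/" };

       private final ClassFileTransformer delegate;

       public ExcludingTransformer(ClassFileTransformer delegate) {
           this.delegate = delegate;
       }

       /** Returns true if the internal class name falls under an excluded package. */
       static boolean isExcluded(String internalName) {
           if (internalName == null) {
               // The JVM may pass null for some generated classes.
               return false;
           }
           for (String prefix : EXCLUDED_PREFIXES) {
               if (internalName.startsWith(prefix)) {
                   return true;
               }
           }
           return false;
       }

       @Override
       public byte[] transform(ClassLoader loader, String className,
                               Class<?> classBeingRedefined,
                               ProtectionDomain protectionDomain,
                               byte[] classfileBuffer)
               throws IllegalClassFormatException {
           if (isExcluded(className)) {
               // null tells the JVM that no transformation was performed.
               return null;
           }
           return delegate.transform(loader, className, classBeingRedefined,
                   protectionDomain, classfileBuffer);
       }
   }
   ```

   In Byte Buddy terms the closest equivalent would presumably be an `ignore(...)` matcher on the `AgentBuilder`; where the SkyWalking agent would need to apply it is not confirmed here.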
   
   ### Are you willing to submit a pull request to fix on your own?
   
   - [x] Yes I am willing to submit a pull request on my own!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


