[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545777#comment-14545777 ]

Aleksey Yeschenko commented on CASSANDRA-8812:
----------------------------------------------

I'm assuming that this was reopened by accident. Either way, since the patch made it into 2.1.5, if the issue is still real, a new ticket should be opened instead of reopening one.

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Benedict
Fix For: 2.1.5
Attachments: 8812.txt, crashtest.tgz

Under Windows (32 or 64 bit) with the 32-bit Oracle JDK, the JVM may crash due to EXCEPTION_ACCESS_VIOLATION. This happens inconsistently.

The attached test project can recreate the crash - sometimes it works successfully, sometimes there's a Java exception in the log, and sometimes the hotspot JVM crash shows up (regardless of whether the JUnit test results in success - you can ignore that). Run it a bunch of times to see the various outcomes. It also contains a sample hotspot error log.

Note that both when the Java exception is thrown and when the JVM crashes, the stack trace is almost the same - they both eventually occur when the PERIODIC-COMMIT-LOG-SYNCER thread calls CommitLogSegment.sync and accesses the buffer (MappedByteBuffer): if it happens to be in buffer.force(), then the Java exception is thrown, and if it's in one of the buffer.put() calls before it, then the JVM crashes. This possibly exposes a JVM bug as well in this case. So it basically looks like a race condition which results in the buffer sometimes being used after it is no longer valid.

I recreated this on a PC with Windows 7 64-bit running the 32-bit Oracle JDK, as well as on a modern.ie virtualbox image of Windows 7 32-bit running the JDK, and it happens both with JDK 7 and JDK 8. Also defining an explicit dependency on cassandra 2.1.2 (as opposed to the cassandra-unit dependency on 2.1.0) doesn't make a difference.

At some point in my testing I've also seen a Java-level exception on Linux, but I can't recreate it at the moment with this test project, so I can't guarantee it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
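The hazard in the issue description (a syncer thread touching a MappedByteBuffer after the segment is no longer valid) can be illustrated with a minimal sketch. This is not Cassandra's CommitLogSegment and not the attached patch: the class GuardedSegment, its closed flag, and the one-byte marker write are hypothetical. It also assumes that invalidation happens at close; standard Java has no public unmap call, so the sketch only closes the file and shows the ordering guard.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a commit log segment (illustrative only).
final class GuardedSegment {
    private final RandomAccessFile file;
    private final MappedByteBuffer buffer;
    private boolean closed; // guarded by "this"

    GuardedSegment(Path path, int size) throws IOException {
        file = new RandomAccessFile(path.toFile(), "rw");
        // Mapping READ_WRITE past the end of the file grows it to "size" bytes.
        buffer = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
    }

    // sync() and close() share one monitor, so the buffer is never touched
    // once close() has run. Returns false if the segment is already closed.
    synchronized boolean sync(byte marker) {
        if (closed) {
            return false;
        }
        buffer.put(0, marker); // the kind of access that crashed the JVM in the report
        buffer.force();        // the kind of access that threw IOException in the report
        return true;
    }

    synchronized void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        file.close(); // a real segment would also release the mapping here
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("segment", ".log");
        GuardedSegment segment = new GuardedSegment(path, 4096);
        System.out.println(segment.sync((byte) 1)); // true: segment is open
        segment.close();
        System.out.println(segment.sync((byte) 2)); // false: refused after close
    }
}
```

A single monitor is the simplest way to show the ordering requirement; a real commit log would likely need something cheaper on the write path.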
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538050#comment-14538050 ]

Ariel Weisberg commented on CASSANDRA-8812:
-------------------------------------------

Can you capture what has to be done as part of the kitchen sink in the [kitchen sink doc | https://docs.google.com/document/d/1kccPqxEAoYQpT0gXnp20MYQUDmjOrakAeQhf6vkqjGo/edit#heading=h.zd5nw0kl2ypi]
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538505#comment-14538505 ]

Benedict commented on CASSANDRA-8812:
-------------------------------------

Sure, I've updated the doc. It should be caught by one of the most basic concepts of the kitchen sink tests (i.e. stress workload with parallel schema changes), though, so I very much hope it's only needed as a corroborative double-check.
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535855#comment-14535855 ]

Ariel Weisberg commented on CASSANDRA-8812:
-------------------------------------------

[~benedict]
bq. I suspect this is the bug. It looks like if the amount of utilised CL space exceeds the limit, we can close without ensuring all writes pending have finished or been synced IFF we are force discarding the segments.

Is this a scenario we want to create as part of some other test? Also no regression test?
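The quoted hypothesis says a segment can be closed while writes are still pending or unsynced. One way to express the missing guarantee, as a sketch only, is to let writers share a read lock and make the discard path take the write lock, so that discard cannot proceed until in-flight writes have returned and a final sync has covered them. The class DrainingSegment, its counters, and the use of ReentrantReadWriteLock are assumptions made for illustration; this is not the approach taken in the attached 8812.txt patch, which is not shown in this thread.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical segment model; names and structure are illustrative only.
final class DrainingSegment {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final AtomicInteger written = new AtomicInteger();
    private int synced;     // written only under the write lock
    private boolean closed; // written only under the write lock

    // Writers share the read lock, so many can append concurrently.
    // Returns false once the segment has been discarded.
    boolean write() {
        lock.readLock().lock();
        try {
            if (closed) {
                return false;
            }
            written.incrementAndGet(); // stands in for a buffer.put(...)
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Acquiring the write lock blocks until every in-flight write() has
    // returned, so the final "sync" sees all completed writes.
    void discard() {
        lock.writeLock().lock();
        try {
            if (closed) {
                return;
            }
            synced = written.get(); // stands in for a final buffer.force()
            closed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int written() {
        return written.get();
    }

    int synced() {
        lock.readLock().lock();
        try {
            return synced;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DrainingSegment segment = new DrainingSegment();
        Thread[] writers = new Thread[4];
        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                // Stop as soon as the segment refuses a write.
                for (int n = 0; n < 10_000 && segment.write(); n++) { }
            });
            writers[i].start();
        }
        segment.discard(); // races with the writers on purpose
        for (Thread t : writers) {
            t.join();
        }
        // Every accepted write was covered by the final sync.
        System.out.println(segment.written() == segment.synced()); // true
    }
}
```

The same shape could serve as a starting point for the regression test asked about above, since the invariant (accepted writes equal synced writes) holds regardless of how the threads interleave.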
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326296#comment-14326296 ]

Joshua McKenzie commented on CASSANDRA-8812:
--------------------------------------------

Seems a reasonable hypothesis and code LGTM - [~amichai]: any chance you could try to reproduce the problem with a build using the attached patch?

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Benedict
Attachments: 8812.txt, crashtest.tgz
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326853#comment-14326853 ]

Amichai Rothman commented on CASSANDRA-8812:
--------------------------------------------

I ran a build from the 2.1.2 tag and recreated the exceptions and JVM crash, then applied the attached patch, rebuilt and ran it quite a few more times and was unable to reproduce them again, so the patch seems to fix the issue.

I still get occasional failures (which were there before the patch as well) due to what looks like CASSANDRA-8390, but that's a separate issue.

Thanks guys for the quick solution!

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Benedict
Attachments: 8812.txt, crashtest.tgz
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324775#comment-14324775 ]

Amichai Rothman commented on CASSANDRA-8812:
--------------------------------------------

Sure, I just ran it a few more times (with cassandra-all 2.1.2, if any of the line numbers changed) and got several of these:

{noformat}
2015-02-17 11:35:13,470 | PERIODIC-COMMIT-LOG-SYNCER | o.a.c.d.c.CommitLog | ERROR | CommitLog.java:367 | Failed to persist commits to disk. Commit disk failure policy is stop; terminating thread
org.apache.cassandra.io.FSWriteError: java.io.IOException: The handle is invalid
	at org.apache.cassandra.db.commitlog.CommitLogSegment.sync(CommitLogSegment.java:329) ~[cassandra-all-2.1.2.jar:2.1.2]
	at org.apache.cassandra.db.commitlog.CommitLog.sync(CommitLog.java:195) ~[cassandra-all-2.1.2.jar:2.1.2]
	at org.apache.cassandra.db.commitlog.AbstractCommitLogService$1.run(AbstractCommitLogService.java:81) ~[cassandra-all-2.1.2.jar:2.1.2]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_31]
Caused by: java.io.IOException: The handle is invalid
	at java.nio.MappedByteBuffer.force0(Native Method) ~[na:1.8.0_31]
	at java.nio.MappedByteBuffer.force(MappedByteBuffer.java:203) ~[na:1.8.0_31]
	at org.apache.cassandra.db.commitlog.CommitLogSegment.sync(CommitLogSegment.java:315) ~[cassandra-all-2.1.2.jar:2.1.2]
	... 3 common frames omitted
{noformat}

However, in the past I also got exceptions with an identical stack trace except for the IOException message, which was "Attempt to access invalid address" instead of "The handle is invalid".

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Joshua McKenzie
Attachments: crashtest.tgz
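Since the two failure variants in the comment above differ only in the message of the innermost IOException, a log-triage helper could compare runs by walking the cause chain to its root. The sketch below is a generic illustration: the class RootCause is hypothetical, and a plain RuntimeException stands in for Cassandra's FSWriteError wrapper.

```java
import java.io.IOException;

// Hypothetical helper for comparing failures by their innermost cause.
final class RootCause {
    // Follows getCause() until the chain ends (or points back at itself).
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // RuntimeException stands in for the FSWriteError wrapper in the trace.
        Throwable wrapped = new RuntimeException(
                "commit log sync failed", new IOException("The handle is invalid"));
        System.out.println(rootCause(wrapped).getMessage()); // The handle is invalid
    }
}
```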
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324051#comment-14324051 ]

Benedict commented on CASSANDRA-8812:
-------------------------------------

Do you have the Java exception thrown by buffer.force()?

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Assignee: Joshua McKenzie
Attachments: crashtest.tgz
[jira] [Commented] (CASSANDRA-8812) JVM Crashes on Windows x86
[ https://issues.apache.org/jira/browse/CASSANDRA-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323297#comment-14323297 ]

Amichai Rothman commented on CASSANDRA-8812:
--------------------------------------------

I don't know if it's related or not, but it's suspicious that the segment's sync method in some cases closes itself without it being removed from its associated segment manager...

JVM Crashes on Windows x86
--------------------------

Key: CASSANDRA-8812
URL: https://issues.apache.org/jira/browse/CASSANDRA-8812
Project: Cassandra
Issue Type: Bug
Environment: Windows 7 running x86(32-bit) Oracle JDK 1.8.0_u31
Reporter: Amichai Rothman
Attachments: crashtest.tgz