[jira] [Issue Comment Deleted] (CASSANDRA-12728) Handling partially written hint files
[ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hansey Chen updated CASSANDRA-12728:
    Comment: was deleted

(was: I was looking at this issue and could not understand one of the effects of this bug. Garvit Juniwal mentioned in [one|https://issues.apache.org/jira/browse/CASSANDRA-12728?focusedCommentId=15576548=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576548] of his comments that this bug will "put cassandra in a crash loop". Also Harikrishnan said in [a related issue|https://issues.apache.org/jira/browse/CASSANDRA-12844] that this bug crashed many nodes.

But I cannot figure out how an EOFE during hinted handoff can crash a cassandra node. Is it only crashing the hints dispatching thread? And how can it affect other nodes? Could anyone please explain a little bit more? Many thanks in advance.)

> Handling partially written hint files
> -------------------------------------
>
>                 Key: CASSANDRA-12728
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hints
>            Reporter: Sharvanath Pathak
>            Assignee: Garvit Juniwal
>            Priority: Major
>              Labels: lhf
>             Fix For: 3.0.14, 3.11.0, 4.0
>
>         Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.0.6.jar:3.0.6]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was
[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files
[ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121904#comment-16121904 ]

Hansey Chen commented on CASSANDRA-12728:
-----------------------------------------

I was looking at this issue and could not understand one of the effects of this bug. Garvit Juniwal mentioned in [one|https://issues.apache.org/jira/browse/CASSANDRA-12728?focusedCommentId=15576548=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576548] of his comments that this bug will "put cassandra in a crash loop". Also Harikrishnan said in [a related issue|https://issues.apache.org/jira/browse/CASSANDRA-12844] that this bug crashed many nodes.

But I cannot figure out how an EOFE during hinted handoff can crash a cassandra node. Is it only crashing the hints dispatching thread? And how can it affect other nodes? Could anyone please explain a little bit more? Many thanks in advance.

> Handling partially written hint files
> -------------------------------------
>
>                 Key: CASSANDRA-12728
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sharvanath Pathak
>            Assignee: Garvit Juniwal
>              Labels: lhf
>             Fix For: 3.0.14, 3.11.0, 4.0
>
>         Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.0.6.jar:3.0.6]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) ~[apache-cassandra-3.0.6.jar:3.0.6]
>     ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard reboot