[jira] [Issue Comment Deleted] (CASSANDRA-12728) Handling partially written hint files

2018-04-22 Thread Hansey Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hansey Chen updated CASSANDRA-12728:

Comment: was deleted

(was: I was looking at this issue and could not understand one of the effects 
of this bug.

Garvit Juniwal mentioned in 
[one|https://issues.apache.org/jira/browse/CASSANDRA-12728?focusedCommentId=15576548=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576548]
 of his comments that this bug will "put cassandra in a crash loop". Also 
Harikrishnan said in [a related 
issue|https://issues.apache.org/jira/browse/CASSANDRA-12844] that this bug 
crashed many nodes.

But I cannot figure out how an EOFE during hinted handoff can crash a cassandra 
node. Is it only crashing the hints dispatching thread? And how can it affect 
other nodes?

Could anyone please explain a little bit more? Many thanks in advance.)

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hints
>Reporter: Sharvanath Pathak
>Assignee: Garvit Juniwal
>Priority: Major
>  Labels: lhf
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 
> HintsDispatchExecutor.java:225 - Failed to dispatch hints file 
> d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was 

[jira] [Commented] (CASSANDRA-12728) Handling partially written hint files

2017-08-10 Thread Hansey Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121904#comment-16121904
 ] 

Hansey Chen commented on CASSANDRA-12728:
-

I was looking at this issue and could not understand one of the effects of this 
bug.

Garvit Juniwal mentioned in 
[one|https://issues.apache.org/jira/browse/CASSANDRA-12728?focusedCommentId=15576548=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15576548]
 of his comments that this bug will "put cassandra in a crash loop". Also 
Harikrishnan said in [a related 
issue|https://issues.apache.org/jira/browse/CASSANDRA-12844] that this bug 
crashed many nodes.

But I cannot figure out how an EOFE during hinted handoff can crash a cassandra 
node. Is it only crashing the hints dispatching thread? And how can it affect 
other nodes?

Could anyone please explain a little bit more? Many thanks in advance.

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sharvanath Pathak
>Assignee: Garvit Juniwal
>  Labels: lhf
> Fix For: 3.0.14, 3.11.0, 4.0
>
> Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 
> HintsDispatchExecutor.java:225 - Failed to dispatch hints file 
> d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard 
> reboot