[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2017-01-02 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15792585#comment-15792585
 ] 

Stefan Podkowinski commented on CASSANDRA-9625:
---

[~llambiel], I can't see anything in the thread dump you provided that would 
indicate a deadlock. In fact, {{metrics-graphite-reporter-thread-1}} is just 
waiting for the next tick to execute and pull the latest metrics from 
Cassandra. Are you sure this isn't a problem with your Graphite installation? 
Are there any error messages in Cassandra related to Graphite? Are there any 
connections from Cassandra to Graphite (you can check with netstat)?
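
For reference, here is a minimal sketch of how a Dropwizard Metrics Graphite 
reporter is typically wired up; the host, port, prefix and interval below are 
placeholder assumptions (not values from the attached metrics.yaml), but it shows 
the scheduled tick that {{metrics-graphite-reporter-thread-1}} sleeps on between 
reports:

{code}
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;

public class GraphiteReporterSketch
{
    public static void main(String[] args)
    {
        MetricRegistry registry = new MetricRegistry();

        // Placeholder endpoint; in Cassandra this comes from metrics.yaml
        Graphite graphite = new Graphite(new InetSocketAddress("graphite.example.org", 2003));

        GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
                                                    .prefixedWith("cassandra.node1")
                                                    .convertRatesTo(TimeUnit.SECONDS)
                                                    .convertDurationsTo(TimeUnit.MILLISECONDS)
                                                    .build(graphite);

        // The reporter thread wakes up once per interval, polls the registry and
        // pushes the current values to Graphite; between ticks it simply waits.
        reporter.start(60, TimeUnit.SECONDS);
    }
}
{code}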

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: Stefan Podkowinski
> Attachments: Screen Shot 2016-04-13 at 10.40.58 AM.png, metrics.yaml, 
> thread-dump.log, thread-dump2.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not 
> on a 3-node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-13087) Not enough bytes exception during compaction

2017-01-02 Thread FACORAT (JIRA)
FACORAT created CASSANDRA-13087:
---

 Summary: Not enough bytes exception during compaction
 Key: CASSANDRA-13087
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13087
 Project: Cassandra
  Issue Type: Bug
  Components: Compaction
 Environment: Ubuntu 14.04.3 LTS, Cassandra 2.1.14
Reporter: FACORAT


After a repair we have compaction exceptions on some nodes, and it's spreading.

{noformat}
ERROR [CompactionExecutor:14065] 2016-12-30 14:45:07,245 
CassandraDaemon.java:229 - Exception in thread 
Thread[CompactionExecutor:14065,1,main]
java.lang.IllegalArgumentException: Not enough bytes. Offset: 5. Length: 20275. 
Buffer size: 12594
at 
org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:378)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:100)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:398)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) 
~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) 
~[apache-cassandra-2.1.14.jar:2.1.14]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:171)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) 
~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
 ~[guava-16.0.jar:na]
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
~[guava-16.0.jar:na]
at 
org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:166)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:121)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:193) 
~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:127)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
 ~[apache-cassandra-2.1.14.jar:2.1.14]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[na:1.8.0_60]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[na:1.8.0_60]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[na:1.8.0_60]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
{noformat}

nodetool scrub will discard part of the sstable with the following errors:

{noformat}
WARN  [CompactionExecutor:14074] 2016-12-30 16:32:31,290 OutputHandler.java:57 
- Error reading row (stacktrace follows):
java.io.IOError: java.io.IOException: Key from data file () does not match key 
from index file (0014434c5030313030303030303036313137383331390350494400)
at org.apache.cassandra.db.compaction.Scrubber.sc

[jira] [Commented] (CASSANDRA-13087) Not enough bytes exception during compaction

2017-01-02 Thread FACORAT (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15792724#comment-15792724
 ] 

FACORAT commented on CASSANDRA-13087:
-

This seems closely related to 
[CASSANDRA-10961|https://issues.apache.org/jira/browse/CASSANDRA-10961].

> Not enough bytes exception during compaction
> 
>
> Key: CASSANDRA-13087
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13087
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: Ubuntu 14.04.3 LTS, Cassandra 2.1.14
>Reporter: FACORAT
>
> After a repair we have compaction exceptions on some nodes, and it's spreading
> {noformat}
> ERROR [CompactionExecutor:14065] 2016-12-30 14:45:07,245 
> CassandraDaemon.java:229 - Exception in thread 
> Thread[CompactionExecutor:14065,1,main]
> java.lang.IllegalArgumentException: Not enough bytes. Offset: 5. Length: 
> 20275. Buffer size: 12594
> at 
> org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:378)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:100)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:398)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) 
> ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) 
> ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>  ~[guava-16.0.jar:na]
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:171)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>  ~[guava-16.0.jar:na]
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
> ~[guava-16.0.jar:na]
> at 
> com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) 
> ~[guava-16.0.jar:na]
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>  ~[guava-16.0.jar:na]
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) 
> ~[guava-16.0.jar:na]
> at 
> org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:166)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:121)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:193) 
> ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:127)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-2.1.14.jar:2.1.14]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_60]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_60]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_60]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.jav

[jira] [Commented] (CASSANDRA-11349) MerkleTree mismatch when multiple range tombstones exists for the same partition and interval

2017-01-02 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15792783#comment-15792783
 ] 

Stefan Podkowinski commented on CASSANDRA-11349:


I've looked at some metrics today for one of our clusters that was updated 
to 2.1.16 a couple of weeks ago. We used to see tens of thousands of sstables, 
amounting to many GB, getting streamed each night during repairs.

With 2.1.16 the number of streamed sstables went down to almost none. Thanks 
to everyone involved for fixing this! :)

> MerkleTree mismatch when multiple range tombstones exists for the same 
> partition and interval
> -
>
> Key: CASSANDRA-11349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11349
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Fabien Rousseau
>Assignee: Branimir Lambov
>  Labels: repair
> Fix For: 2.1.16, 2.2.8
>
> Attachments: 11349-2.1-v2.patch, 11349-2.1-v3.patch, 
> 11349-2.1-v4.patch, 11349-2.1.patch, 11349-2.2-v4.patch
>
>
> We observed that repair, for some of our clusters, streamed a lot of data and 
> many partitions were "out of sync".
> Moreover, the read repair mismatch ratio is around 3% on those clusters, 
> which is really high.
> After investigation, it appears that, if two range tombstones exist for a 
> partition for the same range/interval, they're both included in the merkle 
> tree computation.
> But if, for some reason, the two range tombstones were already compacted into 
> a single range tombstone on another node, this will result in a merkle tree 
> difference.
> Currently, this is clearly bad because MerkleTree differences depend on 
> compactions (and if a partition is deleted and created multiple times, the 
> only way to ensure that repair "works correctly"/"doesn't overstream data" is 
> to major compact before each repair... which is not really feasible).
> Below is a list of steps to easily reproduce this case:
> {noformat}
> ccm create test -v 2.1.13 -n 2 -s
> ccm node1 cqlsh
> CREATE KEYSPACE test_rt WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> USE test_rt;
> CREATE TABLE IF NOT EXISTS table1 (
> c1 text,
> c2 text,
> c3 float,
> c4 float,
> PRIMARY KEY ((c1), c2)
> );
> INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 2);
> DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b';
> ctrl ^d
> # now flush only one of the two nodes
> ccm node1 flush 
> ccm node1 cqlsh
> USE test_rt;
> INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 3);
> DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b';
> ctrl ^d
> ccm node1 repair
> # now grep the log and observe that some inconsistencies were detected 
> between nodes (while it shouldn't have detected any)
> ccm node1 showlog | grep "out of sync"
> {noformat}
> Consequences of this are a costly repair and the accumulation of many small 
> SSTables (up to thousands for a rather short period of time when using VNodes, 
> until compaction absorbs those small files), but also an increased size 
> on disk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-13018) Exceptions encountered calling getSeeds() breaks messaging service

2017-01-02 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13018:
---
Attachment: 0001-Better-handle-config-errors-during-outbound-connecti.patch

> Exceptions encountered calling getSeeds() breaks messaging service
> --
>
> Key: CASSANDRA-13018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13018
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>  Labels: lhf
> Attachments: 
> 0001-Better-handle-config-errors-during-outbound-connecti.patch
>
>
> OutboundTcpConnection.connect() calls getSeeds(). If getSeeds() throws an 
> exception (for example, a DatabaseDescriptor/Config invalid-yaml error), the 
> messaging thread(s) break. 
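
For illustration, a guard along these lines would be enough; this is only a 
sketch under the assumption of an SLF4J logger and a helper inside 
{{OutboundTcpConnection}}, not the attached patch:

{code}
// Hypothetical sketch: a throwing DatabaseDescriptor.getSeeds() is turned into a
// logged failure instead of an uncaught exception that kills the messaging thread.
private boolean isSeed(InetAddress endpoint)
{
    try
    {
        return DatabaseDescriptor.getSeeds().contains(endpoint);
    }
    catch (Throwable t)
    {
        logger.error("Could not read seed list while connecting to {}", endpoint, t);
        return false; // fall back and let connect() retry later
    }
}
{code}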



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12453) AutoSavingCache does not store required keys making RowCacheTests Flaky

2017-01-02 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-12453:
---
Attachment: 12453-2.2-update.diff

> AutoSavingCache does not store required keys making RowCacheTests Flaky
> ---
>
> Key: CASSANDRA-12453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12453
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 2.2.x, 3.0.x
>
> Attachments: 12453-2.2-update.diff, 12453-2.2.txt
>
>
> RowCacheTests were flaky, and while investigating, I found that AutoSavingCache 
> does not store all the keys to disk. 
> The reason is that we use OHCache and call hotKeyIterator on it, which is not 
> guaranteed to return the number of keys we want. Here is the documentation 
> from OHCache: 
> /**
>  * Builds an iterator over the N most recently used keys returning 
> deserialized objects.
>  * You must call {@code close()} on the returned iterator.
>  * 
>  * Note: During a rehash, the implementation might return keys twice 
> or not at all.
>  * 
>  */
> CloseableIterator hotKeyIterator(int n);
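
A small sketch of what that contract means for callers (class names as in the OHC 
library, {{org.caffinitas.ohc}}; the helper itself is hypothetical): the returned 
iterator must be closed, and the number of keys it yields is best-effort, so 
neither the cache saver nor the test can assume an exact count.

{code}
import java.util.ArrayList;
import java.util.List;

import org.caffinitas.ohc.CloseableIterator;
import org.caffinitas.ohc.OHCache;

public final class HotKeySketch
{
    // Collect up to 'requested' hot keys; per the javadoc above, the result may
    // contain fewer keys (or duplicates) if a rehash happens while iterating.
    static <K, V> List<K> hotKeys(OHCache<K, V> cache, int requested) throws Exception
    {
        List<K> keys = new ArrayList<>();
        try (CloseableIterator<K> iter = cache.hotKeyIterator(requested))
        {
            while (iter.hasNext())
                keys.add(iter.next());
        }
        return keys;
    }
}
{code}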



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12453) AutoSavingCache does not store required keys making RowCacheTests Flaky

2017-01-02 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793601#comment-15793601
 ] 

Jay Zhuang commented on CASSANDRA-12453:


Thanks for the review.
Updated the patch.

> AutoSavingCache does not store required keys making RowCacheTests Flaky
> ---
>
> Key: CASSANDRA-12453
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12453
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: sankalp kohli
>Assignee: Jay Zhuang
>Priority: Minor
> Fix For: 2.2.x, 3.0.x
>
> Attachments: 12453-2.2-update.diff, 12453-2.2.txt
>
>
> RowCacheTests were flaky, and while investigating, I found that AutoSavingCache 
> does not store all the keys to disk. 
> The reason is that we use OHCache and call hotKeyIterator on it, which is not 
> guaranteed to return the number of keys we want. Here is the documentation 
> from OHCache: 
> /**
>  * Builds an iterator over the N most recently used keys returning 
> deserialized objects.
>  * You must call {@code close()} on the returned iterator.
>  * 
>  * Note: During a rehash, the implementation might return keys twice 
> or not at all.
>  * 
>  */
> CloseableIterator hotKeyIterator(int n);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13079) Repair doesn't work after several replication factor changes

2017-01-02 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793609#comment-15793609
 ] 

Jeff Jirsa commented on CASSANDRA-13079:


{quote}And this raises a question - shouldn't a replication factor change also 
reset the repair state for this keyspace?{quote}

If replication factor is increased, it seems like it should, in fact, reset the 
repair state for that keyspace. The fact that we don't is probably a bug.  

{quote}
I think it would be a good idea for this type of scenario to change the repair 
state during replication altering, but I'm not sure if that's always the case.
{quote}

The principle of least astonishment applies here - a user running repair should 
expect all data to be repaired, and a user who adds a new DC and then runs 
repair will see a lot of data streamed. That's not something that SHOULD 
surprise a user. They can work around it if they choose. The fact that 
incremental (default) repair doesn't do anything if you change from rf=1 to 
rf=2 is more surprising and dangerous than extra streaming, so I imagine we 
should consider that more important.



> Repair doesn't work after several replication factor changes
> 
>
> Key: CASSANDRA-13079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13079
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian 
>Reporter: Vladimir Yudovin
>Priority: Critical
>
> Scenario:
> Start a two-node cluster.
> Create keyspace with rep.factor *one*:
> CREATE KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> CREATE TABLE rep.data (str text PRIMARY KEY );
> INSERT INTO rep.data (str) VALUES ( 'qwerty');
> Run *nodetool flush* on all nodes. On one of them table files are created.
> Change replication factor to *two*:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> Run repair, then *nodetool flush* on all nodes. On all nodes table files are 
> created.
> Change replication factor to *one*:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> Then run *nodetool cleanup*; data files remain only on the initial node.
> Change replication factor to *two* again:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> Run repair, then *nodetool flush* on all nodes. No data files appear on the 
> second node (though they are expected, as after the first repair/flush).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13060) NPE in StorageService.java while bootstrapping

2017-01-02 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793636#comment-15793636
 ] 

Jeff Jirsa commented on CASSANDRA-13060:


If the thought is that this is caused by CASSANDRA-12653, it seems like a 
reasonable extra guard in addition to your work on 12653, right 
[~spo...@gmail.com]? 

If it's not, then it seems like understanding how we get into this state is worth 
investigating.


> NPE in StorageService.java while bootstrapping
> --
>
> Key: CASSANDRA-13060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x
>
> Attachments: 13060-3.0.txt
>
>
> Lots of NPEs happen when bootstrapping a new node:
> {code}
> WARN  [SharedPool-Worker-1] 2016-12-19 23:09:09,034 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.service.StorageService.isRpcReady(StorageService.java:1829)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.notifyUp(StorageService.java:1787)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.onAlive(StorageService.java:2424) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper.realMarkAlive(Gossiper.java:999) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper$3.response(Gossiper.java:979) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.10.jar:3.0.10]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12172) Fail to bootstrap new node.

2017-01-02 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794237#comment-15794237
 ] 

Jay Zhuang commented on CASSANDRA-12172:


Did some debugging; in our case, the gossip stage is blocked by the updateTopology 
write lock. The root cause is CASSANDRA-12281. Confirmed that [~spo...@gmail.com]'s 
patch fixes the problem: [Avoid blocking gossip during pending range 
calculation|https://github.com/spodkowinski/cassandra/commit/2611e33720233ad724b2196f5c062ef5c58ecd10].
Should we close this as a duplicate of CASSANDRA-12281?

> Fail to bootstrap new node.
> ---
>
> Key: CASSANDRA-12172
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12172
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dikang Gu
>
> When I try to bootstrap a new node in the cluster, it sometimes fails because 
> of the following exceptions.
> {code}
> 2016-07-12_05:14:55.58509 INFO  05:14:55 [main]: JOINING: Starting to 
> bootstrap...
> 2016-07-12_05:14:56.07491 INFO  05:14:56 [GossipTasks:1]: InetAddress 
> /2401:db00:2011:50c7:face:0:9:0 is now DOWN
> 2016-07-12_05:14:56.32219 Exception (java.lang.RuntimeException) encountered 
> during startup: A node required to move the data consistently is down 
> (/2401:db00:2011:50c7:face:0:9:0). If you wish to move the data from a 
> potentially inconsis
> tent replica, restart the node with -Dcassandra.consistent.rangemovement=false
> 2016-07-12_05:14:56.32582 ERROR 05:14:56 [main]: Exception encountered during 
> startup
> 2016-07-12_05:14:56.32583 java.lang.RuntimeException: A node required to move 
> the data consistently is down (/2401:db00:2011:50c7:face:0:9:0). If you wish 
> to move the data from a potentially inconsistent replica, restart the node 
> with -Dc
> assandra.consistent.rangemovement=false
> 2016-07-12_05:14:56.32584   at 
> org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:264)
>  ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32584   at 
> org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:147) 
> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32584   at 
> org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:82) 
> ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32584   at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1230)
>  ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32584   at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:924)
>  ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32585   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:709)
>  ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32585   at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:585)
>  ~[apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32585   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32586   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
>  [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32586   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) 
> [apache-cassandra-2.2.5+git20160315.c29948b.jar:2.2.5+git20160315.c29948b]
> 2016-07-12_05:14:56.32730 WARN  05:14:56 [StorageServiceShutdownHook]: No 
> local state or state is in silent shutdown, not announcing shutdown
> {code}
> Here are more logs: 
> https://gist.github.com/DikangGu/c6a83eafdbc091250eade4a3bddcc40b
> I'm pretty sure there are no DOWN nodes or restarted nodes in the cluster, 
> but I still see a lot of nodes going UP and DOWN in the gossip log, which failed 
> the bootstrap in the end. Is this a known bug?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13060) NPE in StorageService.java while bootstrapping

2017-01-02 Thread Jay Zhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794260#comment-15794260
 ] 

Jay Zhuang commented on CASSANDRA-13060:


The root cause is fixed by CASSANDRA-12281. Patched "[Avoid blocking gossip 
during pending range 
calculation|https://github.com/spodkowinski/cassandra/commit/2611e33720233ad724b2196f5c062ef5c58ecd10]" 
to 3.0.10 and confirmed no NPE. Thanks [~spo...@gmail.com] for the fix.

On the other hand, it's still worth doing a null check in 
[StorageService|https://github.com/cooldoger/cassandra/commit/3f85e4ffb531c82ee16b8f6cf9b4c4e29ae1fd53] 
as a small improvement.
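
For illustration, the null check could be as small as the sketch below (names are 
taken from the stack trace; the linked commit may differ):

{code}
// Hypothetical sketch of the guard in StorageService: a node that has not
// gossiped yet has no EndpointState, so report "not RPC ready" instead of
// throwing an NPE from notifyUp().
public boolean isRpcReady(InetAddress endpoint)
{
    EndpointState state = Gossiper.instance.getEndpointStateForEndpoint(endpoint);
    return state != null && state.isRpcReady();
}
{code}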

> NPE in StorageService.java while bootstrapping
> --
>
> Key: CASSANDRA-13060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13060
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0.x
>
> Attachments: 13060-3.0.txt
>
>
> Lots of NPEs happen when bootstrapping a new node:
> {code}
> WARN  [SharedPool-Worker-1] 2016-12-19 23:09:09,034 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.service.StorageService.isRpcReady(StorageService.java:1829)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.notifyUp(StorageService.java:1787)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.onAlive(StorageService.java:2424) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper.realMarkAlive(Gossiper.java:999) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper$3.response(Gossiper.java:979) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.10.jar:3.0.10]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-13060) NPE in StorageService.java while bootstrapping

2017-01-02 Thread Jay Zhuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Zhuang updated CASSANDRA-13060:
---
  Priority: Trivial  (was: Minor)
Issue Type: Improvement  (was: Bug)

> NPE in StorageService.java while bootstrapping
> --
>
> Key: CASSANDRA-13060
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13060
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jay Zhuang
>Assignee: Jay Zhuang
>Priority: Trivial
>  Labels: lhf
> Fix For: 3.0.x
>
> Attachments: 13060-3.0.txt
>
>
> Lots of NPEs happen when bootstrapping a new node:
> {code}
> WARN  [SharedPool-Worker-1] 2016-12-19 23:09:09,034 
> AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread 
> Thread[SharedPool-Worker-1,5,main]: {}
> java.lang.NullPointerException: null
> at 
> org.apache.cassandra.service.StorageService.isRpcReady(StorageService.java:1829)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.notifyUp(StorageService.java:1787)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.service.StorageService.onAlive(StorageService.java:2424) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper.realMarkAlive(Gossiper.java:999) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.gms.Gossiper$3.response(Gossiper.java:979) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:53)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_111]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136)
>  [apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.10.jar:3.0.10]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-13088) Update Cassandra Configuration File page with undocumented options

2017-01-02 Thread Alwyn Davis (JIRA)
Alwyn Davis created CASSANDRA-13088:
---

 Summary: Update Cassandra Configuration File page with 
undocumented options
 Key: CASSANDRA-13088
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13088
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation and Website
Reporter: Alwyn Davis
Priority: Trivial


The documentation site doesn't cover all configuration options 
(http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html),
 including (based on trunk and excluding deprecated options):

* auto_bootstrap
* commitlog_max_compression_buffers_in_pool
* credentials_cache_max_entries
* disk_access_mode
* disk_optimization_estimate_percentile
* disk_optimization_page_cross_chance
* dynamic_snitch
* enable_user_defined_functions_threads
* max_mutation_size_in_kb
* otc_coalescing_strategy
* otc_coalescing_window_us
* otc_coalescing_window_us_default
* outboundBindAny
* permissions_cache_max_entries
* roles_cache_max_entries
* rpc_listen_backlog
* user_defined_function_fail_timeout
* user_defined_function_warn_timeout
* user_function_timeout_policy

I'm not sure if some are intentionally omitted, but others would appear to be 
useful, e.g. {{max_mutation_size_in_kb}}. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-13089) Sidebar can't scroll when affix longer than window

2017-01-02 Thread Alwyn Davis (JIRA)
Alwyn Davis created CASSANDRA-13089:
---

 Summary: Sidebar can't scroll when affix longer than window
 Key: CASSANDRA-13089
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13089
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation and Website
Reporter: Alwyn Davis
Priority: Trivial
 Attachments: doc-navigation-fixer.js

On the Cassandra Configuration page 
(http://cassandra.apache.org/doc/latest/configuration/cassandra_config_file.html),
 the side nav is very long (3853px for me), but can't be scrolled.

I'm not sure how the "fixed-navigation" javascript is being added, but changing 
it to add a max-height and allow scrollbars would fix this (please see the 
attached example).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-13079) Repair doesn't work after several replication factor changes

2017-01-02 Thread Vladimir Yudovin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794367#comment-15794367
 ] 

Vladimir Yudovin commented on CASSANDRA-13079:
--

"Increased" should also include adding new DC (no matter existing or new) for 
replication , even if current factor for other DC is decreased, so total sum if 
unchanged or even decreased.

May be for simplicity we can reset repair state on any replication factor or 
class change. It's not often operation, besides maybe system_auth, but it's 
usually small keyspace.

> Repair doesn't work after several replication factor changes
> 
>
> Key: CASSANDRA-13079
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13079
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Debian 
>Reporter: Vladimir Yudovin
>Priority: Critical
>
> Scenario:
> Start a two-node cluster.
> Create keyspace with rep.factor *one*:
> CREATE KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> CREATE TABLE rep.data (str text PRIMARY KEY );
> INSERT INTO rep.data (str) VALUES ( 'qwerty');
> Run *nodetool flush* on all nodes. On one of them table files are created.
> Change replication factor to *two*:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> Run repair, then *nodetool flush* on all nodes. On all nodes table files are 
> created.
> Change replication factor to *one*:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> Then run *nodetool cleanup*; data files remain only on the initial node.
> Change replication factor to *two* again:
> ALTER KEYSPACE rep WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> Run repair, then *nodetool flush* on all nodes. No data files appear on the 
> second node (though they are expected, as after the first repair/flush).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12348) Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest

2017-01-02 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-12348:
-
Fix Version/s: 3.0.x
   Status: Patch Available  (was: In Progress)

> Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest
> -
>
> Key: CASSANDRA-12348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12348
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Joel Knighton
>Assignee: Stefania
> Fix For: 3.0.x, 3.8
>
>
> Example failures:
> http://cassci.datastax.com/job/cassandra-3.9_testall/45/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/basicTest2/
> http://cassci.datastax.com/job/cassandra-3.9_testall/37/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/
> http://cassci.datastax.com/job/trunk_testall/1054/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/
> All failures look like the test is finding more files than expected after a 
> rewrite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12348) Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest

2017-01-02 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794386#comment-15794386
 ] 

Stefania commented on CASSANDRA-12348:
--

The patch committed to 3.8 applies cleanly to 3.0, see 
[here|https://github.com/stef1927/cassandra/tree/12348-3.0]. I'm multiplexing 
it 
[here|https://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-testall-multiplex/61/].
 If the results are positive, are you OK to commit it to 3.0 as well, 
[~pauloricardomg]?

> Flaky failures in SSTableRewriterTest.basicTest2/getPositionsTest
> -
>
> Key: CASSANDRA-12348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12348
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Joel Knighton
>Assignee: Stefania
> Fix For: 3.8, 3.0.x
>
>
> Example failures:
> http://cassci.datastax.com/job/cassandra-3.9_testall/45/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/basicTest2/
> http://cassci.datastax.com/job/cassandra-3.9_testall/37/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/
> http://cassci.datastax.com/job/trunk_testall/1054/testReport/junit/org.apache.cassandra.io.sstable/SSTableRewriterTest/getPositionsTest/
> All failures look like the test is finding more files than expected after a 
> rewrite.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-13086) CAS resultset sometimes does not contain value column even though wasApplied is false

2017-01-02 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-13086:
---

Assignee: Alex Petrov

> CAS resultset sometimes does not contain value column even though wasApplied 
> is false
> -
>
> Key: CASSANDRA-13086
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13086
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Christian Spriegel
>Assignee: Alex Petrov
>Priority: Minor
>
> Every now and then I see a ResultSet for one of my CAS queries that contains 
> wasApplied=false but does not contain my value column.
> I just now found another occurrence, which causes the following exception in 
> the driver:
> {code}
> ...
> Caused by: com.mycompany.MyDataaccessException: checkLock(ResultSet[ 
> exhausted: true, Columns[[applied](boolean)]])
> at com.mycompany.MyDAO._checkLock(MyDAO.java:408)
> at com.mycompany.MyDAO._releaseLock(MyDAO.java:314)
> ... 16 more
> Caused by: java.lang.IllegalArgumentException: value is not a column defined 
> in this metadata
> at 
> com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:266)
> at 
> com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:272)
> at 
> com.datastax.driver.core.ArrayBackedRow.getIndexOf(ArrayBackedRow.java:81)
> at 
> com.datastax.driver.core.AbstractGettableData.getBytes(AbstractGettableData.java:151)
> at com.mycompany.MyDAO._checkLock(MyDAO.java:383)
> ... 17 more
> {code}
> The query the application was doing:
> delete from "Lock" where lockname=:lockname and id=:id if value=:value;
> I did some debugging recently and was able to track these ResultSets through 
> StorageProxy.cas() to the "CAS precondition does not match current values {}" 
> return statement.
> I saw this happening with Cassandra 3.0.10 and earlier versions.
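
Until the server-side cause is understood, a client-side workaround is to check 
whether the column is actually present before reading it. A minimal sketch with 
the DataStax Java driver follows; the {{value}} column name comes from the report, 
while the helper itself is hypothetical:

{code}
import java.nio.ByteBuffer;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;

public final class CasResultSketch
{
    // Returns the current 'value' column when the CAS was rejected and the
    // column is present; returns null when the server omitted it, which is the
    // behaviour reported in this ticket.
    static ByteBuffer currentValueIfRejected(ResultSet rs)
    {
        Row row = rs.one();
        if (row == null || row.getBool("[applied]"))
            return null;
        return row.getColumnDefinitions().contains("value") ? row.getBytes("value") : null;
    }
}
{code}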



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)