[jira] [Created] (CASSANDRA-6101) Debian init script broken

2013-09-26 Thread Anton Winter (JIRA)
Anton Winter created CASSANDRA-6101:
---

 Summary: Debian init script broken
 Key: CASSANDRA-6101
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6101
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Anton Winter
Priority: Minor


The Debian init script released in 2.0.1 contains two issues:

# The pidfile directory is not created if it doesn't already exist.
# The classpath is not exported to start-stop-daemon.

These lead to the init script not picking up jna.jar (or anything from the 
Debian EXTRA_CLASSPATH environment variable) and to the init script being 
unable to stop/restart Cassandra.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-6101) Debian init script broken

2013-09-26 Thread Anton Winter (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Winter updated CASSANDRA-6101:


Attachment: 6101.txt

Attached patch with fixes.



[jira] [Commented] (CASSANDRA-1632) Thread workflow and cpu affinity

2013-09-26 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778558#comment-13778558
 ] 

Stu Hood commented on CASSANDRA-1632:
-

bq. while there is the work-stealing aspect, it's mostly, if I understand 
correctly, in the context of a task that can be broken down into smaller parts 
(like a recursive problem)
Yes, but no. ForkJoinPool has its own interface, but it also implements 
AbstractExecutorService. In particular, note that while it implements AES, it 
doesn't accept a BlockingQueue like the other Executors do, because it has its 
own work-stealing "queue" implementation. Treat it like any other executor.
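A minimal sketch of the point above: ForkJoinPool can be handed plain Runnable/Callable tasks through the standard ExecutorService interface, with no recursive task decomposition involved (the class and task here are illustrative, not from the Cassandra codebase):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;

public class ForkJoinAsExecutor {
    public static void main(String[] args) throws Exception {
        // ForkJoinPool extends AbstractExecutorService, so ordinary
        // tasks can be submitted to it directly; the work-stealing
        // deques replace the single shared BlockingQueue.
        ExecutorService pool = new ForkJoinPool(4);
        Future<Integer> f = pool.submit(() -> 21 * 2);
        System.out.println(f.get()); // prints 42
        pool.shutdown();
    }
}
```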

> Thread workflow and cpu affinity
> 
>
> Key: CASSANDRA-1632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1632
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Jason Brown
>  Labels: performance
>
> Here are some thoughts I wanted to write down, we need to run some serious 
> benchmarks to see the benefits:
> 1) All thread pools for our stages use a shared queue per stage. For some 
> stages we could move to a model where each thread has its own queue. This 
> would reduce lock contention on the shared queue. This workload only suits 
> the stages that have no variance, else you run into thread starvation. Some 
> stages that this might work: ROW-MUTATION.
> 2) Set cpu affinity for each thread in each stage. If we can pin threads to 
> specific cores, and control the workflow of a message from Thrift down to 
> each stage, we should see improvements on reducing L1 cache misses. We would 
> need to build a JNI extension (to set cpu affinity), as I could not find 
> anywhere in JDK where it was exposed. 
> 3) Batching the delivery of requests across stage boundaries. Peter Schuller 
> hasn't looked deep enough yet into the JDK, but he thinks there may be 
> significant improvements to be had there, especially in high-throughput 
> situations, if on each consumption you were to consume everything in the 
> queue rather than implying a synchronization point between each request.
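Point 3 above (batched consumption across a stage boundary) can be sketched with java.util.concurrent alone; this is an illustration of the technique, not Cassandra's actual stage code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchDrain {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("req1");
        queue.add("req2");
        queue.add("req3");

        // Instead of take()-ing one request at a time (one synchronization
        // point per request), block for the first element, then drain
        // whatever else is already queued in a single operation.
        List<String> batch = new ArrayList<>();
        batch.add(queue.take());
        queue.drainTo(batch);

        System.out.println(batch.size()); // prints 3
    }
}
```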



[jira] [Created] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Janne Jalkanen (JIRA)
Janne Jalkanen created CASSANDRA-6102:
-

 Summary: CassandraStorage broken for bigints and ints
 Key: CASSANDRA-6102
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
 Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
Reporter: Janne Jalkanen


I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
integer values.

Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
testing this. 

First a table:

{noformat}
> CREATE TABLE testc (
 key text PRIMARY KEY,
 ivalue int,
 svalue text,
 value bigint
) WITH COMPACT STORAGE;

> insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> select * from testc;

 key | ivalue | svalue | value
-----+--------+--------+-------
 foo |     10 |    bar |    65
{noformat}

For my Pig setup, I then use libraries from different C* versions to actually 
talk to my database (which stays on 1.2.10 all the time).

Cassandra 1.0.12 (using cassandra_storage.jar):

{noformat}
testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
dump testc
(foo,(svalue,bar),(ivalue,10),(value,65),{})
{noformat}

Cassandra 1.1.10:

{noformat}
testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
dump testc
(foo,(svalue,bar),(ivalue,10),(value,65),{})
{noformat}

Cassandra 1.2.10:

{noformat}
testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
dump testc
(foo,{(ivalue,
),(svalue,bar),(value,A)})
{noformat}


To me it appears that ints and bigints are interpreted as ascii values in 
Cassandra 1.2.10.  Did something change for CassandraStorage, is there a 
regression, or am I doing something wrong?  A quick perusal of JIRA didn't 
reveal anything that I could directly pin on this.

Note that using compact storage does not seem to affect the issue, though it 
obviously changes the resulting pig format.

In addition, trying to use Pygmalion:

{noformat}
tf = foreach testc generate key, 
flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
(ivalue:int,svalue:chararray,lvalue:long);
dump tf

(foo,
,bar,A)
{noformat}

So no help there. Explicitly casting the values to (long) or (int) just results 
in a ClassCastException.
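The symptom above is consistent with the raw column bytes being decoded as text: a CQL bigint of 65 is an 8-byte big-endian value whose last byte is 0x41 ('A'), and an int of 10 is 0x0A ('\n'), which would split the tuple across lines exactly as in the dump. A small sketch of that misread (illustrative, not CassandraStorage's code):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class AsciiMisread {
    public static void main(String[] args) {
        // A CQL bigint is an 8-byte big-endian value on the wire.
        ByteBuffer value = ByteBuffer.allocate(8).putLong(65L);
        value.flip();

        // If a loader treats those raw bytes as text, the trailing
        // 0x41 byte renders as 'A' -- matching "(value,A)" in the dump.
        String misread = StandardCharsets.ISO_8859_1.decode(value).toString();
        System.out.println(misread.charAt(7)); // prints A

        // Likewise int 10 (0x0A) renders as a newline, which is why
        // "(ivalue,...)" splits across two lines in the output.
        System.out.println((char) 10 == '\n'); // prints true
    }
}
```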




[jira] [Commented] (CASSANDRA-5981) Netty frame length exception when storing data to Cassandra using binary protocol

2013-09-26 Thread Daniel Norberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778587#comment-13778587
 ] 

Daniel Norberg commented on CASSANDRA-5981:
---

When handling the TooLongFrameException and sending an ErrorMessage reply with 
the InvalidRequestException without closing the connection, where is the stream 
id set on the ErrorMessage? I assume that without the stream id set and the 
connection still open, the client will be unable to infer that the request 
failed.


> Netty frame length exception when storing data to Cassandra using binary 
> protocol
> -
>
> Key: CASSANDRA-5981
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5981
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Linux, Java 7
>Reporter: Justin Sweeney
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.2.11
>
> Attachments: 0001-Correctly-catch-frame-too-long-exceptions.txt, 
> 0002-Allow-to-configure-the-max-frame-length.txt
>
>
> Using Cassandra 1.2.8, I am running into an issue where, when I send a large 
> amount of data using the binary protocol, I get the following netty exception 
> in the Cassandra log file:
> {quote}
> ERROR 09:08:35,845 Unexpected exception during request
> org.jboss.netty.handler.codec.frame.TooLongFrameException: Adjusted frame 
> length exceeds 268435456: 292413714 - discarded
> at 
> org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.fail(LengthFieldBasedFrameDecoder.java:441)
> at 
> org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.failIfNecessary(LengthFieldBasedFrameDecoder.java:412)
> at 
> org.jboss.netty.handler.codec.frame.LengthFieldBasedFrameDecoder.decode(LengthFieldBasedFrameDecoder.java:372)
> at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:181)
> at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:422)
> at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:472)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:333)
> at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> {quote}
> I am using the Datastax driver and using CQL to execute insert queries. The 
> query that is failing is using atomic batching executing a large number of 
> statements (~55).
> Looking into the code a bit, I saw that in the 
> org.apache.cassandra.transport.Frame$Decoder class, the MAX_FRAME_LENGTH is 
> hard coded to 256 mb.
> Is this something that should be configurable or is this a hard limit that 
> will prevent batch statements of this size from executing for some reason?



[jira] [Commented] (CASSANDRA-5981) Netty frame length exception when storing data to Cassandra using binary protocol

2013-09-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778641#comment-13778641
 ] 

Sylvain Lebresne commented on CASSANDRA-5981:
-

Right, you're correct, the patch doesn't preserve the stream id correctly.

However, I have to say that I'm not too sure what the easiest way is to make 
that work correctly with Netty currently. To be able to use the stream id we'd 
need to start decoding the frame header before LengthFieldBasedFrameDecoder 
triggers the TooLongFrameException, but I don't know how to do that without 
knowing whether LFBFD is in its "discardingTooLongFrame" mode, and that's not 
exposed currently. Meaning that the only solutions I see so far are:
# push a feature request to netty so that LengthFieldBasedFrameDecoder exposes 
its currently private discardingTooLongFrame field. I don't know if they'd be 
up for it or how quickly that would get released.
# recode LengthFieldBasedFrameDecoder ourselves instead of using the netty 
one. Not the end of the world, it's not a lot of code, but still a bit 
annoying in principle.

[~danielnorberg] Do you see any other simple solution that I might have missed?
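The custom-decoder idea boils down to: parse the small fixed-size header first, so the stream id is in hand before the body-length limit is enforced. A stdlib-only sketch of that check, using the header layout from the public native-protocol spec (version, flags, stream, opcode, then a 4-byte body length); the class and method names are illustrative, not Cassandra's actual decoder:

```java
import java.nio.ByteBuffer;

public class HeaderFirstCheck {
    static final int MAX_BODY = 256 * 1024 * 1024; // 256 MB cap

    // Parse the fixed-size header before enforcing the body length.
    // Returns the stream id to fail with, or -1 if the frame is fine.
    static int streamIdIfTooLong(ByteBuffer buf) {
        byte stream = buf.get(2);                    // stream id byte
        long bodyLength = buf.getInt(4) & 0xFFFFFFFFL; // 4-byte length
        return bodyLength > MAX_BODY ? stream : -1;
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(8);
        frame.put((byte) 0x02);    // protocol version
        frame.put((byte) 0x00);    // flags
        frame.put((byte) 42);      // stream id
        frame.put((byte) 0x07);    // opcode (QUERY)
        frame.putInt(292_413_714); // body length over the cap
        System.out.println(streamIdIfTooLong(frame)); // prints 42
    }
}
```

With the stream id recovered this way, an ErrorMessage could be sent for just the offending request instead of tearing down the connection.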



[jira] [Commented] (CASSANDRA-6095) INSERT query adds new value to collection type

2013-09-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778690#comment-13778690
 ] 

Sylvain Lebresne commented on CASSANDRA-6095:
-

You will need to provide a bit more info on how you reproduce (an example with 
cqlsh for instance) because on Cassandra 2.0.1:
{noformat}
[cqlsh 4.0.1 | Cassandra 2.0.1-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 
19.37.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
cqlsh> USE ks;
cqlsh:ks> CREATE TABLE test (k int PRIMARY KEY, l list<int>);
cqlsh:ks> INSERT INTO test (k, l) VALUES (0, [0, 1]);
cqlsh:ks> INSERT INTO test (k, l) VALUES (0, [3, 4]);
cqlsh:ks> SELECT * FROM test;

 k | l
---+--------
 0 | [3, 4]

(1 rows)
{noformat}
So the second insert does correctly replace the previous value; it does not 
append new values.

That being said, maybe you are running into CASSANDRA-6069. It doesn't seem to 
exactly fit the problem you are describing, but you haven't given us a lot of 
details so...

> INSERT query adds new value to collection type
> --
>
> Key: CASSANDRA-6095
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6095
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Ngoc Minh Vo
>Assignee: Sylvain Lebresne
>
> Hello,
> I don't know if somebody has reported this regression in v2.0.1: INSERT query 
> adds new value to collection type (eg. List) instead of replacing it.
> CQL3 docs:
> http://cassandra.apache.org/doc/cql3/CQL.html#collections
> {quote}
> Note: An INSERT will always replace the entire list.
> {quote}
> We do not encounter this issue with v1.2.9.
> Could you please have a look at the issue?
> Thanks for your help.
> Best regards,
> Minh



[jira] [Assigned] (CASSANDRA-6098) NullPointerException causing query timeout

2013-09-26 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-6098:
---

Assignee: Sylvain Lebresne

> NullPointerException causing query timeout
> --
>
> Key: CASSANDRA-6098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6098
> Project: Cassandra
>  Issue Type: Bug
> Environment: CQLSH 4.0.0
> Cassandra 2.0.0
> Oracle Java 1.7.0_40
> Ubuntu 12.04.3 x64
>Reporter: Lex Lythius
>Assignee: Sylvain Lebresne
>
> A common SELECT query could not be completed, failing with:
> {noformat}
> Request did not complete within rpc_timeout.
> {noformat}
> output.log showed this:
> {noformat}
> ERROR 15:38:04,036 Exception in thread Thread[ReadStage:170,5,main]
> java.lang.RuntimeException: java.lang.NullPointerException
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1867)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.cassandra.db.index.composites.CompositesIndexOnRegular.isStale(CompositesIndexOnRegular.java:97)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:247)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:102)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1651)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher.search(CompositesSearcher.java:50)
> at 
> org.apache.cassandra.db.index.SecondaryIndexManager.search(SecondaryIndexManager.java:525)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.search(ColumnFamilyStore.java:1639)
> at 
> org.apache.cassandra.db.RangeSliceCommand.executeLocally(RangeSliceCommand.java:135)
> at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1358)
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1863)
> {noformat}



[jira] [Updated] (CASSANDRA-6098) NullPointerException causing query timeout

2013-09-26 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6098:


Fix Version/s: 2.0.2



[jira] [Updated] (CASSANDRA-6098) NullPointerException causing query timeout

2013-09-26 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-6098:


Attachment: 6098.txt

Right, that's a legit issue. Attaching a simple patch to fix it.



[jira] [Commented] (CASSANDRA-5981) Netty frame length exception when storing data to Cassandra using binary protocol

2013-09-26 Thread Daniel Norberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778704#comment-13778704
 ] 

Daniel Norberg commented on CASSANDRA-5981:
---

Right, that's annoying.

I'd be tempted to actually close the connection immediately. It doesn't seem 
very attractive to read and discard that huge frame, potentially using up a 
lot of bandwidth doing only that. IMO it's better to prioritize well-behaved 
clients and let the offending client reconnect.

If you still want to keep the connection open and fail the request nicely, I'd 
probably go for implementing a custom frame decoder.





[jira] [Commented] (CASSANDRA-5981) Netty frame length exception when storing data to Cassandra using binary protocol

2013-09-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778731#comment-13778731
 ] 

Sylvain Lebresne commented on CASSANDRA-5981:
-

bq. I'd be tempted to actually close the connection immediately.

That was the initial intent, but now I feel closing the connection in that 
case is too harsh. If we do allow configuring the max frame length (reasonable 
if only because some may want to lower it from the relatively high default), 
then client libraries can't validate the frame size on their side and this 
becomes an end-user error. And closing the connection on an end-user error 
feels wrong (especially because it potentially cuts other unrelated streams on 
that connection).

bq. I'd probably go for implementing a custom frame decoder

Agreed, that's probably the simplest. I'll work that out.




[jira] [Commented] (CASSANDRA-6095) INSERT query adds new value to collection type

2013-09-26 Thread Ngoc Minh Vo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778735#comment-13778735
 ] 

Ngoc Minh Vo commented on CASSANDRA-6095:
-

Hello Sylvain,

Thanks for your quick answer.

I've looked at CASSANDRA-6069 before creating this new issue, but we don't use 
the IF NOT EXISTS clause.

Would it be possible to try with replication_factor > 1? (ours is 3)

Otherwise, I will wait for v2.0.2 to see whether it fixes the bug.

Best regards,
Minh

> INSERT query adds new value to collection type
> --
>
> Key: CASSANDRA-6095
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6095
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Ngoc Minh Vo
>Assignee: Sylvain Lebresne
>
> Hello,
> I don't know if somebody has reported this regression in v2.0.1: INSERT query 
> adds new value to collection type (eg. List) instead of replacing it.
> CQL3 docs:
> http://cassandra.apache.org/doc/cql3/CQL.html#collections
> {quote}
> Note: An INSERT will always replace the entire list.
> {quote}
> We do not encounter this issue with v1.2.9.
> Could you please have a look at the issue?
> Thanks for your help.
> Best regards,
> Minh



[jira] [Commented] (CASSANDRA-6095) INSERT query adds new value to collection type

2013-09-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778751#comment-13778751
 ] 

Sylvain Lebresne commented on CASSANDRA-6095:
-

bq. Would it be possible to try with replication_factor > 1? (ours is 3)

Yes, I did, and that still gives the expected results.

> INSERT query adds new value to collection type
> --
>
> Key: CASSANDRA-6095
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6095
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Ngoc Minh Vo
>Assignee: Sylvain Lebresne
>
> Hello,
> I don't know if somebody has reported this regression in v2.0.1: INSERT query 
> adds new value to collection type (eg. List) instead of replacing it.
> CQL3 docs:
> http://cassandra.apache.org/doc/cql3/CQL.html#collections
> {quote}
> Note: An INSERT will always replace the entire list.
> {quote}
> We do not encounter this issue with v1.2.9.
> Could you please have a look at the issue?
> Thanks for your help.
> Best regards,
> Minh



[jira] [Commented] (CASSANDRA-6098) NullPointerException causing query timeout

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778765#comment-13778765
 ] 

Jonathan Ellis commented on CASSANDRA-6098:
---

Patch looks good, can you add a unit test?

> NullPointerException causing query timeout
> --
>
> Key: CASSANDRA-6098
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6098
> Project: Cassandra
>  Issue Type: Bug
> Environment: CQLSH 4.0.0
> Cassandra 2.0.0
> Oracle Java 1.7.0_40
> Ubuntu 12.04.3 x64
>Reporter: Lex Lythius
>Assignee: Sylvain Lebresne
> Fix For: 2.0.2
>
> Attachments: 6098.txt
>
>
> A common SELECT query could not be completed, failing with:
> {noformat}
> Request did not complete within rpc_timeout.
> {noformat}
> output.log showed this:
> {noformat}
> ERROR 15:38:04,036 Exception in thread Thread[ReadStage:170,5,main]
> java.lang.RuntimeException: java.lang.NullPointerException
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1867)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:724)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.cassandra.db.index.composites.CompositesIndexOnRegular.isStale(CompositesIndexOnRegular.java:97)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:247)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher$1.computeNext(CompositesSearcher.java:102)
> at 
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
> at 
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1651)
> at 
> org.apache.cassandra.db.index.composites.CompositesSearcher.search(CompositesSearcher.java:50)
> at 
> org.apache.cassandra.db.index.SecondaryIndexManager.search(SecondaryIndexManager.java:525)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.search(ColumnFamilyStore.java:1639)
> at 
> org.apache.cassandra.db.RangeSliceCommand.executeLocally(RangeSliceCommand.java:135)
> at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1358)
> at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1863)
> {noformat}



[jira] [Commented] (CASSANDRA-1632) Thread workflow and cpu affinity

2013-09-26 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778787#comment-13778787
 ] 

Jason Brown commented on CASSANDRA-1632:


And that is why I shouldn't have tried to read the FJP code at 1am :). Thanks, 
[~stuhood], will give it another shot.

> Thread workflow and cpu affinity
> 
>
> Key: CASSANDRA-1632
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1632
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Jason Brown
>  Labels: performance
>
> Here are some thoughts I wanted to write down, we need to run some serious 
> benchmarks to see the benefits:
> 1) All thread pools for our stages use a shared queue per stage. For some 
> stages we could move to a model where each thread has its own queue. This 
> would reduce lock contention on the shared queue. This workload only suits 
> the stages that have no variance, else you run into thread starvation. Some 
> stages that this might work: ROW-MUTATION.
> 2) Set cpu affinity for each thread in each stage. If we can pin threads to 
> specific cores, and control the workflow of a message from Thrift down to 
> each stage, we should see improvements on reducing L1 cache misses. We would 
> need to build a JNI extension (to set cpu affinity), as I could not find 
> anywhere in JDK where it was exposed. 
> 3) Batching the delivery of requests across stage boundaries. Peter Schuller 
> hasn't looked deeply enough into the JDK yet, but he thinks there may be 
> significant improvements to be had there, especially in high-throughput 
> situations, if on each consumption you were to consume everything in the 
> queue rather than implying a synchronization point between each request.
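Point 3 above, consuming everything available per wakeup, can be sketched with a plain queue. A hypothetical illustration in Python, not Cassandra code:

```python
import queue


def drain_batch(q):
    """Drain everything currently queued in a single wakeup, so the consumer
    pays one synchronization point per batch instead of one per request."""
    batch = []
    while True:
        try:
            batch.append(q.get_nowait())  # non-blocking take
        except queue.Empty:
            return batch  # queue exhausted; process the batch
```

A consumer thread would call `drain_batch` once per wakeup and then process the whole batch without touching the shared queue again.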



[jira] [Commented] (CASSANDRA-4131) Integrate Hive support to be in core cassandra

2013-09-26 Thread Marcel (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778823#comment-13778823
 ] 

Marcel commented on CASSANDRA-4131:
---

Will there be an update of the cassandra-handler for Cassandra 2.0.0?

I have it working on Cassandra 2.0.0 (with a slight problem connecting to the 
cluster), but I'm noticing that the number of mappers is equal to the number of 
vnodes (default 256 per node). I think this should be equal to the number of 
nodes.

The problem with connecting to the cluster was that the cassandra.host property 
passed in the CREATE TABLE statement in Hive didn't get passed on. When I 
hardcoded it (replaced localhost with the IP address), it worked.

> Integrate Hive support to be in core cassandra
> --
>
> Key: CASSANDRA-4131
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4131
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeremy Hanna
>Assignee: Edward Capriolo
>  Labels: hadoop, hive
>
> The standalone hive support (at https://github.com/riptano/hive) would be 
> great to have in-tree so that people don't have to go out to github to 
> download it and wonder if it's a left-for-dead external shim.



[jira] [Commented] (CASSANDRA-4206) AssertionError: originally calculated column size of 629444349 but now it is 588008950

2013-09-26 Thread Dan Kogan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778837#comment-13778837
 ] 

Dan Kogan commented on CASSANDRA-4206:
--

We are also seeing the same error during compaction.  We can provide the 
sstables if that helps resolve the issue.

INFO [CompactionExecutor:74953] 2013-09-26 14:23:53,978 
CompactionController.java (line 166) Compacting large row 
iqtell/mail_folder_data_subject_withdate_asc:97995 (131986528 bytes) 
incrementally
ERROR [CompactionExecutor:74953] 2013-09-26 14:24:01,126 CassandraDaemon.java 
(line 174) Exception in thread Thread[CompactionExecutor:74953,1,main]
java.lang.AssertionError: originally calculated column size of 131986437 but 
now it is 131986500
at 
org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:135)
at 
org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:159)
at 
org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:162)
at 
org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
at 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:188)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
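The assertion fires because the column size computed on a first pass over the row no longer matches what is observed while writing it out, which implies the underlying data changed between the two passes. A toy two-pass illustration (function names and the size formula are hypothetical, not Cassandra's serialization):

```python
def serialized_size(columns):
    """First pass: pre-compute the total serialized size of a row's columns."""
    return sum(len(name) + len(value) for name, value in columns.items())


def write_row(columns, expected_size):
    """Second pass: re-walk the row while writing; the sizes must agree,
    mirroring the AssertionError seen in LazilyCompactedRow.write()."""
    actual = serialized_size(columns)
    if actual != expected_size:
        raise ValueError(
            f"originally calculated column size of {expected_size} "
            f"but now it is {actual}")
    return actual
```

Any mutation of the row between the two passes, however it happens, breaks the invariant.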


> AssertionError: originally calculated column size of 629444349 but now it is 
> 588008950
> --
>
> Key: CASSANDRA-4206
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4206
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.0.9
> Environment: Debian Squeeze Linux, kernel 2.6.32, sun-java6-bin 
> 6.26-0squeeze1
>Reporter: Patrik Modesto
>
> I have a 4-node cluster of Cassandra 1.0.9. There is a rfTest3 keyspace with 
> RF=3 and one CF with two secondary indexes. I'm importing data into this CF 
> using a Hadoop MapReduce job; each row has fewer than 10 columns. From JMX:
> MaxRowSize:  1597
> MeanRowSize: 369
> And there are some tens of millions of rows.
> It's write-heavy usage and there is big pressure on each node; there are 
> quite a few dropped mutations on each node. After ~12 hours of inserting I 
> see these assertion exceptions on 3 out of 4 nodes:
> {noformat}
> ERROR 06:25:40,124 Fatal exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
> java.lang.AssertionError: originally calculated column size of 629444349 but 
> now it is 588008950
>at 
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:388)
>at 
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:256)
>at 
> org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:84)
>at 
> org.apache.cassandra.db.HintedHandOffManager$3.runMayThrow(HintedHandOffManager.java:437)
>at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:662)
> Caused by: java.util.concurrent.ExecutionException:
> java.lang.AssertionError: originally calculated column size of
> 629444349 but now it is 588008950
>at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>at 
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:384)
>... 7 more
> Caused by: java.lang.AssertionError: originally calculated column size
> of 629444349 but now it is 588008950
>at 
> org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:124)
>at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
>at 
> org.apache.c

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778871#comment-13778871
 ] 

Marcus Eriksson commented on CASSANDRA-4338:


So, using a direct ByteBuffer in SequentialWriter generates a lot less garbage 
in my micro benchmarks (will post patch and graphs later), mostly by not having 
to copy the incoming byte array and instead pushing the data straight into a 
direct BB. It is also a bit faster (~5%), maybe just because of less GC.

Making it work with CompressedSequentialWriter is not as easy, since we would 
then need to either use a standard byte[] buffer and compress that before 
pushing it off-heap/to disk, or copy to the heap, compress, and then push it 
back. Neither would be an improvement.

But then I found out that Snappy can compress a direct byte buffer without 
copying anything to the heap:
https://github.com/xerial/snappy-java/blob/develop/src/main/java/org/xerial/snappy/Snappy.java#L126

The problem is that LZ4 does not support that (yet?):
https://github.com/jpountz/lz4-java/issues/9

Hadoop seems to ship their own native code to solve this problem:
https://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/io/compress/

also related:
https://issues.apache.org/jira/browse/HADOOP-8148

I will experiment with making this work with Snappy and see how much we can 
gain by doing it.
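The copy-versus-no-copy distinction described above can be illustrated outside Java; in this Python sketch a bytearray slice plays the role of the heap copy and a memoryview the role of the direct buffer (an analogy only, not Cassandra code):

```python
def copy_with_slice(buf, n):
    """Slicing a bytearray allocates a fresh copy on each call, analogous
    to copying an incoming byte[] before flushing it."""
    return buf[:n]  # new bytearray (an independent copy)


def view_without_copy(buf, n):
    """A memoryview exposes the same underlying bytes without copying,
    loosely analogous to writing straight from a direct ByteBuffer."""
    return memoryview(buf)[:n]  # zero-copy view over buf
```

Mutating the source afterwards shows the difference: the copy keeps the old bytes, while the view reflects the change.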

> Experiment with direct buffer in SequentialWriter
> -
>
> Key: CASSANDRA-4338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png
>
>
> Using a direct buffer instead of a heap-based byte[] should let us avoid a 
> copy into native memory when we flush the buffer.



[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778902#comment-13778902
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

/throws up the [~jpountz] signal

> Experiment with direct buffer in SequentialWriter
> -
>
> Key: CASSANDRA-4338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png
>
>
> Using a direct buffer instead of a heap-based byte[] should let us avoid a 
> copy into native memory when we flush the buffer.



[jira] [Assigned] (CASSANDRA-6101) Debian init script broken

2013-09-26 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-6101:
---

Assignee: Eric Evans

> Debian init script broken
> -
>
> Key: CASSANDRA-6101
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6101
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Anton Winter
>Assignee: Eric Evans
>Priority: Minor
> Attachments: 6101.txt
>
>
> The debian init script released in 2.0.1 contains 2 issues:
> # The pidfile directory is not created if it doesn't already exist.
> # Classpath not exported to the start-stop-daemon.
> These lead to the init script not picking up jna.jar, or anything from the 
> debian EXTRA_CLASSPATH environment variable, and the init script not being 
> able to stop/restart Cassandra.



[jira] [Created] (CASSANDRA-6103) ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap

2013-09-26 Thread Mike Schrag (JIRA)
Mike Schrag created CASSANDRA-6103:
--

 Summary: ConcurrentModificationException in 
TokenMetadata.cloneOnlyTokenMap
 Key: CASSANDRA-6103
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6103
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Mike Schrag


This isn't reproducible for me, but it happened to one of the servers in our 
cluster while starting up. It went away on a restart, but I figured it was 
worth filing anyway:

ERROR [main] 2013-09-26 08:04:02,478 CassandraDaemon.java (line 464) Exception 
encountered during startup
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
at java.util.HashMap$EntryIterator.next(HashMap.java:834)
at java.util.HashMap$EntryIterator.next(HashMap.java:832)
at 
com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:294)
at 
com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:286)
at 
com.google.common.collect.AbstractBiMap.putAll(AbstractBiMap.java:160)
at com.google.common.collect.HashBiMap.putAll(HashBiMap.java:42)
at com.google.common.collect.HashBiMap.create(HashBiMap.java:72)
at 
org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:561)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.getAddressRanges(AbstractReplicationStrategy.java:192)
at 
org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1711)
at 
org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1692)
at 
org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1461)
at 
org.apache.cassandra.service.StorageService.onChange(StorageService.java:1228)
at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:949)
at 
org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1116)
at 
org.apache.cassandra.service.StorageService.setTokens(StorageService.java:214)
at 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:802)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:554)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:451)



[jira] [Assigned] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-6102:
---

Assignee: Alex Liu

Hmm, ints, longs, and floats all work in the test_storage.pig script, but those 
are all populated by Thrift.  We might have fixed this recently, but if not 
there's probably another bug with CQL detection in ACS.
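The garbled dump in the report is consistent with the raw big-endian bytes of the integers being decoded as text: bigint 65 ends in byte 0x41 ('A'), and int 10 ends in 0x0A (a line feed, which would render as the blank-looking ivalue). A standalone Python illustration of that misinterpretation (the helper is hypothetical, not CassandraStorage code):

```python
import struct


def as_misread_text(value, width):
    """Pack an integer big-endian, the wire layout Cassandra uses for
    int/bigint values, then decode the bytes as if they were text,
    mimicking the suspected type-detection bug."""
    fmt = ">q" if width == 8 else ">i"  # 8-byte bigint vs 4-byte int
    raw = struct.pack(fmt, value)
    return raw.decode("ascii", errors="replace")

# bigint 65 -> eight bytes ending in 0x41, which prints as 'A'
# int 10    -> four bytes ending in 0x0A, a line feed
```

That matches the observed `(value,A)` and the broken line where `ivalue` should be.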

> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion 
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.



[jira] [Updated] (CASSANDRA-6092) Leveled Compaction after ALTER TABLE creates pending but does not actually begin

2013-09-26 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6092:
--

Assignee: Daniel Meyer

Can you reproduce [~dmeyer]?

> Leveled Compaction after ALTER TABLE creates pending but does not actually 
> begin
> 
>
> Key: CASSANDRA-6092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6092
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 1.2.10
> Oracle Java 1.7.0_u40
> RHEL6.4
>Reporter: Karl Mueller
>Assignee: Daniel Meyer
>
> Running Cassandra 1.2.10.  N=5, RF=3
> On this Column Family (ProductGenomeDev/Node), it's been major compacted into 
> a single, large sstable.
> There's no activity on the table at the time of the ALTER command. I changed 
> it to Leveled Compaction with the command below.
> cqlsh:ProductGenomeDev> alter table "Node" with compaction = { 'class' : 
> 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
> Log entries confirm the change happened.
> [...]column_metadata={},compactionStrategyClass=class 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=160}
>  [...]
> nodetool compactionstats shows pending compactions, but there's no activity:
> pending tasks: 750
> 12 hours later, nothing has still happened, same number pending. The 
> expectation would be that compactions would proceed immediately to convert 
> everything to Leveled Compaction as soon as the ALTER TABLE command goes.
> I try a simple write into the CF, and then flush the nodes. This kicks off 
> compaction on 3 nodes. (RF=3)
> cqlsh:ProductGenomeDev> insert into "Node" (key, column1, value) values 
> ('test123', 'test123', 'test123');
> cqlsh:ProductGenomeDev> select * from "Node" where key = 'test123';
>  key | column1 | value
> -+-+-
>  test123 | test123 | test123
> cqlsh:ProductGenomeDev> delete from "Node" where key = 'test123';
> After a flush on every node, now I see:
> [cassandra@dev-cass00 ~]$ cas exec nt compactionstats
> *** dev-cass00 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass04 (0) ***
> pending tasks: 752
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  341881
> 643290447928 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass01 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass02 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3374975141
> 642764512481 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass03 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3591320948
> 643017643573 bytes 0.56%
> Active compaction remaining time :n/a
> After inserting and deleting more columns, enough that all nodes have new 
> data, and flushing, now compactions are proceeding on all nodes.



[jira] [Commented] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema

2013-09-26 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778934#comment-13778934
 ] 

Brandon Williams commented on CASSANDRA-6071:
-

Can you rebase [~beobal]? We committed a few pig things yesterday and it no 
longer applies :(

> CqlStorage loading compact table adds an extraneous field to the pig schema
> ---
>
> Key: CASSANDRA-6071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Minor
> Fix For: 1.2.11
>
> Attachments: 6071-2.txt, 6071.txt
>
>
> {code}
> CREATE TABLE t (
>   key text,
>   field1 int,
>   field2 int
>   PRIMARY KEY (key, field1)
> ) WITH COMPACT STORAGE;
> INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
> INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
> INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
> {code}
> {code}
> grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
> grunt> describe t; 
> t: {key: chararray,field1: int,field2: int,value: int}
> dump t;
> (key1,1,2,)
> (key3,1,2,)
> (key2,1,2,)
> {code}



[jira] [Updated] (CASSANDRA-6103) ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap

2013-09-26 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6103:


Fix Version/s: 1.2.11

Hmm, it looks like endpointToHostIdMap is being mutated, but it's not 
immediately clear how since the gossiper is blocked here and can't notify 
anything else, not to mention all the locking in TMD.
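For context, Java's ConcurrentModificationException corresponds to iterating a map while some other thread mutates it; the usual fix is to take the snapshot under the same lock that guards writers. A minimal Python analogue (CPython raises RuntimeError for the same pattern; the class and names are illustrative, not TokenMetadata's actual code):

```python
import threading


class TokenMetadataSketch:
    """Toy analogue of TokenMetadata: cloning must not iterate a map that
    another thread may be mutating concurrently."""

    def __init__(self):
        self._lock = threading.RLock()
        self._endpoint_to_host_id = {}

    def update_host_id(self, endpoint, host_id):
        with self._lock:
            self._endpoint_to_host_id[endpoint] = host_id

    def clone_only_token_map(self):
        # Copying under the writers' lock avoids the equivalent of Java's
        # ConcurrentModificationException.
        with self._lock:
            return dict(self._endpoint_to_host_id)


def mutate_while_iterating(d):
    """Demonstrates the failure mode: structurally modifying a dict while
    iterating it raises RuntimeError in CPython."""
    try:
        for k in d:
            d["new-" + k] = 0  # mutation mid-iteration
        return None
    except RuntimeError as e:
        return e
```

The snapshot returned by `clone_only_token_map` is then safe to iterate regardless of later updates.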

> ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap
> --
>
> Key: CASSANDRA-6103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6103
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Mike Schrag
> Fix For: 1.2.11
>
>
> This isn't reproducible for me, but it happened to one of the servers in our 
> cluster while starting up. It went away on a restart, but I figured it was 
> worth filing anyway:
> ERROR [main] 2013-09-26 08:04:02,478 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
> at java.util.HashMap$EntryIterator.next(HashMap.java:834)
> at java.util.HashMap$EntryIterator.next(HashMap.java:832)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:294)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:286)
> at 
> com.google.common.collect.AbstractBiMap.putAll(AbstractBiMap.java:160)
> at com.google.common.collect.HashBiMap.putAll(HashBiMap.java:42)
> at com.google.common.collect.HashBiMap.create(HashBiMap.java:72)
> at 
> org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:561)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getAddressRanges(AbstractReplicationStrategy.java:192)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1711)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1692)
> at 
> org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1461)
> at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1228)
> at 
> org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:949)
> at 
> org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1116)
> at 
> org.apache.cassandra.service.StorageService.setTokens(StorageService.java:214)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:802)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:554)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:451)



[jira] [Created] (CASSANDRA-6104) Add additional limits in cassandra.conf provided by Debian package

2013-09-26 Thread J.B. Langston (JIRA)
J.B. Langston created CASSANDRA-6104:


 Summary: Add additional limits in cassandra.conf provided by 
Debian package
 Key: CASSANDRA-6104
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6104
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Reporter: J.B. Langston
Priority: Trivial


/etc/security/limits.d/cassandra.conf distributed with DSC deb/rpm packages 
should contain additional settings. We have found these limits to be necessary 
for some customers through various support tickets.

{code}
cassandra - memlock  unlimited
cassandra - nofile  10
cassandra - nproc 32768
cassandra - as unlimited
{code}



[jira] [Updated] (CASSANDRA-4191) Add `nodetool cfstats ` abilities

2013-09-26 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4191:
--

Attachment: 4809.patch

Changed the split to only occur on the first occurrence of the ".", so keyspaces 
that contain dots can now be searched for:
{code}
./nodetool cfstats "Keyspace1".Standard1.Idx1
{code}
The above will be interpreted as KS="Keyspace1" and CF=Standard1.Idx1.
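The one-argument change maps directly onto String.split's limit parameter; a hypothetical helper mirroring that behavior (names are mine, not the patch's):

```java
public class CfstatsArgSplit {
    // Split only at the first '.', so a dotted column family name
    // (e.g. an index such as Standard1.Idx1) stays intact.
    static String[] parse(String arg) {
        // limit = 2 keeps everything after the first dot in one piece
        return arg.replace("\"", "").split("\\.", 2);
    }

    public static void main(String[] args) {
        String[] parts = parse("\"Keyspace1\".Standard1.Idx1");
        System.out.println(parts[0] + " / " + parts[1]); // Keyspace1 / Standard1.Idx1
    }
}
```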

> Add `nodetool cfstats  ` abilities
> --
>
> Key: CASSANDRA-4191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4191
> Project: Cassandra
>  Issue Type: New Feature
>Affects Versions: 1.2.0 beta 1
>Reporter: Joaquin Casares
>Assignee: Lyuben Todorov
>Priority: Minor
>  Labels: datastax_qa
> Fix For: 1.2.9
>
> Attachments: 4191.patch, 4191_specific_cfstats.diff, 4809.patch
>
>
> This way cfstats will only print information per keyspace/column family 
> combinations.
> Another related proposal as an alternative to this ticket:
> Allow for `nodetool cfstats` to use --excludes or --includes to accept 
> keyspace and column family arguments.



[jira] [Commented] (CASSANDRA-6104) Add additional limits in cassandra.conf provided by Debian package

2013-09-26 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778964#comment-13778964
 ] 

Brandon Williams commented on CASSANDRA-6104:
-

We already have these two:

{noformat}
cassandra  -  memlock  unlimited
cassandra  -  nofile   10
{noformat}

32k threads seems a little overboard.

> Add additional limits in cassandra.conf provided by Debian package
> --
>
> Key: CASSANDRA-6104
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6104
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: J.B. Langston
>Priority: Trivial
>
> /etc/security/limits.d/cassandra.conf distributed with DSC deb/rpm packages 
> should contain additional settings. We have found these limits to be 
> necessary for some customers through various support tickets.
> {code}
> cassandra - memlock  unlimited
> cassandra - nofile  10
> cassandra - nproc 32768
> cassandra - as unlimited
> {code}



[jira] [Updated] (CASSANDRA-4191) Add `nodetool cfstats ` abilities

2013-09-26 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4191:
--

Attachment: (was: 4809.patch)

> Add `nodetool cfstats  ` abilities
> --
>
> Key: CASSANDRA-4191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4191
> Project: Cassandra
>  Issue Type: New Feature
>Affects Versions: 1.2.0 beta 1
>Reporter: Joaquin Casares
>Assignee: Lyuben Todorov
>Priority: Minor
>  Labels: datastax_qa
> Fix For: 1.2.9
>
> Attachments: 4191.patch, 4191_specific_cfstats.diff
>
>
> This way cfstats will only print information per keyspace/column family 
> combinations.
> Another related proposal as an alternative to this ticket:
> Allow for `nodetool cfstats` to use --excludes or --includes to accept 
> keyspace and column family arguments.



[jira] [Updated] (CASSANDRA-4191) Add `nodetool cfstats ` abilities

2013-09-26 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-4191:
--

Attachment: 4191_v3.patch

Updated NodeToolHelp.yaml

> Add `nodetool cfstats  ` abilities
> --
>
> Key: CASSANDRA-4191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4191
> Project: Cassandra
>  Issue Type: New Feature
>Affects Versions: 1.2.0 beta 1
>Reporter: Joaquin Casares
>Assignee: Lyuben Todorov
>Priority: Minor
>  Labels: datastax_qa
> Fix For: 1.2.9
>
> Attachments: 4191.patch, 4191_specific_cfstats.diff, 4191_v3.patch
>
>
> This way cfstats will only print information per keyspace/column family 
> combinations.
> Another related proposal as an alternative to this ticket:
> Allow for `nodetool cfstats` to use --excludes or --includes to accept 
> keyspace and column family arguments.



[jira] [Commented] (CASSANDRA-6092) Leveled Compaction after ALTER TABLE creates pending but does not actually begin

2013-09-26 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778991#comment-13778991
 ] 

Yuki Morishita commented on CASSANDRA-6092:
---

bq. it's been major compacted into a single, large sstable.

Single-SSTable compaction only happens when the tombstone ratio is above the threshold.
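The gate described here reduces to a simple ratio check; a sketch of the idea (the constant and names are illustrative, not the actual compaction-strategy code):

```java
public class TombstoneGate {
    // Default tombstone_threshold in this era's compaction options.
    static final double DEFAULT_TOMBSTONE_THRESHOLD = 0.2;

    // An SSTable is only considered for single-SSTable compaction when its
    // estimated droppable-tombstone ratio exceeds the configured threshold.
    static boolean worthCompacting(double droppableTombstoneRatio, double threshold) {
        return droppableTombstoneRatio > threshold;
    }

    public static void main(String[] args) {
        // A freshly major-compacted table has almost no droppable tombstones,
        // which is why the pending tasks never turn into actual compactions.
        System.out.println(worthCompacting(0.05, DEFAULT_TOMBSTONE_THRESHOLD)); // false
        System.out.println(worthCompacting(0.35, DEFAULT_TOMBSTONE_THRESHOLD)); // true
    }
}
```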


> Leveled Compaction after ALTER TABLE creates pending but does not actually 
> begin
> 
>
> Key: CASSANDRA-6092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6092
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 1.2.10
> Oracle Java 1.7.0_u40
> RHEL6.4
>Reporter: Karl Mueller
>Assignee: Daniel Meyer
>
> Running Cassandra 1.2.10.  N=5, RF=3
> On this Column Family (ProductGenomeDev/Node), it's been major compacted into 
> a single, large sstable.
> There's no activity on the table at the time of the ALTER command. I changed 
> it to Leveled Compaction with the command below.
> cqlsh:ProductGenomeDev> alter table "Node" with compaction = { 'class' : 
> 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
> Log entries confirm the change happened.
> [...]column_metadata={},compactionStrategyClass=class 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=160}
>  [...]
> nodetool compactionstats shows pending compactions, but there's no activity:
> pending tasks: 750
> 12 hours later, nothing has still happened, same number pending. The 
> expectation would be that compactions would proceed immediately to convert 
> everything to Leveled Compaction as soon as the ALTER TABLE command goes.
> I try a simple write into the CF, and then flush the nodes. This kicks off 
> compaction on 3 nodes. (RF=3)
> cqlsh:ProductGenomeDev> insert into "Node" (key, column1, value) values 
> ('test123', 'test123', 'test123');
> cqlsh:ProductGenomeDev> select * from "Node" where key = 'test123';
>  key | column1 | value
> -+-+-
>  test123 | test123 | test123
> cqlsh:ProductGenomeDev> delete from "Node" where key = 'test123';
> After a flush on every node, now I see:
> [cassandra@dev-cass00 ~]$ cas exec nt compactionstats
> *** dev-cass00 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass04 (0) ***
> pending tasks: 752
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  341881
> 643290447928 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass01 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass02 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3374975141
> 642764512481 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass03 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3591320948
> 643017643573 bytes 0.56%
> Active compaction remaining time :n/a
> After inserting and deleting more columns, enough that all nodes have new 
> data, and flushing, now compactions are proceeding on all nodes.



[jira] [Updated] (CASSANDRA-5950) Make snapshot/sequential repair the default

2013-09-26 Thread Lyuben Todorov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lyuben Todorov updated CASSANDRA-5950:
--

Attachment: 5950_v3.patch

Renamed variable to *sequential* and updated NodeToolHelp.yaml

> Make snapshot/sequential repair the default
> ---
>
> Key: CASSANDRA-5950
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5950
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Lyuben Todorov
>Priority: Minor
> Fix For: 2.0.2
>
> Attachments: 5950.patch, 5950_v2.patch, 5950_v3.patch
>
>




[jira] [Commented] (CASSANDRA-6092) Leveled Compaction after ALTER TABLE creates pending but does not actually begin

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779014#comment-13779014
 ] 

Jonathan Ellis commented on CASSANDRA-6092:
---

Oh, good point.

Do you think we should add a special case for this?

> Leveled Compaction after ALTER TABLE creates pending but does not actually 
> begin
> 
>
> Key: CASSANDRA-6092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6092
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 1.2.10
> Oracle Java 1.7.0_u40
> RHEL6.4
>Reporter: Karl Mueller
>Assignee: Daniel Meyer
>
> Running Cassandra 1.2.10.  N=5, RF=3
> On this Column Family (ProductGenomeDev/Node), it's been major compacted into 
> a single, large sstable.
> There's no activity on the table at the time of the ALTER command. I changed 
> it to Leveled Compaction with the command below.
> cqlsh:ProductGenomeDev> alter table "Node" with compaction = { 'class' : 
> 'LeveledCompactionStrategy', 'sstable_size_in_mb' : 160 };
> Log entries confirm the change happened.
> [...]column_metadata={},compactionStrategyClass=class 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=160}
>  [...]
> nodetool compactionstats shows pending compactions, but there's no activity:
> pending tasks: 750
> 12 hours later, nothing has still happened, same number pending. The 
> expectation would be that compactions would proceed immediately to convert 
> everything to Leveled Compaction as soon as the ALTER TABLE command goes.
> I try a simple write into the CF, and then flush the nodes. This kicks off 
> compaction on 3 nodes. (RF=3)
> cqlsh:ProductGenomeDev> insert into "Node" (key, column1, value) values 
> ('test123', 'test123', 'test123');
> cqlsh:ProductGenomeDev> select * from "Node" where key = 'test123';
>  key | column1 | value
> -+-+-
>  test123 | test123 | test123
> cqlsh:ProductGenomeDev> delete from "Node" where key = 'test123';
> After a flush on every node, now I see:
> [cassandra@dev-cass00 ~]$ cas exec nt compactionstats
> *** dev-cass00 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass04 (0) ***
> pending tasks: 752
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  341881
> 643290447928 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass01 (0) ***
> pending tasks: 750
> Active compaction remaining time :n/a
> *** dev-cass02 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3374975141
> 642764512481 bytes 0.53%
> Active compaction remaining time :n/a
> *** dev-cass03 (0) ***
> pending tasks: 751
>   compaction typekeyspace   column family   completed 
>   total  unit  progress
>CompactionProductGenomeDevNode  3591320948
> 643017643573 bytes 0.56%
> Active compaction remaining time :n/a
> After inserting and deleting more columns, enough that all nodes have new 
> data, and flushing, now compactions are proceeding on all nodes.



[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779023#comment-13779023
 ] 

Jonathan Ellis commented on CASSANDRA-5932:
---

Pushed one more set of changes to mine, not forced: 
https://github.com/jbellis/cassandra/commits/5932.  Goal is to make SRE less 
fragile when doing RR.

> Speculative read performance data show unexpected results
> -
>
> Key: CASSANDRA-5932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Assignee: Aleksey Yeschenko
> Fix For: 2.0.2
>
> Attachments: 5932.txt, compaction-makes-slow.png, 
> compaction-makes-slow-stats.png, eager-read-looks-promising.png, 
> eager-read-looks-promising-stats.png, eager-read-not-consistent.png, 
> eager-read-not-consistent-stats.png, node-down-increase-performance.png
>
>
> I've done a series of stress tests with eager retries enabled that show 
> undesirable behavior. I'm grouping these behaviours into one ticket as they 
> are most likely related.
> 1) Killing off a node in a 4 node cluster actually increases performance.
> 2) Compactions make nodes slow, even after the compaction is done.
> 3) Eager Reads tend to lessen the *immediate* performance impact of a node 
> going down, but not consistently.
> My Environment:
> 1 stress machine: node0
> 4 C* nodes: node4, node5, node6, node7
> My script:
> node0 writes some data: stress -d node4 -F 3000 -n 3000 -i 5 -l 2 -K 
> 20
> node0 reads some data: stress -d node4 -n 3000 -o read -i 5 -K 20
> h3. Examples:
> h5. A node going down increases performance:
> !node-down-increase-performance.png!
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> At 450s, I kill -9 one of the nodes. There is a brief decrease in performance 
> as the snitch adapts, but then it recovers... to even higher performance than 
> before.
> h5. Compactions make nodes permanently slow:
> !compaction-makes-slow.png!
> !compaction-makes-slow-stats.png!
> The green and orange lines represent trials with eager retry enabled, they 
> never recover their op-rate from before the compaction as the red and blue 
> lines do.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. Speculative Read tends to lessen the *immediate* impact:
> !eager-read-looks-promising.png!
> !eager-read-looks-promising-stats.png!
> This graph looked the most promising to me, the two trials with eager retry, 
> the green and orange line, at 450s showed the smallest dip in performance. 
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. But not always:
> !eager-read-not-consistent.png!
> !eager-read-not-consistent-stats.png!
> This is a retrial with the same settings as above, yet the 95percentile eager 
> retry (red line) did poorly this time at 450s.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779047#comment-13779047
 ] 

Alex Liu commented on CASSANDRA-6102:
-

[~jalkanen] Can you try it with CqlStorage, which should work? We recommend using 
CqlStorage unless you have to use CassandraStorage. There is an issue where 
CassandraStorage can't get the right validator type for the columns based on the 
system tables.

We may need to fall back to the Thrift API to get the metadata for COMPACT 
STORAGE CQL tables.

> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> (testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion 
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.
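The garbled values are consistent with raw serialized bytes being rendered as text: a bigint 65 is eight big-endian bytes ending in 0x41 ('A'), and an int 10 ends in 0x0A (newline), which matches the "(value,A)" and the line break after "(ivalue," in the dumps above. A standalone illustration (my own sketch, not the CassandraStorage code path):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class AsciiMisreadDemo {
    // Render a serialized bigint's raw big-endian bytes as ASCII text,
    // reproducing what a text-typed decode of a numeric column looks like.
    static String misreadBigint(long value) {
        byte[] raw = ByteBuffer.allocate(8).putLong(value).array();
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Leading NUL bytes are stripped by trim(); the visible remainder
        // is 'A', exactly the "(value,A)" shown in the Pig dump.
        System.out.println(misreadBigint(65L).trim()); // A
    }
}
```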



[jira] [Updated] (CASSANDRA-5730) Re-add ScrubTest post single-pass compaction

2013-09-26 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-5730:
---

Attachment: 0001-Revive-and-update-ScrubTest.patch

Patch 0001 (and 
[branch|https://github.com/thobbs/cassandra/tree/CASSANDRA-5730]) brings back 
ScrubTest with updated SSTables and test code (only testScrubOutOfOrder really 
required any changes).

> Re-add ScrubTest post single-pass compaction
> 
>
> Key: CASSANDRA-5730
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5730
> Project: Cassandra
>  Issue Type: Test
>Reporter: Sylvain Lebresne
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.0.2
>
> Attachments: 0001-Revive-and-update-ScrubTest.patch
>
>
> Follow up of CASSANDRA-5429 for adding back a ScrubTest.



[jira] [Commented] (CASSANDRA-6103) ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap

2013-09-26 Thread Mike Schrag (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779053#comment-13779053
 ] 

Mike Schrag commented on CASSANDRA-6103:


Just to provide as much info as I can, the entire cluster was stopped. I then 
ran a script to start all of them back up at the same time. This particular 
cluster is 44 nodes. All 43 other nodes started fine. This one node died at 
startup with this exception. After a couple minutes I restarted the one node, 
and it came up fine. Priority "Major" is maybe a little aggressive for this one 
:)

> ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap
> --
>
> Key: CASSANDRA-6103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6103
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Mike Schrag
> Fix For: 1.2.11
>
>
> This isn't reproducible for me, but it happened to one of the servers in our 
> cluster while starting up. It went away on a restart, but I figured it was 
> worth filing anyway:
> ERROR [main] 2013-09-26 08:04:02,478 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
> at java.util.HashMap$EntryIterator.next(HashMap.java:834)
> at java.util.HashMap$EntryIterator.next(HashMap.java:832)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:294)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:286)
> at 
> com.google.common.collect.AbstractBiMap.putAll(AbstractBiMap.java:160)
> at com.google.common.collect.HashBiMap.putAll(HashBiMap.java:42)
> at com.google.common.collect.HashBiMap.create(HashBiMap.java:72)
> at 
> org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:561)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getAddressRanges(AbstractReplicationStrategy.java:192)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1711)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1692)
> at 
> org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1461)
> at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1228)
> at 
> org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:949)
> at 
> org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1116)
> at 
> org.apache.cassandra.service.StorageService.setTokens(StorageService.java:214)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:802)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:554)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:451)



[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-09-26 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2c7b61b7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2c7b61b7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2c7b61b7

Branch: refs/heads/trunk
Commit: 2c7b61b76ec034afe4267fdaecd0905db16b40eb
Parents: 5832cc8 d493030
Author: Jonathan Ellis 
Authored: Thu Sep 26 13:20:25 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 13:20:25 2013 -0500

--
 .../Keyspace1-Standard3-ja-1-CRC.db | Bin 8 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Data.db| Bin 354 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Digest.sha1|   1 -
 .../Keyspace1-Standard3-ja-1-Filter.db  | Bin 176 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Index.db   | Bin 90 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Statistics.db  | Bin 4377 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Summary.db | Bin 83 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-TOC.txt|   8 -
 .../Keyspace1-Standard3-jb-1-CompressionInfo.db | Bin 0 -> 43 bytes
 .../Keyspace1-Standard3-jb-1-Data.db| Bin 0 -> 133 bytes
 .../Keyspace1-Standard3-jb-1-Filter.db  | Bin 0 -> 24 bytes
 .../Keyspace1-Standard3-jb-1-Index.db   | Bin 0 -> 90 bytes
 .../Keyspace1-Standard3-jb-1-Statistics.db  | Bin 0 -> 4390 bytes
 .../Keyspace1-Standard3-jb-1-Summary.db | Bin 0 -> 71 bytes
 .../Keyspace1-Standard3-jb-1-TOC.txt|   7 +
 .../unit/org/apache/cassandra/db/ScrubTest.java | 206 +++
 16 files changed, 213 insertions(+), 9 deletions(-)
--




[2/3] git commit: add back ScrubTest patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-5730

2013-09-26 Thread jbellis
add back ScrubTest
patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-5730


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d4930307
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d4930307
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d4930307

Branch: refs/heads/trunk
Commit: d49303078459b0b2e0d40d9f79660ccdc0dc1fe0
Parents: b5c23cf
Author: Jonathan Ellis 
Authored: Thu Sep 26 13:18:02 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 13:19:04 2013 -0500

--
 .../Keyspace1-Standard3-ja-1-CRC.db | Bin 8 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Data.db| Bin 354 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Digest.sha1|   1 -
 .../Keyspace1-Standard3-ja-1-Filter.db  | Bin 176 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Index.db   | Bin 90 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Statistics.db  | Bin 4377 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Summary.db | Bin 83 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-TOC.txt|   8 -
 .../Keyspace1-Standard3-jb-1-CompressionInfo.db | Bin 0 -> 43 bytes
 .../Keyspace1-Standard3-jb-1-Data.db| Bin 0 -> 133 bytes
 .../Keyspace1-Standard3-jb-1-Filter.db  | Bin 0 -> 24 bytes
 .../Keyspace1-Standard3-jb-1-Index.db   | Bin 0 -> 90 bytes
 .../Keyspace1-Standard3-jb-1-Statistics.db  | Bin 0 -> 4390 bytes
 .../Keyspace1-Standard3-jb-1-Summary.db | Bin 0 -> 71 bytes

[1/3] git commit: add back ScrubTest patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-5730

2013-09-26 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 b5c23cf74 -> d49303078
  refs/heads/trunk 5832cc839 -> 2c7b61b76


add back ScrubTest
patch by Tyler Hobbs; reviewed by jbellis for CASSANDRA-5730


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d4930307
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d4930307
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d4930307

Branch: refs/heads/cassandra-2.0
Commit: d49303078459b0b2e0d40d9f79660ccdc0dc1fe0
Parents: b5c23cf
Author: Jonathan Ellis 
Authored: Thu Sep 26 13:18:02 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 13:19:04 2013 -0500

--
 .../Keyspace1-Standard3-ja-1-CRC.db | Bin 8 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Data.db| Bin 354 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Digest.sha1|   1 -
 .../Keyspace1-Standard3-ja-1-Filter.db  | Bin 176 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Index.db   | Bin 90 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Statistics.db  | Bin 4377 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-Summary.db | Bin 83 -> 0 bytes
 .../Keyspace1-Standard3-ja-1-TOC.txt|   8 -
 .../Keyspace1-Standard3-jb-1-CompressionInfo.db | Bin 0 -> 43 bytes
 .../Keyspace1-Standard3-jb-1-Data.db| Bin 0 -> 133 bytes
 .../Keyspace1-Standard3-jb-1-Filter.db  | Bin 0 -> 24 bytes
 .../Keyspace1-Standard3-jb-1-Index.db   | Bin 0 -> 90 bytes
 .../Keyspace1-Standard3-jb-1-Statistics.db  | Bin 0 -> 4390 bytes
 .../Keyspace1-Standard3-jb-1-Summary.db | Bin 0 -> 71 bytes
 .../Keyspace1-Standard3-jb-1-TOC.txt|   7 +
 .../unit/org/apache/cassandra/db/ScrubTest.java | 206 +++
 16 files changed, 213 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-CRC.db
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-CRC.db 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-CRC.db
deleted file mode 100644
index 3cc23b0..000
Binary files a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-CRC.db and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Data.db
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Data.db 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Data.db
deleted file mode 100644
index 70e64e0..000
Binary files a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Data.db and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Digest.sha1
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Digest.sha1 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Digest.sha1
deleted file mode 100644
index c53d478..000
--- a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Digest.sha1
+++ /dev/null
@@ -1 +0,0 @@
-a9fbab0c12f097cfbf91a7b8731a20363daef547  Keyspace1-Standard3-ja-1-Data.db
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Filter.db
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Filter.db 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Filter.db
deleted file mode 100644
index df0734d..000
Binary files a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Filter.db 
and /dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Index.db
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Index.db 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Index.db
deleted file mode 100644
index 5da2914..000
Binary files a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Index.db and 
/dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d4930307/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Statistics.db
--
diff --git a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Statistics.db 
b/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Statistics.db
deleted file mode 100644
index 28e250b..000
Binary files 
a/test/data/corrupt-sstables/Keyspace1-Standard3-ja-1-Statistics.db and 
/dev/null differ

http://git-wi

[jira] [Updated] (CASSANDRA-5663) Add write batching for the native protocol

2013-09-26 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5663:
--

Labels: performance  (was: )

Daniel, can you update your write-batching code for trunk?

> Add write batching for the native protocol
> --
>
> Key: CASSANDRA-5663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5663
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Daniel Norberg
>Priority: Minor
>  Labels: performance
> Fix For: 2.1
>
>
> As discussed in CASSANDRA-5422, adding write batching to the native protocol 
> implementation is likely to improve throughput in a number of cases. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4191) Add `nodetool cfstats <keyspace> <cfname>` abilities

2013-09-26 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779069#comment-13779069
 ] 

Brandon Williams commented on CASSANDRA-4191:
-

There's a compiler warning for passing null varargs to printColumnFamilyStats 
in the CFSTATS case.  I *thought* we could just do away with the if statement 
there and pass empty arguments instead, but that breaks the bare, no-args 
invocation of cfstats.  It does seem like the cleanest way to solve this, though.
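For illustration, a minimal, self-contained sketch of the varargs ambiguity being described (VarargsDemo and requestedCount are hypothetical names, not the actual NodeCmd code): a bare null argument to a `String...` parameter could mean either "a null array" or "one null element", so javac warns; an explicit `(String[])` cast, or simply passing no arguments, avoids the warning.

```java
// Hypothetical sketch of the varargs warning, not the actual NodeCmd code.
public class VarargsDemo
{
    /** Mimics the shape of printColumnFamilyStats(PrintStream, String...):
     *  returns how many column family names were requested. */
    public static int requestedCount(String... cfnames)
    {
        return (cfnames == null || cfnames.length == 0) ? 0 : cfnames.length;
    }

    public static void main(String[] args)
    {
        int bare = requestedCount();                    // no-args call: empty array, length 0
        int explicit = requestedCount((String[]) null); // explicit cast silences the javac warning
        int two = requestedCount("Keyspace1", "Standard3");
        System.out.println(bare + " " + explicit + " " + two); // 0 0 2
    }
}
```

Calling `requestedCount(null)` without the cast is what triggers the "non-varargs call" warning, since the compiler cannot tell which interpretation was intended.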

> Add `nodetool cfstats <keyspace> <cfname>` abilities
> --
>
> Key: CASSANDRA-4191
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4191
> Project: Cassandra
>  Issue Type: New Feature
>Affects Versions: 1.2.0 beta 1
>Reporter: Joaquin Casares
>Assignee: Lyuben Todorov
>Priority: Minor
>  Labels: datastax_qa
> Fix For: 1.2.9
>
> Attachments: 4191.patch, 4191_specific_cfstats.diff, 4191_v3.patch
>
>
> This way cfstats will only print information per keyspace/column family 
> combinations.
> Another related proposal as an alternative to this ticket:
> Allow for `nodetool cfstats` to use --excludes or --includes to accept 
> keyspace and column family arguments.



[jira] [Updated] (CASSANDRA-6071) CqlStorage loading compact table adds an extraneous field to the pig schema

2013-09-26 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-6071:
---

Attachment: 6071-3.txt

attached rebased patch

> CqlStorage loading compact table adds an extraneous field to the pig schema
> ---
>
> Key: CASSANDRA-6071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Minor
> Fix For: 1.2.11
>
> Attachments: 6071-2.txt, 6071-3.txt, 6071.txt
>
>
> {code}
> CREATE TABLE t (
>   key text,
>   field1 int,
>   field2 int,
>   PRIMARY KEY (key, field1)
> ) WITH COMPACT STORAGE;
> INSERT INTO t (key,field1,field2) VALUES ('key1',1,2);
> INSERT INTO t (key,field1,field2) VALUES ('key2',1,2);
> INSERT INTO t (key,field1,field2) VALUES ('key3',1,2);
> {code}
> {code}
> grunt> t = LOAD 'cql://ks/t' USING CqlStorage();
> grunt> describe t; 
> t: {key: chararray,field1: int,field2: int,value: int}
> dump t;
> (key1,1,2,)
> (key3,1,2,)
> (key2,1,2,)
> {code}



[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779083#comment-13779083
 ] 

Alex Liu commented on CASSANDRA-6102:
-

I propose that we implement the following:
{code}
CqlStorage supports all kinds of tables/column families, including old Thrift
column families and new CQL tables with or without COMPACT STORAGE. (This is
already done.)

CassandraStorage supports only old Thrift column families PLUS CQL tables with
COMPACT STORAGE. It DOES NOT support other CQL tables. (I am changing code for
this.)
{code}

Any objections or thoughts?
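As a side note, the garbled values in the report quoted below are consistent with the raw big-endian bytes of the numbers being decoded as text instead of through IntegerType/LongType: int 10 is the byte for '\n' and long 65 ends in the byte for 'A'. A minimal sketch of that effect (illustrative only, not Cassandra's marshalling code; BytesAsAscii is a hypothetical name):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration (assumption, not Cassandra code): if the serialized big-endian
// bytes of an int/bigint are handed to an ASCII decoder, ivalue=10 renders as
// a newline and value=65 renders as 'A' -- matching the garbage in the report.
public class BytesAsAscii
{
    public static String intAsText(int v)
    {
        byte[] raw = ByteBuffer.allocate(4).putInt(v).array();
        // strip the leading NUL padding bytes for readability
        return new String(raw, StandardCharsets.US_ASCII).replace("\0", "");
    }

    public static String longAsText(long v)
    {
        byte[] raw = ByteBuffer.allocate(8).putLong(v).array();
        return new String(raw, StandardCharsets.US_ASCII).replace("\0", "");
    }

    public static void main(String[] args)
    {
        System.out.println("ivalue 10 -> [" + intAsText(10) + "]");  // prints a literal newline
        System.out.println("value  65 -> [" + longAsText(65) + "]"); // prints A
    }
}
```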

> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion 
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.



[5/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-09-26 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/006eec4a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/006eec4a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/006eec4a

Branch: refs/heads/cassandra-2.0
Commit: 006eec4a5dc76d79f3147ab1e1e78e17e304a88c
Parents: d493030 389ff55
Author: Brandon Williams 
Authored: Thu Sep 26 13:53:46 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:53:46 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index e66f585,09171a0..c9afff0
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@@ -700,12 -698,20 +700,12 @@@ public class CassandraStorage extends A
  
  /** get a list of column for the column family */
  protected List getColumnMetadata(Cassandra.Client client, 
boolean cql3Table) 
 -throws InvalidRequestException, 
 -UnavailableException, 
 -TimedOutException, 
 -SchemaDisagreementException, 
 -TException,
 -CharacterCodingException,
 -org.apache.cassandra.exceptions.InvalidRequestException,
 -ConfigurationException,
 -NotFoundException
 +throws TException, CharacterCodingException, InvalidRequestException, 
ConfigurationException
  {
  if (cql3Table)
 -return new ArrayList();
 +return new ArrayList<>();
  
- return getColumnMeta(client, true);
+ return getColumnMeta(client, true, true);
  }
  
  /** convert key to a tuple */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
index 86fe338,79abc2c..b96d10e
--- a/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
@@@ -23,6 -23,9 +23,8 @@@ import java.nio.charset.CharacterCoding
  import java.util.*;
  
  
+ import org.apache.cassandra.cql3.CFDefinition;
+ import org.apache.cassandra.cql3.ColumnIdentifier;
 -import org.apache.cassandra.db.IColumn;
  import org.apache.cassandra.db.Column;
  import org.apache.cassandra.db.marshal.*;
  import org.apache.cassandra.exceptions.ConfigurationException;



[2/6] git commit: Don't add extraneous field with CqlStorage Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071

2013-09-26 Thread brandonwilliams
Don't add extraneous field with CqlStorage
Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/389ff55e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/389ff55e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/389ff55e

Branch: refs/heads/cassandra-2.0
Commit: 389ff55e2bbc3046a6ad1aba85bdaab0e38dc6e8
Parents: 00e871d
Author: Brandon Williams 
Authored: Thu Sep 26 13:49:07 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:49:07 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/389ff55e/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 50671da..ce92014 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -641,7 +641,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 NotFoundException;
 
 /** get column meta data */
-protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage)
+protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage, boolean includeCompactValueColumn)
 throws InvalidRequestException,
 UnavailableException,
 TimedOutException,
@@ -666,9 +666,13 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 
 List rows = result.rows;
 List columnDefs = new ArrayList();
-if (!cassandraStorage && (rows == null || rows.isEmpty()))
+if (rows == null || rows.isEmpty())
 {
-// check classic thrift tables
+// if CassandraStorage, just return the empty list
+if (cassandraStorage)
+return columnDefs;
+
+// otherwise for CqlStorage, check metadata for classic thrift 
tables
 CFDefinition cfDefinition = getCfDefinition(keyspace, 
column_family, client);
 for (ColumnIdentifier column : cfDefinition.metadata.keySet())
 {
@@ -680,7 +684,9 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 cDef.validation_class = type;
 columnDefs.add(cDef);
 }
-if (columnDefs.size() == 0)
+// we may not need to include the value column for compact tables 
as we 
+// could have already processed it as 
schema_columnfamilies.value_alias
+if (columnDefs.size() == 0 && includeCompactValueColumn)
 {
 String value = cfDefinition.value != null ? 
cfDefinition.value.toString() : null;
 if ("value".equals(value))
@@ -693,8 +699,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return columnDefs;
 }
-else if (rows == null || rows.isEmpty())
-return columnDefs;
 
 Iterator iterator = rows.iterator();
 while (iterator.hasNext())
@@ -711,138 +715,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return columnDefs;
 }
 
-/** get keys meta data */
-protected List getKeysMeta(Cassandra.Client client)
-throws Exception
-{
-String query = "SELECT key_aliases, " +
-   "   column_aliases, " +
-   "   key_validator, " +
-   "   comparator, " +
-   "   keyspace_name, " +
-   "   value_alias, " +
-   "   default_validator " +
-   "FROM system.schema_columnfamilies " +
-   "WHERE keyspace_name = '%s'" +
-   "  AND columnfamily_name = '%s' ";
-
-CqlResult result = client.execute_cql3_query(
-
ByteBufferUtil.bytes(String.format(query, keyspace, column_family)),
-Compression.NONE,
-ConsistencyLevel.ONE);
-
-if (result == null || result.rows == null || result.rows.isEmpty())
-return null;
-

[1/6] git commit: Don't add extraneous field with CqlStorage Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071

2013-09-26 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.2 00e871d0f -> 389ff55e2
  refs/heads/cassandra-2.0 d49303078 -> 006eec4a5
  refs/heads/trunk 2c7b61b76 -> 246fefabf


Don't add extraneous field with CqlStorage
Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/389ff55e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/389ff55e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/389ff55e

Branch: refs/heads/cassandra-1.2
Commit: 389ff55e2bbc3046a6ad1aba85bdaab0e38dc6e8
Parents: 00e871d
Author: Brandon Williams 
Authored: Thu Sep 26 13:49:07 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:49:07 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/389ff55e/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 50671da..ce92014 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -641,7 +641,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 NotFoundException;
 
 /** get column meta data */
-protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage)
+protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage, boolean includeCompactValueColumn)
 throws InvalidRequestException,
 UnavailableException,
 TimedOutException,
@@ -666,9 +666,13 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 
 List rows = result.rows;
 List columnDefs = new ArrayList();
-if (!cassandraStorage && (rows == null || rows.isEmpty()))
+if (rows == null || rows.isEmpty())
 {
-// check classic thrift tables
+// if CassandraStorage, just return the empty list
+if (cassandraStorage)
+return columnDefs;
+
+// otherwise for CqlStorage, check metadata for classic thrift 
tables
 CFDefinition cfDefinition = getCfDefinition(keyspace, 
column_family, client);
 for (ColumnIdentifier column : cfDefinition.metadata.keySet())
 {
@@ -680,7 +684,9 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 cDef.validation_class = type;
 columnDefs.add(cDef);
 }
-if (columnDefs.size() == 0)
+// we may not need to include the value column for compact tables 
as we 
+// could have already processed it as 
schema_columnfamilies.value_alias
+if (columnDefs.size() == 0 && includeCompactValueColumn)
 {
 String value = cfDefinition.value != null ? 
cfDefinition.value.toString() : null;
 if ("value".equals(value))
@@ -693,8 +699,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return columnDefs;
 }
-else if (rows == null || rows.isEmpty())
-return columnDefs;
 
 Iterator iterator = rows.iterator();
 while (iterator.hasNext())
@@ -711,138 +715,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return columnDefs;
 }
 
-/** get keys meta data */
-protected List getKeysMeta(Cassandra.Client client)
-throws Exception
-{
-String query = "SELECT key_aliases, " +
-   "   column_aliases, " +
-   "   key_validator, " +
-   "   comparator, " +
-   "   keyspace_name, " +
-   "   value_alias, " +
-   "   default_validator " +
-   "FROM system.schema_columnfamilies " +
-   "WHERE keyspace_name = '%s'" +
-   "  AND columnfamily_name = '%s' ";
-
-CqlResult result = client.execute_cql3_query(
-
ByteBufferUtil.bytes(String.format(query, keyspace, column_family)),
-Compression.NONE,
-

[6/6] git commit: Merge branch 'cassandra-2.0' into trunk

2013-09-26 Thread brandonwilliams
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/246fefab
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/246fefab
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/246fefab

Branch: refs/heads/trunk
Commit: 246fefabfbae20b6c17a1898bb838d7463ccad0a
Parents: 2c7b61b 006eec4
Author: Brandon Williams 
Authored: Thu Sep 26 13:53:58 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:53:58 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--




[3/6] git commit: Don't add extraneous field with CqlStorage Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071

2013-09-26 Thread brandonwilliams
Don't add extraneous field with CqlStorage
Patch by Sam Tunnicliffe, reviewed by Alex Liu for CASSANDRA-6071


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/389ff55e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/389ff55e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/389ff55e

Branch: refs/heads/trunk
Commit: 389ff55e2bbc3046a6ad1aba85bdaab0e38dc6e8
Parents: 00e871d
Author: Brandon Williams 
Authored: Thu Sep 26 13:49:07 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:49:07 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/389ff55e/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--
diff --git 
a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java 
b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
index 50671da..ce92014 100644
--- a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
@@ -641,7 +641,7 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 NotFoundException;
 
 /** get column meta data */
-protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage)
+protected List getColumnMeta(Cassandra.Client client, boolean 
cassandraStorage, boolean includeCompactValueColumn)
 throws InvalidRequestException,
 UnavailableException,
 TimedOutException,
@@ -666,9 +666,13 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 
 List rows = result.rows;
 List columnDefs = new ArrayList();
-if (!cassandraStorage && (rows == null || rows.isEmpty()))
+if (rows == null || rows.isEmpty())
 {
-// check classic thrift tables
+// if CassandraStorage, just return the empty list
+if (cassandraStorage)
+return columnDefs;
+
+// otherwise for CqlStorage, check metadata for classic thrift 
tables
 CFDefinition cfDefinition = getCfDefinition(keyspace, 
column_family, client);
 for (ColumnIdentifier column : cfDefinition.metadata.keySet())
 {
@@ -680,7 +684,9 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 cDef.validation_class = type;
 columnDefs.add(cDef);
 }
-if (columnDefs.size() == 0)
+// we may not need to include the value column for compact tables 
as we 
+// could have already processed it as 
schema_columnfamilies.value_alias
+if (columnDefs.size() == 0 && includeCompactValueColumn)
 {
 String value = cfDefinition.value != null ? 
cfDefinition.value.toString() : null;
 if ("value".equals(value))
@@ -693,8 +699,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 }
 return columnDefs;
 }
-else if (rows == null || rows.isEmpty())
-return columnDefs;
 
 Iterator iterator = rows.iterator();
 while (iterator.hasNext())
@@ -711,138 +715,6 @@ public abstract class AbstractCassandraStorage extends 
LoadFunc implements Store
 return columnDefs;
 }
 
-/** get keys meta data */
-protected List getKeysMeta(Cassandra.Client client)
-throws Exception
-{
-String query = "SELECT key_aliases, " +
-   "   column_aliases, " +
-   "   key_validator, " +
-   "   comparator, " +
-   "   keyspace_name, " +
-   "   value_alias, " +
-   "   default_validator " +
-   "FROM system.schema_columnfamilies " +
-   "WHERE keyspace_name = '%s'" +
-   "  AND columnfamily_name = '%s' ";
-
-CqlResult result = client.execute_cql3_query(
-
ByteBufferUtil.bytes(String.format(query, keyspace, column_family)),
-Compression.NONE,
-ConsistencyLevel.ONE);
-
-if (result == null || result.rows == null || result.rows.isEmpty())
-return null;
-
-   

[jira] [Comment Edited] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779083#comment-13779083
 ] 

Alex Liu edited comment on CASSANDRA-6102 at 9/26/13 6:54 PM:
--

I propose that we implement the following
{code}
CqlStorage supports all kind of tables/column families including old thrift 
column families, 
new Cql tables with/without Compact storage. (this is already done)

CassandraStorage supports only old thrift column families PLUS Cql tables with 
Compact storage. 
It DOES NOT support other Cql tables. (I am changing code for this)
{code}

Any objection/thought?

  was (Author: alexliu68):
I propose that we implement the following
{code}
CqlStorage supports all kind of tables/column families including old thrift 
column families, new Cql tables with/without Compact storage. (this is already 
done)

CassandraStorage supports only old thrift column families PLUS Cql tables with 
Compact storage. It DOES NOT support other Cql tables. (I am changing code for 
this)
{code}

Any objection/thought?
  
> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion 
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.
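The symptom above is consistent with raw column bytes being decoded as text: 65 is the ASCII code for 'A', and 10 is a line feed, which matches the {{(value,A)}} tuple and the broken-line {{ivalue}} in the dump. A standalone sketch (hypothetical illustration, not CassandraStorage code) of the two decodings of the same 8 bytes:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class IntAsAsciiDemo
{
    // Serialize a bigint the way Cassandra stores it: 8-byte big-endian.
    static ByteBuffer bigintBytes(long v)
    {
        return (ByteBuffer) ByteBuffer.allocate(8).putLong(v).flip();
    }

    public static void main(String[] args)
    {
        ByteBuffer raw = bigintBytes(65);

        // Correct decoding: interpret the 8 bytes as a long.
        long asLong = raw.duplicate().getLong();

        // Buggy decoding: read the bytes as ASCII text and drop the leading
        // zero bytes, which turns the bigint 65 into the character 'A'.
        byte[] bytes = new byte[8];
        raw.duplicate().get(bytes);
        String asText = new String(bytes, StandardCharsets.US_ASCII).replace("\0", "");

        System.out.println(asLong); // prints 65
        System.out.println(asText); // prints A
    }
}
```

The same mapping explains {{ivalue}}: 10 decoded as ASCII is a newline, which is exactly where the dump output breaks mid-tuple.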

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter

2013-09-26 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779098#comment-13779098
 ] 

Adrien Grand commented on CASSANDRA-4338:
-

Interesting, I was wondering whether people actually need to compress from/to 
byte buffers. Now that I know that some do, I can try to move this issue 
forward.

> Experiment with direct buffer in SequentialWriter
> -
>
> Key: CASSANDRA-4338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Marcus Eriksson
>Priority: Minor
>  Labels: performance
> Fix For: 2.1
>
> Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png
>
>
> Using a direct buffer instead of a heap-based byte[] should let us avoid a 
> copy into native memory when we flush the buffer.
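The copy referred to above comes from how NIO channels handle heap memory: when a heap buffer is written to a {{FileChannel}}, the JDK first copies its contents into an internal native (direct) buffer before issuing the OS write, while a direct buffer is handed to the OS as-is. A standalone sketch of the two flush paths using plain JDK NIO (an illustration of the general mechanism, not SequentialWriter itself):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBufferFlush
{
    public static void main(String[] args) throws IOException
    {
        Path tmp = Files.createTempFile("seqwriter", ".bin");

        // Heap buffer: FileChannel.write() copies its contents into a
        // temporary native buffer before the actual write syscall.
        ByteBuffer heap = ByteBuffer.allocate(64 * 1024);

        // Direct buffer: already lives in native memory, so the channel
        // can pass its address straight to the OS -- no intermediate copy.
        ByteBuffer direct = ByteBuffer.allocateDirect(64 * 1024);

        for (ByteBuffer buffer : new ByteBuffer[]{ heap, direct })
        {
            buffer.clear();
            while (buffer.hasRemaining())
                buffer.put((byte) 1);
            buffer.flip();

            try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE))
            {
                // The extra copy happens inside this call, but only for the heap buffer.
                channel.write(buffer);
            }
        }

        Files.delete(tmp);
    }
}
```

The trade-off is that direct buffers are slower to allocate and are not tracked by the GC heap, which is why they suit a long-lived writer buffer like SequentialWriter's.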



[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779106#comment-13779106
 ] 

Brandon Williams commented on CASSANDRA-6102:
-

I'm mostly ok with it, except I really want to get away from the cli for 
CASSANDRA-5709, and long-term I think we need to.  On the other hand, if we had 
CASSANDRA-5695 we could just get rid of those tests, so I'm still undecided.  
Can you explain what the problem is here?

> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion:
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.



[4/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-09-26 Thread brandonwilliams
Merge branch 'cassandra-1.2' into cassandra-2.0

Conflicts:
src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/006eec4a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/006eec4a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/006eec4a

Branch: refs/heads/trunk
Commit: 006eec4a5dc76d79f3147ab1e1e78e17e304a88c
Parents: d493030 389ff55
Author: Brandon Williams 
Authored: Thu Sep 26 13:53:46 2013 -0500
Committer: Brandon Williams 
Committed: Thu Sep 26 13:53:46 2013 -0500

--
 .../hadoop/pig/AbstractCassandraStorage.java| 151 ++-
 .../cassandra/hadoop/pig/CassandraStorage.java  |   2 +-
 .../apache/cassandra/hadoop/pig/CqlStorage.java | 144 +-
 3 files changed, 153 insertions(+), 144 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
index e66f585,09171a0..c9afff0
--- a/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
@@@ -700,12 -698,20 +700,12 @@@ public class CassandraStorage extends A
  
      /** get a list of column for the column family */
      protected List<ColumnDef> getColumnMetadata(Cassandra.Client client, boolean cql3Table)
 -        throws InvalidRequestException,
 -               UnavailableException,
 -               TimedOutException,
 -               SchemaDisagreementException,
 -               TException,
 -               CharacterCodingException,
 -               org.apache.cassandra.exceptions.InvalidRequestException,
 -               ConfigurationException,
 -               NotFoundException
 +        throws TException, CharacterCodingException, InvalidRequestException, ConfigurationException
      {
          if (cql3Table)
 -            return new ArrayList<ColumnDef>();
 +            return new ArrayList<>();
  
-         return getColumnMeta(client, true);
+         return getColumnMeta(client, true, true);
  }
  
  /** convert key to a tuple */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/006eec4a/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
--
diff --cc src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
index 86fe338,79abc2c..b96d10e
--- a/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
+++ b/src/java/org/apache/cassandra/hadoop/pig/CqlStorage.java
@@@ -23,6 -23,9 +23,8 @@@ import java.nio.charset.CharacterCoding
  import java.util.*;
  
  
+ import org.apache.cassandra.cql3.CFDefinition;
+ import org.apache.cassandra.cql3.ColumnIdentifier;
 -import org.apache.cassandra.db.IColumn;
  import org.apache.cassandra.db.Column;
  import org.apache.cassandra.db.marshal.*;
  import org.apache.cassandra.exceptions.ConfigurationException;



[jira] [Commented] (CASSANDRA-6103) ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap

2013-09-26 Thread Mikhail Stepura (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779115#comment-13779115
 ] 

Mikhail Stepura commented on CASSANDRA-6103:


Off the top of my head: 
{{org.apache.cassandra.locator.TokenMetadata.updateHostId(UUID, InetAddress)}} 
calls {{endpointToHostIdMap.forcePut(endpoint, hostId);}} but doesn't acquire 
the write lock before doing so.
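That hypothesis fits the stack trace: {{cloneOnlyTokenMap}} copies the bi-map while an unguarded writer mutates it. A minimal, hypothetical sketch (plain {{HashMap}} and {{String}} keys standing in for the Guava {{HashBiMap}} and {{InetAddress}}) of both the failure mode and the read/write-lock pattern the other TokenMetadata mutators use:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardedClone
{
    private final Map<String, String> endpointToHostId = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // The suggested fix: take the write lock before mutating the map.
    void updateHostId(String endpoint, String hostId)
    {
        lock.writeLock().lock();
        try
        {
            endpointToHostId.put(endpoint, hostId);
        }
        finally
        {
            lock.writeLock().unlock();
        }
    }

    // cloneOnlyTokenMap-style copy: safe only while holding the read lock.
    Map<String, String> cloneMap()
    {
        lock.readLock().lock();
        try
        {
            return new HashMap<>(endpointToHostId);
        }
        finally
        {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args)
    {
        // Single-threaded demonstration of the failure mode: structurally
        // modifying a HashMap while iterating it (as a copy constructor
        // does) fails fast with ConcurrentModificationException.
        Map<String, String> m = new HashMap<>();
        m.put("10.0.0.1", "id-1");
        m.put("10.0.0.2", "id-2");
        try
        {
            for (Map.Entry<String, String> e : m.entrySet())
                m.put("10.0.0.3", "id-3"); // an unguarded forcePut-style write
        }
        catch (ConcurrentModificationException expected)
        {
            System.out.println("CME, as in the reported stack trace");
        }

        GuardedClone tm = new GuardedClone();
        tm.updateHostId("10.0.0.1", "id-1");
        System.out.println(tm.cloneMap().size()); // prints 1
    }
}
```

With the write lock held for every mutation, a concurrent {{cloneMap()}} either sees the state before the write or after it, never a half-modified map.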

> ConcurrentModificationException in TokenMetadata.cloneOnlyTokenMap
> --
>
> Key: CASSANDRA-6103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6103
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Mike Schrag
> Fix For: 1.2.11
>
>
> This isn't reproducible for me, but it happened to one of the servers in our 
> cluster while starting up. It went away on a restart, but I figured it was 
> worth filing anyway:
> ERROR [main] 2013-09-26 08:04:02,478 CassandraDaemon.java (line 464) 
> Exception encountered during startup
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
> at java.util.HashMap$EntryIterator.next(HashMap.java:834)
> at java.util.HashMap$EntryIterator.next(HashMap.java:832)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:294)
> at 
> com.google.common.collect.AbstractBiMap$EntrySet$1.next(AbstractBiMap.java:286)
> at 
> com.google.common.collect.AbstractBiMap.putAll(AbstractBiMap.java:160)
> at com.google.common.collect.HashBiMap.putAll(HashBiMap.java:42)
> at com.google.common.collect.HashBiMap.create(HashBiMap.java:72)
> at 
> org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:561)
> at 
> org.apache.cassandra.locator.AbstractReplicationStrategy.getAddressRanges(AbstractReplicationStrategy.java:192)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1711)
> at 
> org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1692)
> at 
> org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:1461)
> at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1228)
> at 
> org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:949)
> at 
> org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:1116)
> at 
> org.apache.cassandra.service.StorageService.setTokens(StorageService.java:214)
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:802)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:554)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:451)



[jira] [Commented] (CASSANDRA-5515) Track sstable coldness

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779119#comment-13779119
 ] 

Jonathan Ellis commented on CASSANDRA-5515:
---

- Could we put the clearSSTableReadMeter in CFS or SSTR?  Or put it in 
SSTableDeletingTask instead of making it a notification at all.  Alternatively, 
since it's just one row I'd lean towards just letting TTL take care of it.
- Does TTL need to be so long if we're persisting every 5m?
- It looks like having the increments in the read section of the iterators 
means we only increment if the index lookup ({{getPosition}}) is successful.  
IMO we should increment before getPosition.  It may be cleaner to do this in 
CollationController, but the iterator constructor also works.
- Use {{Keyspace.SYSTEM_KS}} instead of hardcoding {{"system"}}.
- What can we do to restore coldness data from a snapshot?

> Track sstable coldness
> --
>
> Key: CASSANDRA-5515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5515
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Tyler Hobbs
> Fix For: 2.0.2
>
> Attachments: 0001-Track-row-read-counts-in-SSTR.patch, 5515-2.0-v1.txt
>
>
> Keeping a count of reads per-sstable would allow STCS to automatically ignore 
> cold data rather than recompacting it constantly with hot data, dramatically 
> reducing compaction load for typical time series applications and others with 
> time-correlated access patterns.  We would not need a separate age-tiered 
> compaction strategy.
> (This will really be useful in conjunction with CASSANDRA-5514.)



[jira] [Commented] (CASSANDRA-5515) Track sstable coldness

2013-09-26 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779147#comment-13779147
 ] 

Tyler Hobbs commented on CASSANDRA-5515:


bq. Could we put the clearSSTableReadMeter in CFS or SSTR? Or put it in 
SSTableDeletingTask instead of making it a notification at all. Alternatively, 
since it's just one row I'd lean towards just letting TTL take care of it.

Going with SSTR would result in duplicate notifications.  I think 
SSTableDeletingTask is a good spot, I just wasn't sure if that would be 
appropriate.  If the TTL was short (say, less than 1 day), I think that would 
be okay, but...

bq. Does TTL need to be so long if we're persisting every 5m?

I considered making it smaller, but in the case where a node goes down for some 
amount of time, it would be nice to not lose all of the stats when it comes 
back up.

bq. It looks like having the increments in the read section of the iterators 
means we only increment if the index lookup (getPosition) is successful. IMO we 
should increment before getPosition. May be cleaner to do this in 
CollationController but iterator constructor also works.

Just to check: a getPosition() failure indicates a BF false positive, but we 
want to include those in the count?  I can see this making sense for managing 
in-memory index summary sizes, but maybe not for STCS optimization.  (It should 
be a small enough number not to matter much either way.)

bq. What can we do to restore coldness data from a snapshot?

Not much with the current storage strategy.  Do we have any 
workarounds/suggestions for restoring TTLed data in general?

> Track sstable coldness
> --
>
> Key: CASSANDRA-5515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5515
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Tyler Hobbs
> Fix For: 2.0.2
>
> Attachments: 0001-Track-row-read-counts-in-SSTR.patch, 5515-2.0-v1.txt
>
>
> Keeping a count of reads per-sstable would allow STCS to automatically ignore 
> cold data rather than recompacting it constantly with hot data, dramatically 
> reducing compaction load for typical time series applications and others with 
> time-correlated access patterns.  We would not need a separate age-tiered 
> compaction strategy.
> (This will really be useful in conjunction with CASSANDRA-5514.)



[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779192#comment-13779192
 ] 

Aleksey Yeschenko commented on CASSANDRA-5932:
--

+1, I'm out of OCD juice.

> Speculative read performance data show unexpected results
> -
>
> Key: CASSANDRA-5932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Assignee: Aleksey Yeschenko
> Fix For: 2.0.2
>
> Attachments: 5932.txt, compaction-makes-slow.png, 
> compaction-makes-slow-stats.png, eager-read-looks-promising.png, 
> eager-read-looks-promising-stats.png, eager-read-not-consistent.png, 
> eager-read-not-consistent-stats.png, node-down-increase-performance.png
>
>
> I've done a series of stress tests with eager retries enabled that show 
> undesirable behavior. I'm grouping these behaviours into one ticket as they 
> are most likely related.
> 1) Killing off a node in a 4 node cluster actually increases performance.
> 2) Compactions make nodes slow, even after the compaction is done.
> 3) Eager Reads tend to lessen the *immediate* performance impact of a node 
> going down, but not consistently.
> My Environment:
> 1 stress machine: node0
> 4 C* nodes: node4, node5, node6, node7
> My script:
> node0 writes some data: stress -d node4 -F 3000 -n 3000 -i 5 -l 2 -K 
> 20
> node0 reads some data: stress -d node4 -n 3000 -o read -i 5 -K 20
> h3. Examples:
> h5. A node going down increases performance:
> !node-down-increase-performance.png!
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> At 450s, I kill -9 one of the nodes. There is a brief decrease in performance 
> as the snitch adapts, but then it recovers... to even higher performance than 
> before.
> h5. Compactions make nodes permanently slow:
> !compaction-makes-slow.png!
> !compaction-makes-slow-stats.png!
> The green and orange lines represent trials with eager retry enabled, they 
> never recover their op-rate from before the compaction as the red and blue 
> lines do.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. Speculative Read tends to lessen the *immediate* impact:
> !eager-read-looks-promising.png!
> !eager-read-looks-promising-stats.png!
> This graph looked the most promising to me, the two trials with eager retry, 
> the green and orange line, at 450s showed the smallest dip in performance. 
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. But not always:
> !eager-read-not-consistent.png!
> !eager-read-not-consistent-stats.png!
> This is a retrial with the same settings as above, yet the 95percentile eager 
> retry (red line) did poorly this time at 450s.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]



[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-09-26 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/84f6c26a
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/84f6c26a
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/84f6c26a

Branch: refs/heads/trunk
Commit: 84f6c26aaf7cba286f8b98a02cf408fa5bb2131a
Parents: 246fefa 20c419b
Author: Jonathan Ellis 
Authored: Thu Sep 26 15:38:44 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 15:38:44 2013 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/service/AbstractReadExecutor.java | 338 ---
 .../apache/cassandra/service/ReadCallback.java  |   2 +-
 .../apache/cassandra/service/StorageProxy.java  |  30 +-
 4 files changed, 226 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/84f6c26a/CHANGES.txt
--
diff --cc CHANGES.txt
index d2b1310,3d4d19c..d475170
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,12 -1,5 +1,13 @@@
 +2.1
 + * change logging from log4j to logback (CASSANDRA-5883)
 + * switch to LZ4 compression for internode communication (CASSANDRA-5887)
 + * Stop using Thrift-generated Index* classes internally (CASSANDRA-5971)
 + * Remove 1.2 network compatibility code (CASSANDRA-5960)
 + * Remove leveled json manifest migration code (CASSANDRA-5996)
 +
 +
  2.0.2
+  * Fixes for speculative retry (CASSANDRA-5932)
   * Improve memory usage of metadata min/max column names (CASSANDRA-6077)
   * Fix thrift validation refusing row markers on CQL3 tables (CASSANDRA-6081)
   * Fix insertion of collections with CAS (CASSANDRA-6069)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/84f6c26a/src/java/org/apache/cassandra/service/StorageProxy.java
--



[2/3] git commit: Fixes for speculative retry patch by ayeschenko and jbellis for CASSANDRA-5932

2013-09-26 Thread jbellis
Fixes for speculative retry
patch by ayeschenko and jbellis for CASSANDRA-5932


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20c419b9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20c419b9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20c419b9

Branch: refs/heads/trunk
Commit: 20c419b9480e0e5b3c1da53a106b2a6760be35b9
Parents: 006eec4
Author: Jonathan Ellis 
Authored: Thu Sep 26 15:38:13 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 15:38:36 2013 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/service/AbstractReadExecutor.java | 338 ---
 .../apache/cassandra/service/ReadCallback.java  |   2 +-
 .../apache/cassandra/service/StorageProxy.java  |  30 +-
 4 files changed, 226 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c419b9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cc3daf6..3d4d19c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fixes for speculative retry (CASSANDRA-5932)
  * Improve memory usage of metadata min/max column names (CASSANDRA-6077)
  * Fix thrift validation refusing row markers on CQL3 tables (CASSANDRA-6081)
  * Fix insertion of collections with CAS (CASSANDRA-6069)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c419b9/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java 
b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
index 2ebc0b3..83368c2 100644
--- a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
+++ b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
@@ -18,12 +18,17 @@
 package org.apache.cassandra.service;
 
 import java.net.InetAddress;
+import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.TimeUnit;
 
+import com.google.common.collect.Iterables;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.cassandra.concurrent.Stage;
 import org.apache.cassandra.concurrent.StageManager;
-import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.CFMetaData.SpeculativeRetry.RetryType;
 import org.apache.cassandra.config.Schema;
 import org.apache.cassandra.config.ReadRepairDecision;
 import org.apache.cassandra.db.ColumnFamilyStore;
@@ -39,142 +44,225 @@ import org.apache.cassandra.net.MessageOut;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.service.StorageProxy.LocalReadRunnable;
 import org.apache.cassandra.utils.FBUtilities;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
+/**
+ * Sends a read request to the replicas needed to satisfy a given ConsistencyLevel.
+ *
+ * Optionally, may perform additional requests to provide redundancy against replica failure:
+ * AlwaysSpeculatingReadExecutor will always send a request to one extra replica, while
+ * SpeculatingReadExecutor will wait until it looks like the original request is in danger
+ * of timing out before performing extra reads.
+ */
 public abstract class AbstractReadExecutor
 {
     private static final Logger logger = LoggerFactory.getLogger(AbstractReadExecutor.class);
-    protected final ReadCallback<ReadResponse, Row> handler;
+
     protected final ReadCommand command;
+    protected final List<InetAddress> targetReplicas;
     protected final RowDigestResolver resolver;
-    protected final List<InetAddress> unfiltered;
-    protected final List<InetAddress> endpoints;
-    protected final ColumnFamilyStore cfs;
-
-    AbstractReadExecutor(ColumnFamilyStore cfs,
-                         ReadCommand command,
-                         ConsistencyLevel consistency_level,
-                         List<InetAddress> allReplicas,
-                         List<InetAddress> queryTargets)
-    throws UnavailableException
+    protected final ReadCallback<ReadResponse, Row> handler;
+
+    AbstractReadExecutor(ReadCommand command, ConsistencyLevel consistencyLevel, List<InetAddress> targetReplicas)
     {
-        unfiltered = allReplicas;
-        this.endpoints = queryTargets;
-        this.resolver = new RowDigestResolver(command.ksName, command.key);
-        this.handler = new ReadCallback<ReadResponse, Row>(resolver, consistency_level, command, this.endpoints);
         this.command = command;
-        this.cfs = cfs;
+        this.targetReplicas = targetReplicas;
+        resolver = new RowDigestResolver(command.ksName, command.key);
+        handler = new ReadCallback<>(resolver, consistencyLevel, command, targetReplicas);
+}
 
-handler.assureSufficientLiveNodes();
-assert !handler.endpoints.isEmpty();
+private static boolean isLoc

[1/3] git commit: Fixes for speculative retry patch by ayeschenko and jbellis for CASSANDRA-5932

2013-09-26 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 006eec4a5 -> 20c419b94
  refs/heads/trunk 246fefabf -> 84f6c26aa


Fixes for speculative retry
patch by ayeschenko and jbellis for CASSANDRA-5932


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20c419b9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20c419b9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20c419b9

Branch: refs/heads/cassandra-2.0
Commit: 20c419b9480e0e5b3c1da53a106b2a6760be35b9
Parents: 006eec4
Author: Jonathan Ellis 
Authored: Thu Sep 26 15:38:13 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 15:38:36 2013 -0500

--
 CHANGES.txt |   1 +
 .../cassandra/service/AbstractReadExecutor.java | 338 ---
 .../apache/cassandra/service/ReadCallback.java  |   2 +-
 .../apache/cassandra/service/StorageProxy.java  |  30 +-
 4 files changed, 226 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c419b9/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cc3daf6..3d4d19c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.2
+ * Fixes for speculative retry (CASSANDRA-5932)
  * Improve memory usage of metadata min/max column names (CASSANDRA-6077)
  * Fix thrift validation refusing row markers on CQL3 tables (CASSANDRA-6081)
  * Fix insertion of collections with CAS (CASSANDRA-6069)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20c419b9/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java 
b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
index 2ebc0b3..83368c2 100644
--- a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
+++ b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
@@ -18,12 +18,17 @@
 package org.apache.cassandra.service;
 
 import java.net.InetAddress;
+import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.TimeUnit;
 
+import com.google.common.collect.Iterables;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
 import org.apache.cassandra.concurrent.Stage;
 import org.apache.cassandra.concurrent.StageManager;
-import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.CFMetaData.SpeculativeRetry.RetryType;
 import org.apache.cassandra.config.Schema;
 import org.apache.cassandra.config.ReadRepairDecision;
 import org.apache.cassandra.db.ColumnFamilyStore;
@@ -39,142 +44,225 @@ import org.apache.cassandra.net.MessageOut;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.service.StorageProxy.LocalReadRunnable;
 import org.apache.cassandra.utils.FBUtilities;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
+/**
+ * Sends a read request to the replicas needed to satisfy a given ConsistencyLevel.
+ *
+ * Optionally, may perform additional requests to provide redundancy against replica failure:
+ * AlwaysSpeculatingReadExecutor will always send a request to one extra replica, while
+ * SpeculatingReadExecutor will wait until it looks like the original request is in danger
+ * of timing out before performing extra reads.
+ */
 public abstract class AbstractReadExecutor
 {
     private static final Logger logger = LoggerFactory.getLogger(AbstractReadExecutor.class);
-    protected final ReadCallback<ReadResponse, Row> handler;
+
     protected final ReadCommand command;
+    protected final List<InetAddress> targetReplicas;
     protected final RowDigestResolver resolver;
-    protected final List<InetAddress> unfiltered;
-    protected final List<InetAddress> endpoints;
-    protected final ColumnFamilyStore cfs;
-
-    AbstractReadExecutor(ColumnFamilyStore cfs,
-                         ReadCommand command,
-                         ConsistencyLevel consistency_level,
-                         List<InetAddress> allReplicas,
-                         List<InetAddress> queryTargets)
-    throws UnavailableException
+    protected final ReadCallback<ReadResponse, Row> handler;
+
+    AbstractReadExecutor(ReadCommand command, ConsistencyLevel consistencyLevel, List<InetAddress> targetReplicas)
     {
-        unfiltered = allReplicas;
-        this.endpoints = queryTargets;
-        this.resolver = new RowDigestResolver(command.ksName, command.key);
-        this.handler = new ReadCallback<ReadResponse, Row>(resolver, consistency_level, command, this.endpoints);
         this.command = command;
-        this.cfs = cfs;
+        this.targetReplicas = targetReplicas;
+        resolver = new RowDigestResolver(command.ksName, command.key);
+        handler = new ReadCallback<>(resolver, consistencyLevel, command, targetReplicas);
+}
 
-

[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Li Zou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779202#comment-13779202
 ] 

Li Zou commented on CASSANDRA-5932:
---

Hello [~iamaleksey] and [~jbellis],

I took a quick look at the code changes. The new code looks very good to me, 
but I see one potential issue in 
{{AlwaysSpeculatingReadExecutor.executeAsync()}}: it always makes at least 
*two* data / digest requests. This will cause problems for a data center with 
only one Cassandra server node (e.g. bringing up an embedded Cassandra node in 
a JVM for a JUnit test), or for a production data center of two Cassandra 
server nodes with one node shut down for maintenance. In both of these cases, 
{{AbstractReadExecutor.getReadExecutor()}} will return 
{{AlwaysSpeculatingReadExecutor}} because the condition 
{{(targetReplicas.size() == allReplicas.size())}} is met, whether or not the 
tables are configured with ??Speculative ALWAYS??.

For what it's worth, we are considering deploying each data center for our 
legacy products with only two Cassandra server nodes, with RF = 2 and CL = 1.
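The scenario reduces to the selection condition quoted above. A simplified, hypothetical paraphrase of the executor choice (class and method names follow the patch, but the logic here is only the single condition described in the comment, not the full implementation):

```java
import java.util.Arrays;
import java.util.List;

public class ExecutorChoice
{
    enum RetryType { NONE, CUSTOM, PERCENTILE, ALWAYS }

    // Paraphrase of the relevant branch of AbstractReadExecutor.getReadExecutor():
    // when every replica is already a read target, the "always speculate" path is
    // chosen even if the table is not configured with speculative_retry = ALWAYS.
    static String choose(RetryType retry, List<String> allReplicas, List<String> targetReplicas)
    {
        if (retry == RetryType.ALWAYS || targetReplicas.size() == allReplicas.size())
            return "AlwaysSpeculatingReadExecutor";
        if (retry == RetryType.NONE)
            return "NeverSpeculatingReadExecutor";
        return "SpeculatingReadExecutor";
    }

    public static void main(String[] args)
    {
        // Single-node data center (RF = 1): targets == all replicas, so the
        // ALWAYS executor is picked regardless of the table's retry setting,
        // and it will then try to issue at least two requests.
        System.out.println(choose(RetryType.NONE,
                                  Arrays.asList("node1"),
                                  Arrays.asList("node1")));
    }
}
```

Under this reading, the one-node and two-nodes-with-one-down deployments both hit the first branch with no spare replica to speculate against.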


> Speculative read performance data show unexpected results
> -
>
> Key: CASSANDRA-5932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5932
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan McGuire
>Assignee: Aleksey Yeschenko
> Fix For: 2.0.2
>
> Attachments: 5932.txt, compaction-makes-slow.png, 
> compaction-makes-slow-stats.png, eager-read-looks-promising.png, 
> eager-read-looks-promising-stats.png, eager-read-not-consistent.png, 
> eager-read-not-consistent-stats.png, node-down-increase-performance.png
>
>
> I've done a series of stress tests with eager retries enabled that show 
> undesirable behavior. I'm grouping these behaviours into one ticket as they 
> are most likely related.
> 1) Killing off a node in a 4 node cluster actually increases performance.
> 2) Compactions make nodes slow, even after the compaction is done.
> 3) Eager Reads tend to lessen the *immediate* performance impact of a node 
> going down, but not consistently.
> My Environment:
> 1 stress machine: node0
> 4 C* nodes: node4, node5, node6, node7
> My script:
> node0 writes some data: stress -d node4 -F 3000 -n 3000 -i 5 -l 2 -K 
> 20
> node0 reads some data: stress -d node4 -n 3000 -o read -i 5 -K 20
> h3. Examples:
> h5. A node going down increases performance:
> !node-down-increase-performance.png!
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> At 450s, I kill -9 one of the nodes. There is a brief decrease in performance 
> as the snitch adapts, but then it recovers... to even higher performance than 
> before.
> h5. Compactions make nodes permanently slow:
> !compaction-makes-slow.png!
> !compaction-makes-slow-stats.png!
> The green and orange lines represent trials with eager retry enabled, they 
> never recover their op-rate from before the compaction as the red and blue 
> lines do.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.compaction.2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. Speculative Read tends to lessen the *immediate* impact:
> !eager-read-looks-promising.png!
> !eager-read-looks-promising-stats.png!
> This graph looked the most promising to me, the two trials with eager retry, 
> the green and orange line, at 450s showed the smallest dip in performance. 
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
> h5. But not always:
> !eager-read-not-consistent.png!
> !eager-read-not-consistent-stats.png!
> This is a retrial with the same settings as above, yet the 95percentile eager 
> retry (red line) did poorly this time at 450s.
> [Data for this test 
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.eager_retry.node_killed.just_20.rc1.try2.json&metric=interval_op_rate&operation=stress-read&smoothing=1]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-6084) java.io.IOException: Could not get input splits

2013-09-26 Thread Cyril Scetbon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779215#comment-13779215
 ] 

Cyril Scetbon commented on CASSANDRA-6084:
--

{quote}It's not the amount of connections so much as how fast they're trying to 
spawn and connect{quote}
right :(

> java.io.IOException: Could not get input splits
> ---
>
> Key: CASSANDRA-6084
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6084
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Osx 10.8.5
> Java 1.7
> Cassandra 1.2.10
> Pig 0.9.2/0.11.1
>Reporter: Cyril Scetbon
>Assignee: Alex Liu
> Attachments: 6084_debug.txt
>
>
> see http://www.mail-archive.com/user@cassandra.apache.org/msg32414.html
> I've noticed that if I restart Cassandra I get more errors for the first 
> minutes, although it's accessible through cqlsh without issue.
> I have tested on a 1-node (Osx Laptop) and 4-nodes (Ubuntu servers) and got 
> the same error. I tried with version 1.2.6, 1.2.8, 1.2.9, 1.2.10 without 
> success



[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779217#comment-13779217
 ] 

Jonathan Ellis commented on CASSANDRA-5932:
---

The logic looks like this:

# Figure out how many replicas we need to contact to satisfy the desired 
consistencyLevel + Read Repair settings
# If that ends up being all the replicas, then use ASRE to get some redundancy 
on the data reads.  This will allow the read to succeed even if a digest for RR 
times out.  Of course if you are reading at CL.ALL and a replica times out 
there's nothing we can do.
# Otherwise, use SRE and make an "extra" request later, if it looks like one of 
the minimal set isn't going to respond in time

Note that performing extra data requests does not affect handler.blockfor -- it 
just makes it possible for the request to succeed once it has enough responses 
back, no matter which replicas they come from.



[jira] [Created] (CASSANDRA-6105) Cassandra Triggers to execute on replicas

2013-09-26 Thread Michael (JIRA)
Michael created CASSANDRA-6105:
--

 Summary: Cassandra Triggers to execute on replicas
 Key: CASSANDRA-6105
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6105
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Michael
Priority: Minor


We would like to keep ElasticSearch eventually consistent across data centers 
while keeping ElasticSearch clusters local to each data center. The idea is to 
use Cassandra to replicate data across data centers and use triggers to kick 
off events that populate the local ElasticSearch clusters, thus keeping the 
dispersed ElasticSearch clusters eventually consistent without extending 
ElasticSearch across data centers.
With that in mind, it would be very useful if a trigger could be made to 
execute on every replica, or at least on one replica per data center.




[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Li Zou (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779233#comment-13779233
 ] 

Li Zou commented on CASSANDRA-5932:
---

The logic for {{AlwaysSpeculatingReadExecutor}} is good. What I meant in my 
previous comment is that when {{targetReplicas.size() == allReplicas.size()}} 
and {{targetReplicas.size() == 1}}, 
{{AlwaysSpeculatingReadExecutor.executeAsync()}} will throw an exception: there 
is only one endpoint in {{targetReplicas}}, but it tries to access two.
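The failure mode described here is plain {{List.subList}} bounds checking; a minimal standalone demonstration (not Cassandra code):

```java
import java.util.Collections;
import java.util.List;

public class SubListDemo
{
    public static void main(String[] args)
    {
        List<String> oneReplica = Collections.singletonList("127.0.0.1");
        try
        {
            // Mirrors makeDataRequests(targetReplicas.subList(0, 2)) when
            // there is a single target replica: toIndex exceeds list size.
            oneReplica.subList(0, 2);
        }
        catch (IndexOutOfBoundsException e)
        {
            System.out.println("subList(0, 2) on a 1-element list throws");
        }
    }
}
```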



[1/3] git commit: fix subList bug

2013-09-26 Thread jbellis
Updated Branches:
  refs/heads/cassandra-2.0 20c419b94 -> 7a87fc118
  refs/heads/trunk 84f6c26aa -> 2520f2f87


fix subList bug


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a87fc11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a87fc11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a87fc11

Branch: refs/heads/cassandra-2.0
Commit: 7a87fc1186f39678382cf9b3e1dd224d9c71aead
Parents: 20c419b
Author: Jonathan Ellis 
Authored: Thu Sep 26 16:10:15 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 16:10:15 2013 -0500

--
 src/java/org/apache/cassandra/service/AbstractReadExecutor.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7a87fc11/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java 
b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
index 83368c2..280715a 100644
--- a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
+++ b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
@@ -321,7 +321,7 @@ public abstract class AbstractReadExecutor
 @Override
 public void executeAsync()
 {
-        makeDataRequests(targetReplicas.subList(0, 2));
+        makeDataRequests(targetReplicas.subList(0, targetReplicas.size() > 1 ? 2 : 1));
         if (targetReplicas.size() > 2)
             makeDigestRequests(targetReplicas.subList(2, targetReplicas.size()));
         cfs.metric.speculativeRetry.inc();
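The corrected upper bound is equivalent to min(2, size); a small sketch of the fixed behavior, with a hypothetical helper name ({{dataRequestBound}}):

```java
import java.util.Arrays;
import java.util.List;

public class BoundFix
{
    // Upper bound used by the fixed executeAsync(): send data requests to
    // at most two replicas, but never more than actually exist.
    static int dataRequestBound(int targetReplicas)
    {
        // Same as Math.min(2, targetReplicas) for any size >= 1.
        return targetReplicas > 1 ? 2 : 1;
    }

    public static void main(String[] args)
    {
        List<String> one = Arrays.asList("r1");
        List<String> three = Arrays.asList("r1", "r2", "r3");
        System.out.println(one.subList(0, dataRequestBound(one.size())));     // [r1]
        System.out.println(three.subList(0, dataRequestBound(three.size()))); // [r1, r2]
    }
}
```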



[2/3] git commit: fix subList bug

2013-09-26 Thread jbellis
fix subList bug


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a87fc11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a87fc11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a87fc11

Branch: refs/heads/trunk
Commit: 7a87fc1186f39678382cf9b3e1dd224d9c71aead
Parents: 20c419b
Author: Jonathan Ellis 
Authored: Thu Sep 26 16:10:15 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 16:10:15 2013 -0500

--
 src/java/org/apache/cassandra/service/AbstractReadExecutor.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7a87fc11/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
--
diff --git a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java 
b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
index 83368c2..280715a 100644
--- a/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
+++ b/src/java/org/apache/cassandra/service/AbstractReadExecutor.java
@@ -321,7 +321,7 @@ public abstract class AbstractReadExecutor
 @Override
 public void executeAsync()
 {
-        makeDataRequests(targetReplicas.subList(0, 2));
+        makeDataRequests(targetReplicas.subList(0, targetReplicas.size() > 1 ? 2 : 1));
         if (targetReplicas.size() > 2)
             makeDigestRequests(targetReplicas.subList(2, targetReplicas.size()));
         cfs.metric.speculativeRetry.inc();



[3/3] git commit: Merge branch 'cassandra-2.0' into trunk

2013-09-26 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2520f2f8
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2520f2f8
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2520f2f8

Branch: refs/heads/trunk
Commit: 2520f2f87c2fbca97524778e29d0c114adb2cd63
Parents: 84f6c26 7a87fc1
Author: Jonathan Ellis 
Authored: Thu Sep 26 16:10:22 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 16:10:22 2013 -0500

--
 src/java/org/apache/cassandra/service/AbstractReadExecutor.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Commented] (CASSANDRA-5932) Speculative read performance data show unexpected results

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779240#comment-13779240
 ] 

Jonathan Ellis commented on CASSANDRA-5932:
---

I see what you mean.  Fixed in 7a87fc1186f39678382cf9b3e1dd224d9c71aead.



[jira] [Commented] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

2013-09-26 Thread Christopher J. Bottaro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779272#comment-13779272
 ] 

Christopher J. Bottaro commented on CASSANDRA-6053:
---

We're seeing this on a 1.2.9 cluster as well.

> system.peers table not updated after decommissioning nodes in C* 2.0
> 
>
> Key: CASSANDRA-6053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Datastax AMI running EC2 m1.xlarge instances
>Reporter: Guyon Moree
>Assignee: Brandon Williams
> Attachments: peers
>
>
> After decommissioning my cluster from 20 to 9 nodes using opscenter, I found 
> all but one of the nodes had incorrect system.peers tables.
> This became a problem (afaik) when using the python-driver, since this 
> queries the peers table to set up its connection pool. Resulting in very slow 
> startup times, because of timeouts.
> The output of nodetool didn't seem to be affected. After removing the 
> incorrect entries from the peers tables, the connection issues seem to have 
> disappeared for us. 
> Would like some feedback on if this was the right way to handle the issue or 
> if I'm still left with a broken cluster.
> Attached is the output of nodetool status, which shows the correct 9 nodes. 
> Below that the output of the system.peers tables on the individual nodes.



[jira] [Commented] (CASSANDRA-6102) CassandraStorage broken for bigints and ints

2013-09-26 Thread Alex Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779296#comment-13779296
 ] 

Alex Liu commented on CASSANDRA-6102:
-

Let me double check it; I may be able to fix the issue.

> CassandraStorage broken for bigints and ints
> 
>
> Key: CASSANDRA-6102
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6102
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
> Environment: Cassandra 1.2.9 & 1.2.10, Pig 0.11.1, OSX 10.8.x
>Reporter: Janne Jalkanen
>Assignee: Alex Liu
>
> I am seeing something rather strange in the way Cass 1.2 + Pig seem to handle 
> integer values.
> Setup: Cassandra 1.2.10, OSX 10.8, JDK 1.7u40, Pig 0.11.1.  Single node for 
> testing this. 
> First a table:
> {noformat}
> > CREATE TABLE testc (
>  key text PRIMARY KEY,
>  ivalue int,
>  svalue text,
>  value bigint
> ) WITH COMPACT STORAGE;
> > insert into testc (key,ivalue,svalue,value) values ('foo',10,'bar',65);
> > select * from testc;
> key | ivalue | svalue | value
> -+++---
> foo | 10 |bar | 65
> {noformat}
> For my Pig setup, I then use libraries from different C* versions to actually 
> talk to my database (which stays on 1.2.10 all the time).
> Cassandra 1.0.12 (using cassandra_storage.jar):
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.1.10:
> {noformat}
> testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> (foo,(svalue,bar),(ivalue,10),(value,65),{})
> {noformat}
> Cassandra 1.2.10:
> {noformat}
> (testc = LOAD 'cassandra://keyspace/testc' USING CassandraStorage();
> dump testc
> foo,{(ivalue,
> ),(svalue,bar),(value,A)})
> {noformat}
> To me it appears that ints and bigints are interpreted as ascii values in 
> cass 1.2.10.  Did something change for CassandraStorage, is there a 
> regression, or am I doing something wrong?  Quick perusal of the JIRA didn't 
> reveal anything that I could directly pin on this.
> Note that using compact storage does not seem to affect the issue, though it 
> obviously changes the resulting pig format.
> In addition, trying to use Pygmalion 
> {noformat}
> tf = foreach testc generate key, 
> flatten(FromCassandraBag('ivalue,svalue,value',columns)) as 
> (ivalue:int,svalue:chararray,lvalue:long);
> dump tf
> (foo,
> ,bar,A)
> {noformat}
> So no help there. Explicitly casting the values to (long) or (int) just 
> results in a ClassCastException.



[2/6] git commit: stronger warning to avoid 0.0.0.0 rpc_address

2013-09-26 Thread jbellis
stronger warning to avoid 0.0.0.0 rpc_address


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bdb269af
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bdb269af
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bdb269af

Branch: refs/heads/cassandra-2.0
Commit: bdb269af00b31ef8bf0a15c386175170deec42ac
Parents: 389ff55
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:21 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:21 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bdb269af/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index a3cdec7..712f134 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -361,9 +361,8 @@ start_rpc: true
 # (i.e. it will be based on the configured hostname of the node).
 #
 # Note that unlike ListenAddress above, it is allowed to specify 0.0.0.0
-# here if you want to listen on all interfaces but is not best practice
-# as it is known to confuse the node auto-discovery features of some
-# client drivers.
+# here if you want to listen on all interfaces, but that will break clients 
+# that rely on node auto-discovery.
 rpc_address: localhost
 # port for Thrift to listen for clients on
 rpc_port: 9160



[4/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-09-26 Thread jbellis
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2fb089e4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2fb089e4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2fb089e4

Branch: refs/heads/trunk
Commit: 2fb089e4c8430e4e46f335cf85ee791e1299de35
Parents: 7a87fc1 bdb269a
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:39 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:39 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2fb089e4/conf/cassandra.yaml
--



[1/6] git commit: stronger warning to avoid 0.0.0.0 rpc_address

2013-09-26 Thread jbellis
Updated Branches:
  refs/heads/cassandra-1.2 389ff55e2 -> bdb269af0
  refs/heads/cassandra-2.0 7a87fc118 -> 2fb089e4c
  refs/heads/trunk 2520f2f87 -> bce594f38


stronger warning to avoid 0.0.0.0 rpc_address


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bdb269af
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bdb269af
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bdb269af

Branch: refs/heads/cassandra-1.2
Commit: bdb269af00b31ef8bf0a15c386175170deec42ac
Parents: 389ff55
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:21 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:21 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bdb269af/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index a3cdec7..712f134 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -361,9 +361,8 @@ start_rpc: true
 # (i.e. it will be based on the configured hostname of the node).
 #
 # Note that unlike ListenAddress above, it is allowed to specify 0.0.0.0
-# here if you want to listen on all interfaces but is not best practice
-# as it is known to confuse the node auto-discovery features of some
-# client drivers.
+# here if you want to listen on all interfaces, but that will break clients 
+# that rely on node auto-discovery.
 rpc_address: localhost
 # port for Thrift to listen for clients on
 rpc_port: 9160



[3/6] git commit: stronger warning to avoid 0.0.0.0 rpc_address

2013-09-26 Thread jbellis
stronger warning to avoid 0.0.0.0 rpc_address


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bdb269af
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bdb269af
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bdb269af

Branch: refs/heads/trunk
Commit: bdb269af00b31ef8bf0a15c386175170deec42ac
Parents: 389ff55
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:21 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:21 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bdb269af/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index a3cdec7..712f134 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -361,9 +361,8 @@ start_rpc: true
 # (i.e. it will be based on the configured hostname of the node).
 #
 # Note that unlike ListenAddress above, it is allowed to specify 0.0.0.0
-# here if you want to listen on all interfaces but is not best practice
-# as it is known to confuse the node auto-discovery features of some
-# client drivers.
+# here if you want to listen on all interfaces, but that will break clients 
+# that rely on node auto-discovery.
 rpc_address: localhost
 # port for Thrift to listen for clients on
 rpc_port: 9160



[6/6] git commit: Merge branch 'cassandra-2.0' into trunk

2013-09-26 Thread jbellis
Merge branch 'cassandra-2.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bce594f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bce594f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bce594f3

Branch: refs/heads/trunk
Commit: bce594f38b2507266610e092058bd08095e56255
Parents: 2520f2f 2fb089e
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:51 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:51 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--




[5/6] git commit: Merge branch 'cassandra-1.2' into cassandra-2.0

2013-09-26 Thread jbellis
Merge branch 'cassandra-1.2' into cassandra-2.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2fb089e4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2fb089e4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2fb089e4

Branch: refs/heads/cassandra-2.0
Commit: 2fb089e4c8430e4e46f335cf85ee791e1299de35
Parents: 7a87fc1 bdb269a
Author: Jonathan Ellis 
Authored: Thu Sep 26 17:10:39 2013 -0500
Committer: Jonathan Ellis 
Committed: Thu Sep 26 17:10:39 2013 -0500

--
 conf/cassandra.yaml | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2fb089e4/conf/cassandra.yaml
--



[jira] [Commented] (CASSANDRA-5515) Track sstable coldness

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779394#comment-13779394
 ] 

Jonathan Ellis commented on CASSANDRA-5515:
---

bq. I can see this making sense for managing in-memory index summary sizes, but 
maybe not for STCS optimization.

I actually think it makes sense for both -- if there's a hot partition that BF 
isn't rejecting, then we should go ahead and compact it with other hot sstables 
even if it's being caused by a FP.  (Also, note that it's a bit more complex 
than just BF FP -- we can reject a slice from a partition that *does* exist, 
from cell index data in IndexedBlockFetcher.)

I guess if you want we could increment once for an index read, again for a data 
read.  But I'm not sure if that actually buys us anything useful.

bq. Do we have any workarounds/suggestions for restoring TTLed data in general?

No.  I was thinking more along the lines of writing out a summary file, or 
copying it into a snapshot metadata system table.

> Track sstable coldness
> --
>
> Key: CASSANDRA-5515
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5515
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Tyler Hobbs
> Fix For: 2.0.2
>
> Attachments: 0001-Track-row-read-counts-in-SSTR.patch, 5515-2.0-v1.txt
>
>
> Keeping a count of reads per-sstable would allow STCS to automatically ignore 
> cold data rather than recompacting it constantly with hot data, dramatically 
> reducing compaction load for typical time series applications and others with 
> time-correlated access patterns.  We would not need a separate age-tiered 
> compaction strategy.
> (This will really be useful in conjunction with CASSANDRA-5514.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)
Christopher Smith created CASSANDRA-6106:


 Summary: QueryState.getTimestamp() & FBUtilities.timestampMicros() 
reads current timestamp with System.currentTimeMillis() * 1000 instead of 
System.nanoTime() / 1000
 Key: CASSANDRA-6106
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE Cassandra 3.1, but also HEAD
Reporter: Christopher Smith
Priority: Minor


I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779492#comment-13779492
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

Unfortunately it's not that easy to get a high-resolution timestamp in Java.  
Here's what the nanoTime() javadoc says:

{quote}
This method can only be used to measure elapsed time and is not related to any 
other notion of system or wall-clock time. The value returned represents 
nanoseconds since some fixed but arbitrary origin time (perhaps in the future, 
so values may be negative). The same origin is used by all invocations of this 
method in an instance of a Java virtual machine; other virtual machine 
instances are likely to use a different origin...

The values returned by this method become meaningful only when the difference 
between two such values, obtained within the same instance of a Java virtual 
machine, is computed.
{quote}

It is NOT time since any particular epoch and you WILL have problems if you 
treat it like that.  (We've actually had a bug where that slipped past code 
review: CASSANDRA-4432.)

http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#nanoTime()
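The caveat in the quoted javadoc can be made concrete with a small sketch (class and method names here are illustrative, not from Cassandra): a nanoTime() difference measures elapsed time, but a single nanoTime() value bears no relation to the epoch.

```java
// Illustration of the nanoTime() contract quoted above: only the DIFFERENCE
// between two calls in the same JVM is meaningful; a single value is relative
// to an arbitrary origin and may even be negative.
public class NanoTimeDemo {
    /** Elapsed wall time of a task in microseconds: the legitimate use. */
    static long elapsedMicros(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1000;
    }

    public static void main(String[] args) {
        long us = elapsedMicros(() -> {
            try { Thread.sleep(10); } catch (InterruptedException e) { }
        });
        System.out.println("elapsed: " + us + " us");
        // By contrast, System.nanoTime() / 1000 on its own is NOT
        // microseconds since the epoch and cannot simply replace
        // System.currentTimeMillis() * 1000.
    }
}
```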

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Smith updated CASSANDRA-6106:
-

Attachment: microtimstamp.patch

Here's a proposed patch against HEAD that fixes at least the two cases I found 
immediately. It uses currentTimeMillis() & nanoTime() to get a precise but 
calibrated timestamp.
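A minimal sketch of the calibration idea described here (this is an illustration of the approach, not the attached microtimstamp.patch itself; the class name is hypothetical): capture one currentTimeMillis()/nanoTime() pair once, then derive microseconds-since-epoch from the nanoTime() delta.

```java
// Hypothetical sketch: anchor a microsecond clock to the wall clock once at
// class load, then advance it with the monotonic nanoTime() delta.
public final class CalibratedMicros {
    private static final long BASE_MICROS = System.currentTimeMillis() * 1000;
    private static final long BASE_NANOS = System.nanoTime();

    private CalibratedMicros() {}

    /** Microseconds since the epoch, monotonic within this JVM. */
    public static long timestampMicros() {
        return BASE_MICROS + (System.nanoTime() - BASE_NANOS) / 1000;
    }

    public static void main(String[] args) {
        System.out.println("micros: " + timestampMicros());
    }
}
```

As the follow-up comments discuss, the base pair is only accurate to the millisecond in which it was taken, and nanoTime() can drift from the wall clock over long uptimes.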

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779561#comment-13779561
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

I'm not sure what this buys us -- it basically starts the micros component at a 
random point in time.

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Rick Branson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779564#comment-13779564
 ] 

Rick Branson commented on CASSANDRA-6106:
-

Could this re-sample the base wallclock at an interval so that it doesn't drift 
too far? (say, every 1s or 100ms)

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-5906) Avoid allocating over-large bloom filters

2013-09-26 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779565#comment-13779565
 ] 

Yuki Morishita commented on CASSANDRA-5906:
---

v2 pushed to: https://github.com/yukim/cassandra/commits/5906-v2

HLL++ support for out-of-library hashing has been merged 
(https://github.com/clearspring/stream-lib/pull/50) but is not officially 
released yet, so v2 contains a custom-built stream-lib.
The HLL++ parameters are unchanged from the first version.

The other change I made was to let SSTableMetadata support reading HLL++ at 
compaction time only. To make this change a little easier, I also rewrote 
mutateLevel not to deserialize the SSTableMetadata 
object (https://github.com/yukim/cassandra/commit/016c89b68ba74ca15fcac9fa6e6c37faeaee7bcd).

> Avoid allocating over-large bloom filters
> -
>
> Key: CASSANDRA-5906
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5906
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Yuki Morishita
> Fix For: 2.0.2
>
>
> We conservatively estimate the number of partitions post-compaction to be the 
> total number of partitions pre-compaction.  That is, we assume the worst-case 
> scenario of no partition overlap at all.
> This can result in substantial memory wasted in sstables resulting from 
> highly overlapping compactions.



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779567#comment-13779567
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

Maybe if you spin millis until it rolls to the next value, and grab nanos then?

Still seems kind of iffy to me :)
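The spin idea floated here could look something like the following sketch (class and method names are hypothetical): busy-wait until currentTimeMillis() ticks over, then grab nanoTime(), so the (millis, nanos) pair is aligned to a millisecond boundary up to scheduling noise.

```java
// Sketch of calibrating at a millisecond boundary: spin on
// currentTimeMillis() until it rolls to the next value, then immediately
// capture nanoTime(). The pair is then aligned to the start of a millisecond,
// modulo any preemption between the two reads.
public class BoundaryCalibration {
    /** Returns {millisAtBoundary, nanosAtBoundary}. */
    static long[] calibrateAtMillisBoundary() {
        long ms = System.currentTimeMillis();
        long next;
        while ((next = System.currentTimeMillis()) == ms) {
            // spin until the millisecond value rolls over
        }
        return new long[] { next, System.nanoTime() };
    }

    public static void main(String[] args) {
        long[] pair = calibrateAtMillisBoundary();
        System.out.println("ms=" + pair[0] + " nanos=" + pair[1]);
    }
}
```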

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Rick Branson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779573#comment-13779573
 ] 

Rick Branson commented on CASSANDRA-6106:
-

I think you'd actually have to use the discrepancy between currentTimeMillis() 
now-start delta vs nanoTime() now-start delta to detect wall-clock drift/change 
and correct the nanoTime().
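This drift-detection idea could be sketched as follows (names hypothetical): compare how far the wall clock and the monotonic clock have each advanced since a common starting point; the difference estimates wall-clock drift or adjustment.

```java
// Sketch of detecting wall-clock drift: the delta of
// currentTimeMillis() should track the delta of nanoTime(); any divergence
// means the wall clock was adjusted or is drifting.
public class DriftDetector {
    private final long startMillis = System.currentTimeMillis();
    private final long startNanos = System.nanoTime();

    /** Positive result: the wall clock moved ahead of the monotonic clock. */
    public long driftMicros() {
        long wallDeltaMicros = (System.currentTimeMillis() - startMillis) * 1000;
        long monoDeltaMicros = (System.nanoTime() - startNanos) / 1000;
        return wallDeltaMicros - monoDeltaMicros;
    }
}
```

Note that each reading carries up to a millisecond of quantization error from currentTimeMillis() alone, which is one reason the thread below calls this approach fragile.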

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779577#comment-13779577
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

I'm saying that even before you start talking about drift, we're correlating 
millis/nanos and saying, This Is The Reference Point For Zero Micros, but we 
could actually be 100 micros into the current ms, we could be 900, we have no 
way to tell.

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Rick Branson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779586#comment-13779586
 ] 

Rick Branson commented on CASSANDRA-6106:
-

Yeah you're right. The thread will probably occasionally get interrupted by the 
kernel between the currentTimeMillis() and the nanoTime() call. It'd probably 
take a bunch of iterations to properly "calibrate" the base clock. This is 
getting pretty far into the extremely-fragile territory.

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779600#comment-13779600
 ] 

Christopher Smith commented on CASSANDRA-6106:
--

Jonathan:
If you look at my patch, it calibrates against currentTimeMillis() so that you 
get about as accurate a "microseconds since the epoch" value as is possible. The 
one thing you could do to improve it would be to periodically recalibrate with 
currentTimeMillis(), but I'd argue that is actually a *bad* thing, as it would 
introduce the possibility of timestamps that go backward in time.

What this does is dramatically reduce the probability that two concurrent 
writes sent to two different nodes will collide (and therefore Cassandra 
violates its atomicity "guarantee").

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779602#comment-13779602
 ] 

Christopher Smith commented on CASSANDRA-6106:
--

Regarding "we could actually be 100 micros into the current ms, we could be 
900, we have no way to tell":

That is correct. There is no way, in Java, to get a universal timestamp across 
nodes that is guaranteed to be ordered properly at a precision finer than 
2 milliseconds.

However, given that the wobble in network latency is generally >= 1 millisecond 
*anyway*, that is FAR less of a concern. If a client needs to ensure which of 
two writes happens first, it probably can't use server-side timestamps.

The big concern is the chance that two concurrent writes with two or more 
overlapping cells will be assigned the same timestamp. This is a *real risk* for 
any app with lots of concurrent writes to the same cells, regardless of its time 
sensitivity.
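The collision risk can be illustrated with a toy simulation (all parameters here are assumed for illustration, not measured): spread 1,000 "concurrent" writes uniformly over one second and count timestamp duplicates at millisecond versus microsecond granularity.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Toy model of timestamp collisions: n writes land uniformly at random in a
// one-second window; a coarser clock means far fewer distinct buckets and
// therefore far more duplicate timestamps.
public class CollisionDemo {
    /** Number of duplicate timestamps among n writes over a 1s window. */
    static int duplicates(int n, long bucketsPerSecond, long seed) {
        Random rnd = new Random(seed);  // fixed seed for reproducibility
        Set<Long> seen = new HashSet<>();
        int dups = 0;
        for (int i = 0; i < n; i++) {
            long ts = (long) (rnd.nextDouble() * bucketsPerSecond);
            if (!seen.add(ts)) dups++;
        }
        return dups;
    }

    public static void main(String[] args) {
        System.out.println("ms-granularity dups: " + duplicates(1000, 1_000L, 42));
        System.out.println("us-granularity dups: " + duplicates(1000, 1_000_000L, 42));
    }
}
```

With millisecond buckets, 1,000 writes into 1,000 slots produce duplicates in the hundreds (a birthday-problem effect); with microsecond buckets the expected count is well under one.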

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().



[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Smith updated CASSANDRA-6106:
-

Description: 
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

  was:
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.



[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779603#comment-13779603
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

I dunno.  If it's a problem, then don't paper over it ("less likely" is a 
synonym for "still possible"); use LWT or fix your data model.

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.



[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Smith updated CASSANDRA-6106:
-

Description: 
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario. Cassandra promises that writes to the same 
record are "atomic", so if you do something like:

create table foo {
i int PRIMARY KEY,
x int,
y int,
};

and then send these two queries concurrently:

insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);

you can't be quite sure which of the two writes will be the "last" one, but you 
do know that if you do:

select x, y from foo where i = 1;

you don't know if x is "8" or "-8".
you don't know if y is "-8" or "8".
YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, 
in which case x + y = 16. :-( Now your writes are not atomic.
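The failure scenario above can be made concrete with a small sketch of per-cell last-write-wins reconciliation (this is a simplification with hypothetical names, not Cassandra's actual comparator; the tie-break on the larger value stands in for whatever deterministic rule the store uses). The key point is that each cell is resolved independently, so a timestamp tie can mix cells from two different writes:

```java
// Sketch: per-cell last-write-wins with a deterministic tie-break.
public final class LwwCell {
    final long timestamp;
    final int value;

    LwwCell(long timestamp, int value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    static LwwCell reconcile(LwwCell a, LwwCell b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        return a.value >= b.value ? a : b; // tie: deterministic, but per-cell
    }

    public static void main(String[] args) {
        // Concurrent writes (1, 8, -8) and (1, -8, 8) with the SAME timestamp:
        LwwCell x = reconcile(new LwwCell(1000, 8), new LwwCell(1000, -8));
        LwwCell y = reconcile(new LwwCell(1000, -8), new LwwCell(1000, 8));
        System.out.println(x.value + y.value); // 16: the row mixes both writes
    }
}
```

With distinct timestamps one write wins both cells and x + y stays 0; with equal timestamps the tie-break picks x from one write and y from the other, producing the non-atomic x + y = 16 outcome described above.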

  was:
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.
> To clarify the failure scenario. Cassandra promises that writes to the same 
> record are "atomic", so if you do something like:
> create table foo {
> i int PRIMARY KEY,

[jira] [Comment Edited] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779623#comment-13779623
 ] 

Christopher Smith edited comment on CASSANDRA-6106 at 9/27/13 3:35 AM:
---

Look at the above description and also look at the article. LWT doesn't fix 
this. You could use a vector clock, but then you have all the hell that comes 
with that.

I agree the "still possible" is really dumb and a violation of the guarantees 
that Cassandra documents. As long as Cassandra has this mechanism though, we 
should make the probabilities way, way lower. With this change the probability 
of a collision gets to around the kind of odds as UUID collisions 
(clarification: the odds are still much higher than with type 1 UUIDs, as those 
are 128-bit, but it is as good as you can do with 64 bits... note that the 
mechanism employed is quite similar to how type 1 UUIDs try to avoid 
collisions), which I think for practical purposes is "good enough".

Note that the current "+1" trick also creates potentially backwards ordering 
problems (if you write 2 times in one millisecond to node A and once in the 
same millisecond to node B, the second write to node A is treated as having 
been last, even if it happened 999 microseconds before the write to node B).
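The "+1" trick and its reordering side effect can be sketched as follows (hypothetical class, not Cassandra's code): each node bumps its last-issued timestamp by one when the clock has not advanced, which avoids local collisions but can order a physically earlier write after a later one on another node.

```java
// Sketch of the "+1" trick: monotonic per node, but not across nodes.
public final class PlusOneClock {
    private long last = 0;

    synchronized long next(long nowMicros) {
        last = Math.max(nowMicros, last + 1);
        return last;
    }

    public static void main(String[] args) {
        PlusOneClock nodeA = new PlusOneClock();
        PlusOneClock nodeB = new PlusOneClock();
        long ms = 1380000000123000L;  // same millisecond reading on both nodes

        long a1 = nodeA.next(ms);     // first write via node A
        long a2 = nodeA.next(ms);     // second write via node A: bumped to ms + 1
        long b1 = nodeB.next(ms);     // write via node B

        // a2 > b1 even if node B's write physically happened up to
        // 999 microseconds after node A's second write.
        System.out.println(a2 > b1);
    }
}
```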

Cassandra should use a different mechanism to resolve concurrent writes with 
the same timestamp. I would propose something more like this:

If two nodes have different values for a cell, but have the same timestamp for 
the cell:

1) Compute the "token" for the record.
2) Compute replicas 1 to N for that token and assign values 1 to N to the 
corresponding nodes in the datacenter.
3) If there is a tie, the win goes to the replica node with the highest value 
from #2.
4) If there are two datacenters, each with the same highest value node (note 
this favours data centers with higher replication factors, which seems... good 
to me), you resolve in favour of the datacenter whose name alphasorts lowest.
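The four steps above can be sketched as follows (a hypothetical simplification with made-up names, not a proposal-quality implementation): rank each node by its position in the token's replica list, prefer the higher rank, and fall back to the datacenter whose name sorts lowest when the ranks tie.

```java
import java.util.List;

// Sketch of the proposed tie-break for two values with equal timestamps.
public final class ReplicaTieBreak {
    static int rank(List<String> replicasForToken, String node) {
        return replicasForToken.indexOf(node); // position 0..N-1 in ring order
    }

    // Returns "A" or "B": which of the two same-timestamp values wins.
    static String winner(List<String> replicasForToken,
                         String nodeA, String dcA,
                         String nodeB, String dcB) {
        int ra = rank(replicasForToken, nodeA);
        int rb = rank(replicasForToken, nodeB);
        if (ra != rb)
            return ra > rb ? "A" : "B";             // step 3: highest rank wins
        return dcA.compareTo(dcB) <= 0 ? "A" : "B"; // step 4: lowest DC name wins
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("n1", "n2", "n3");
        System.out.println(winner(replicas, "n3", "dc1", "n1", "dc1")); // A
        System.out.println(winner(replicas, "n2", "dc2", "n2", "dc1")); // B
    }
}
```

Because both sides compute the same ranks from the same replica list, the outcome is deterministic regardless of which node performs the reconciliation.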


  was (Author: xcbsmith):
Look at the above description and also look at the article. LWT doesn't fix 
this. You could use a vector clock, but then you have all the hell that comes 
with that.

I agree the "still possible" is really dumb and a violation of the guarantees 
that Cassandra documents. As long as Cassandra has this mechanism though, we 
should make the probabilities way, way lower. With this change the probability 
of a collision gets to around the kind of odds as UUID collisions, which I 
think for practical purposes is "good enough".

Note that the current "+1" trick also creates potentially backwards ordering 
problems (if you write 2 times in one millisecond to node A and once in the 
same millisecond to node B, the second write to node A is treated as having 
been last, even if it happened 999 microseconds before the write to node B).

Cassandra should use a different mechanism to resolve concurrent writes with 
the same timestamp. I would propose something more like this:

If two nodes have different values for a cell, but have the same timestamp for 
the cell:

1) Compute the "token" for the record.
2) Compute replicas 1 to N for that token and assign them those values 1 to N 
to each node in the datacenter.
3) If there is a tie, win goes to the replica with the node with the highest 
value for #2.
4) If there are two datacenters, each with the same highest value node (note 
this favours data centers with higher replication factors, which seems... good 
to me), you resolve in favour of the datacenter whose name alphasorts lowest.

  
> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().

[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779623#comment-13779623
 ] 

Christopher Smith commented on CASSANDRA-6106:
--

Look at the above description and also look at the article. LWT doesn't fix 
this. You could use a vector clock, but then you have all the hell that comes 
with that.

I agree the "still possible" is really dumb and a violation of the guarantees 
that Cassandra documents. As long as Cassandra has this mechanism though, we 
should make the probabilities way, way lower. With this change the probability 
of a collision gets to around the kind of odds as UUID collisions, which I 
think for practical purposes is "good enough".

Note that the current "+1" trick also creates potentially backwards ordering 
problems (if you write 2 times in one millisecond to node A and once in the 
same millisecond to node B, the second write to node A is treated as having 
been last, even if it happened 999 microseconds before the write to node B).

Cassandra should use a different mechanism to resolve concurrent writes with 
the same timestamp. I would propose something more like this:

If two nodes have different values for a cell, but have the same timestamp for 
the cell:

1) Compute the "token" for the record.
2) Compute replicas 1 to N for that token and assign values 1 to N to the 
corresponding nodes in the datacenter.
3) If there is a tie, the win goes to the replica node with the highest value 
from #2.
4) If there are two datacenters, each with the same highest value node (note 
this favours data centers with higher replication factors, which seems... good 
to me), you resolve in favour of the datacenter whose name alphasorts lowest.


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.
> To clarify the failure scenario. Cassandra promises that writes to the same 
> record are "atomic", so if you do something like:
> create table foo {
> i int PRIMARY KEY,
> x int,
> y int,
> };
> and then send these two queries concurrently:
> insert into foo (i, x, y) values (1, 8, -8);
> insert into foo (i, x, y) values (1, -8, 8);
> you can't be quite sure which of the two writes will be the "last" one, but 
> you do know that if you do:
> select x, y from foo where i = 1;
> you don't know if x is "8" or "-8".
> you don't know if y is "-8" or "8".
> YOU DO KNOW: x + y will equal 0.
> EXCEPT... if the timestamps assigned to the two queries are *exactly* the 
> same, in which case x + y = 16. :-( Now your writes are not atomic.



[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Christopher Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Smith updated CASSANDRA-6106:
-

Description: 
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario. Cassandra promises that writes to the same 
record are "atomic", so if you do something like:

{quote}
create table foo {
  i int PRIMARY KEY,
  x int,
  y int,
};
{quote}
and then send these two queries concurrently (separate connections, potentially 
to separate nodes):

{quote}
insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);
{quote}

you can't be quite sure which of the two writes will be the "last" one, but you 
do know that if you do:

{quote}
select x, y from foo where i = 1;
{quote}

you don't know if x is "8" or "-8".
you don't know if y is "-8" or "8".
YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, 
in which case x + y = 16. :-( Now your writes are not atomic.

  was:
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario. Cassandra promises that writes to the same 
record are "atomic", so if you do something like:

create table foo {
i int PRIMARY KEY,
x int,
y int,
};

and then send these two queries concurrently:

insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);

you can't be quite sure which of the two writes will be the "last" one, but you 
do know that if you do:

select x, y from foo where i = 1;

you don't know if x is "8" or "-8".
you don't know if y is "-8" or "8".
YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, 
in which case x + y = 16. :-( Now your writes are not atomic.


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.

[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6106:
--

Description: 
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

  was:
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario. Cassandra promises that writes to the same 
record are "atomic", so if you do something like:

{quote}
create table foo {
  i int PRIMARY KEY,
  x int,
  y int,
};
{quote}
and then send these two queries concurrently (separate connections, potentially 
to separate nodes):

{quote}
insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);
{quote}

you can't be quite sure which of the two writes will be the "last" one, but you 
do know that if you do:

{quote}
select x, y from foo where i = 1;
{quote}

you don't know if x is "8" or "-8".
you don't know if y is "-8" or "8".
YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, 
in which case x + y = 16. :-( Now your writes are not atomic.


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.


[jira] [Commented] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779630#comment-13779630
 ] 

Jonathan Ellis commented on CASSANDRA-6106:
---

Don't retroactively change the description a ton, it ruins the context for the 
subsequent comments.  Here's your addendum:

{quote}
There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.

To clarify the failure scenario. Cassandra promises that writes to the same 
record are "atomic", so if you do something like:

create table foo {
i int PRIMARY KEY,
x int,
y int,
};

and then send these two queries concurrently:

insert into foo (i, x, y) values (1, 8, -8);
insert into foo (i, x, y) values (1, -8, 8);

you can't be quite sure which of the two writes will be the "last" one, but you 
do know that if you do:

select x, y from foo where i = 1;

you don't know if x is "8" or "-8".
you don't know if y is "-8" or "8".
YOU DO KNOW: x + y will equal 0.

EXCEPT... if the timestamps assigned to the two queries are *exactly* the same, 
in which case x + y = 16. :-( Now your writes are not atomic.
{quote}

> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().
> There seems to be some confusion here, so I'd like to clarify: the purpose of 
> this patch is NOT to improve the precision of ordering guarantees for 
> concurrent writes to cells. The purpose of this patch is to reduce the 
> probability that concurrent writes to cells are deemed as having occurred at 
> *the same time*, which is when Cassandra violates its atomicity guarantee.



[jira] [Updated] (CASSANDRA-6106) QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() / 1000

2013-09-26 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-6106:
--

Description: 
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().


  was:
I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
mentioned issues with millisecond rounding in timestamps and was able to 
reproduce the issue. If I specify a timestamp in a mutating query, I get 
microsecond precision, but if I don't, I get timestamps rounded to the nearest 
millisecond, at least for my first query on a given connection, which 
substantially increases the possibilities of collision.

I believe I found the offending code, though I am by no means sure this is 
comprehensive. I think we probably need a fairly comprehensive replacement of 
all uses of System.currentTimeMillis() with System.nanoTime().

There seems to be some confusion here, so I'd like to clarify: the purpose of 
this patch is NOT to improve the precision of ordering guarantees for 
concurrent writes to cells. The purpose of this patch is to reduce the 
probability that concurrent writes to cells are deemed as having occurred at 
*the same time*, which is when Cassandra violates its atomicity guarantee.


> QueryState.getTimestamp() & FBUtilities.timestampMicros() reads current 
> timestamp with System.currentTimeMillis() * 1000 instead of System.nanoTime() 
> / 1000
> 
>
> Key: CASSANDRA-6106
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6106
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: DSE Cassandra 3.1, but also HEAD
>Reporter: Christopher Smith
>Priority: Minor
>  Labels: collision, conflict, timestamp
> Attachments: microtimstamp.patch
>
>
> I noticed this blog post: http://aphyr.com/posts/294-call-me-maybe-cassandra 
> mentioned issues with millisecond rounding in timestamps and was able to 
> reproduce the issue. If I specify a timestamp in a mutating query, I get 
> microsecond precision, but if I don't, I get timestamps rounded to the 
> nearest millisecond, at least for my first query on a given connection, which 
> substantially increases the possibilities of collision.
> I believe I found the offending code, though I am by no means sure this is 
> comprehensive. I think we probably need a fairly comprehensive replacement of 
> all uses of System.currentTimeMillis() with System.nanoTime().


