[jira] [Commented] (CASSANDRA-9978) Split/Scrub tools no longer remove original sstable

2015-08-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659555#comment-14659555
 ] 

Stefania commented on CASSANDRA-9978:
-

I've noticed some more problems when scrubbing with an offline transaction: we 
need to release the references to the new sstables, otherwise we get LEAK 
errors. testScrubOutOfOrder should no longer release a reference to the old 
sstable (the transaction now does this), and it should wait for deletions 
before loading the new sstables (same race as in CASSANDRA-9908).

Further, BigTableWriterTest was also broken by CASSANDRA-7066: the commit 
method of SSTableTxnWriter can now throw (since LifecycleTransaction can), so 
the test must declare the exception. In addition, SSTableTxnWriter should call 
txn.commit() before writer.commit(), since the former may throw. I am going to 
fix these two things here as well, if that's OK.
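
The ordering argument above can be illustrated with a minimal sketch. The 
{{Step}} interface and the {{commitBoth}} and {{run}} helpers are hypothetical 
names introduced for illustration; they are not the SSTableTxnWriter or 
LifecycleTransaction API. The sketch only shows why the step that may throw 
should run first: if it fails, the second step has not committed anything yet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not Cassandra's real classes.
final class CommitOrderSketch
{
    interface Step
    {
        void commit() throws Exception;
    }

    // Commits the step that may throw first; the second step runs only if the first succeeded.
    static void commitBoth(Step mayThrow, Step doesNotThrow) throws Exception
    {
        mayThrow.commit();
        doesNotThrow.commit();
    }

    // Runs one commit attempt and returns the names of the steps that actually committed.
    static List<String> run(boolean txnFails)
    {
        List<String> committed = new ArrayList<>();
        Step txn = () -> {
            if (txnFails)
                throw new IllegalStateException("transaction commit failed");
            committed.add("txn");
        };
        Step writer = () -> committed.add("writer");
        try
        {
            commitBoth(txn, writer);
        }
        catch (Exception e)
        {
            // The writer has not committed, so the caller can still abort cleanly.
        }
        return committed;
    }

    public static void main(String[] args)
    {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

With the opposite order, a failure in the transaction commit would leave the 
writer committed while the transaction is not, which is the inconsistent state 
this ordering avoids.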

3.0 patch [here|https://github.com/stef1927/cassandra/commits/9978-3.0].

CI still pending:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9978-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-9978-3.0-dtest/

> Split/Scrub tools no longer remove original sstable
> ---
>
> Key: CASSANDRA-9978
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9978
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: Stefania
>Priority: Minor
> Fix For: 3.0 beta 1
>
>
> Looks like CASSANDRA-7066 broke the scrub and split tools.  The original 
> sstable is no longer removed.
> I fixed the sstablesplit dtest so you should see the issue now.  The max 
> sstable size doesn't match expected because it's the original 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659385#comment-14659385
 ] 

 Brian Hess edited comment on CASSANDRA-9302 at 8/6/15 4:17 AM:


There is also CASSANDRA-9048 
(https://issues.apache.org/jira/browse/CASSANDRA-9048). That ticket has been 
tagged as "Later", but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to offering more options, cassandra-loader delivers 4x (or more) 
the performance of COPY FROM in 2.1. It is also more stable and can handle 
wider rows (see https://issues.apache.org/jira/browse/CASSANDRA-9552).


was (Author: brianmhess):
There is also CASSANDRA-9048 
(https://issues.apache.org/jira/browse/CASSANDRA-9048). That ticket has been 
tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to offering more options, cassandra-loader delivers 4x (or more) 
the performance of COPY FROM in 2.1. It is also more stable and can handle 
wider rows (see https://issues.apache.org/jira/browse/CASSANDRA-9552).

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659424#comment-14659424
 ] 

Jonathan Ellis commented on CASSANDRA-5220:
---

I'm okay with adding this to 3.0, since otherwise we'll need to wait for either 
8110 or 4.0, and I don't think that's fair to Marcus since he had the first 
version written months ago.

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears to be at least in part because it uses a session 
> per range and processes the ranges sequentially.  This generates a lot of log 
> spam with vnodes, and while it is gentler and lighter on hard disk 
> deployments, ssd-based deployments would often prefer that repair be as fast 
> as possible.





[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659423#comment-14659423
 ] 

Jonathan Ellis commented on CASSANDRA-9302:
---

There's no need to be passive aggressive.  Here's the reason it was tagged 
Later, straight from the comments:

bq. Whatever we end up with under the hood, I think that cqlsh and COPY are the 
right front end to present to users rather than a separate loader executable.


> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659416#comment-14659416
 ] 

Stefania commented on CASSANDRA-5220:
-

Thanks, I made a couple more really tiny changes 
[here|https://github.com/stef1927/cassandra/commit/dbd5c88c6f89ff303f4fece9bb8c5ffa6c3825a1].
The TODO comment above was misplaced, sorry; I meant it for {{MerkleTrees}}. 
You're quite right that we don't need to change the existing trunk behavior. 

About _repair_history_, I verified it would result in an exception when 
upgrading from 2.2 with some sstables already on disk. Although I believe we 
could ask people to wipe this data on a major upgrade, I don't see why we 
should inconvenience people, so I went ahead and reverted to the old format 
and inserted one line per range, see commit 
[here|https://github.com/stef1927/cassandra/commit/92bd923a8b2d9976dc711f1b7007d25db30d06f9].
Thanks for spotting this.

If you confirm these final changes are OK, then I am +1 to commit once CI 
completes.

[~jbellis] I assume we want this on 3.0? If so, I have ported the patch to 
_cassandra-3.0_ [here|https://github.com/stef1927/cassandra/commits/5220-3.0]. 
It is identical to the [trunk 
patch|https://github.com/stef1927/cassandra/commits/5220], as it applied with 
no conflicts. Pick whichever one matches the branch you want to commit to and 
discard the other.

CI results for trunk will appear here:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-dtest/

CI results for 3.0 are instead here:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-5220-3.0-dtest/


> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears to be at least in part because it uses a session 
> per range and processes the ranges sequentially.  This generates a lot of log 
> spam with vnodes, and while it is gentler and lighter on hard disk 
> deployments, ssd-based deployments would often prefer that repair be as fast 
> as possible.





[jira] [Issue Comment Deleted] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hess updated CASSANDRA-9302:
--
Comment: was deleted

(was: There is also CASSANDRA-9048 
(https://issues.apache.org/jira/browse/CASSANDRA-9048). That ticket has been 
tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to offering more options, cassandra-loader delivers 4x (or more) 
the performance of COPY FROM in 2.1. It is also more stable and can handle 
wider rows (see https://issues.apache.org/jira/browse/CASSANDRA-9552).)

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659387#comment-14659387
 ] 

 Brian Hess commented on CASSANDRA-9302:


There is also CASSANDRA-9048 
(https://issues.apache.org/jira/browse/CASSANDRA-9048). That ticket has been 
tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to offering more options, cassandra-loader delivers 4x (or more) 
the performance of COPY FROM in 2.1. It is also more stable and can handle 
wider rows (see https://issues.apache.org/jira/browse/CASSANDRA-9552).

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659385#comment-14659385
 ] 

 Brian Hess commented on CASSANDRA-9302:


There is also CASSANDRA-9048 
(https://issues.apache.org/jira/browse/CASSANDRA-9048). That ticket has been 
tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to offering more options, cassandra-loader delivers 4x (or more) 
the performance of COPY FROM in 2.1. It is also more stable and can handle 
wider rows (see https://issues.apache.org/jira/browse/CASSANDRA-9552).

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-9961) cqlsh should have DESCRIBE MATERIALIZED VIEW

2015-08-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659201#comment-14659201
 ] 

Stefania commented on CASSANDRA-9961:
-

Thank you for the updates, I'll monitor the status of these two tickets.

> cqlsh should have DESCRIBE MATERIALIZED VIEW
> 
>
> Key: CASSANDRA-9961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9961
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>Assignee: Stefania
>  Labels: client-impacting, materializedviews
> Fix For: 3.0 beta 1
>
>
> cqlsh doesn't currently produce describe output that can be used to recreate 
> a MV. We need to add a new {{DESCRIBE MATERIALIZED VIEW}} command, and also 
> include MVs in the {{DESCRIBE KEYSPACE}} output.





[jira] [Commented] (CASSANDRA-9265) Add checksum to saved cache files

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659129#comment-14659129
 ] 

Ariel Weisberg commented on CASSANDRA-9265:
---

Anyone object to backporting this change to 2.2?

I thought cache versions shared the database version, but they don't; they are 
versioned independently. That is a little scary if the persisted cache uses 
serialization that changes between database versions: it depends on the 
RowIndexEntry and CachedPartition serializations and whatever those depend on.

> Add checksum to saved cache files
> -
>
> Key: CASSANDRA-9265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9265
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ariel Weisberg
> Fix For: 2.2.1, 3.0 beta 1
>
> Attachments: 
> 0001-Add-checksum-to-saved-cache-files-CASSANDRA-9265.patch, 
> 0002-trunk-CASSANDRA-9265.patch
>
>
> Saved caches are not covered by a checksum. We should at least emit a 
> checksum. My suggestion is a large checksum of the whole file (convenient 
> offline validation), and then smaller per record checksums after each record 
> is written (possibly a subset of the incrementally maintained larger 
> checksum).
> I wouldn't go for anything fancy to try to recover from corruption since it 
> is just a saved cache. If corruption is detected while reading I would just 
> have it bail out. I would rather have less code to review and test in this 
> instance.
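
The scheme proposed in the description (one checksum over the whole file, a 
smaller checksum after each record, and bailing out on corruption) could be 
sketched roughly as follows. This is an illustration under stated assumptions, 
not the attached patch: the record layout, the use of CRC32, and the names 
{{ChecksummedCacheSketch}}, {{write}}, {{read}} and {{load}} are all 
hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.zip.CRC32;

// Assumed layout: [int length][payload][int record CRC]... then [long CRC of all preceding bytes].
final class ChecksummedCacheSketch
{
    static long crc(byte[] data, int length)
    {
        CRC32 crc = new CRC32();
        crc.update(data, 0, length);
        return crc.getValue();
    }

    static byte[] write(List<byte[]> records)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            for (byte[] record : records)
            {
                out.writeInt(record.length);
                out.write(record);
                out.writeInt((int) crc(record, record.length)); // per-record checksum
            }
            byte[] body = bytes.toByteArray();
            out.writeLong(crc(body, body.length)); // whole-file checksum
            return bytes.toByteArray();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    // Throws IOException at the first sign of corruption; no recovery is attempted.
    static List<byte[]> read(byte[] file) throws IOException
    {
        if (file.length < Long.BYTES)
            throw new IOException("truncated file");
        int bodyLength = file.length - Long.BYTES;
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        List<byte[]> records = new ArrayList<>();
        int consumed = 0;
        while (consumed < bodyLength)
        {
            int length = in.readInt();
            if (length < 0 || length > bodyLength - consumed - 2 * Integer.BYTES)
                throw new IOException("corrupt record length");
            byte[] record = new byte[length];
            in.readFully(record);
            if ((int) crc(record, length) != in.readInt())
                throw new IOException("record checksum mismatch");
            records.add(record);
            consumed += length + 2 * Integer.BYTES;
        }
        if (crc(file, bodyLength) != in.readLong())
            throw new IOException("file checksum mismatch");
        return records;
    }

    // Bails out on corruption: an empty result means "start with a cold cache".
    static Optional<List<byte[]>> load(byte[] file)
    {
        try
        {
            return Optional.of(read(file));
        }
        catch (IOException e)
        {
            return Optional.empty();
        }
    }
}
```

The trailing whole-file checksum allows offline validation without parsing 
records, while the per-record checksums let a reader stop at the first damaged 
record.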





[jira] [Commented] (CASSANDRA-9777) If you have a ~/.cqlshrc and a ~/.cassandra/cqlshrc, cqlsh will overwrite the latter with the former

2015-08-05 Thread David Kua (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659093#comment-14659093
 ] 

David Kua commented on CASSANDRA-9777:
--

https://github.com/dkua/cassandra/tree/cass-9777

I've rebased the branch to be up to date with the current 2.2 branch and added 
a commit that makes the warning message more detailed, as per [~thobbs]'s 
suggestion.

> If you have a ~/.cqlshrc and a ~/.cassandra/cqlshrc, cqlsh will overwrite the 
> latter with the former
> 
>
> Key: CASSANDRA-9777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9777
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jon Moses
>Assignee: David Kua
>  Labels: cqlsh
> Fix For: 2.2.x
>
>
> If you have a ~/.cqlshrc file and a ~/.cassandra/cqlshrc file, when you run 
> `cqlsh`, it will overwrite the latter with the former.  
> https://github.com/apache/cassandra/blob/trunk/bin/cqlsh#L202
> If the 'new' path exists (~/.cassandra/cqlshrc), cqlsh should either WARN or 
> just leave the files alone.
> {noformat}
> ~$ cat .cqlshrc
> [authentication]
> ~$ cat .cassandra/cqlshrc
> [connection]
> ~$ cqlsh
> ~$ cat .cqlshrc
> cat: .cqlshrc: No such file or directory
> ~$ cat .cassandra/cqlshrc
> [authentication]
> ~$
> {noformat}
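
The behaviour requested in the description (warn and leave both files alone 
when the new path already exists) can be sketched as below. cqlsh itself is a 
Python script, so this Java sketch only illustrates the decision logic; the 
class name, the {{Outcome}} values and the warning text are assumptions, not 
the code on the linked branch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the migration rule; not the actual cqlsh code.
final class CqlshrcMigrationSketch
{
    enum Outcome { NOTHING_TO_DO, MOVED, KEPT_BOTH_WITH_WARNING }

    static Outcome migrate(Path oldRc, Path newRc)
    {
        try
        {
            if (!Files.exists(oldRc))
                return Outcome.NOTHING_TO_DO;
            if (Files.exists(newRc))
            {
                // Never overwrite an existing file at the new location.
                System.err.println("Warning: both " + oldRc + " and " + newRc
                                   + " exist; leaving both files untouched");
                return Outcome.KEPT_BOTH_WITH_WARNING;
            }
            if (newRc.getParent() != null)
                Files.createDirectories(newRc.getParent());
            Files.move(oldRc, newRc); // fails rather than overwrites if the target appears meanwhile
            return Outcome.MOVED;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

Applied to the transcript above, the second run would print a warning and 
{{.cassandra/cqlshrc}} would still contain {{[connection]}}.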





[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Adam Holmberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659091#comment-14659091
 ] 

Adam Holmberg commented on CASSANDRA-9302:
--

Thanks for the background.

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Updated] (CASSANDRA-9265) Add checksum to saved cache files

2015-08-05 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9265:
--
Fix Version/s: (was: 3.x)
   3.0 beta 1
   2.2.1

> Add checksum to saved cache files
> -
>
> Key: CASSANDRA-9265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9265
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ariel Weisberg
> Fix For: 2.2.1, 3.0 beta 1
>
> Attachments: 
> 0001-Add-checksum-to-saved-cache-files-CASSANDRA-9265.patch, 
> 0002-trunk-CASSANDRA-9265.patch
>
>
> Saved caches are not covered by a checksum. We should at least emit a 
> checksum. My suggestion is a large checksum of the whole file (convenient 
> offline validation), and then smaller per record checksums after each record 
> is written (possibly a subset of the incrementally maintained larger 
> checksum).
> I wouldn't go for anything fancy to try to recover from corruption since it 
> is just a saved cache. If corruption is detected while reading I would just 
> have it bail out. I would rather have less code to review and test in this 
> instance.





[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659089#comment-14659089
 ] 

Aleksey Yeschenko commented on CASSANDRA-9927:
--

[~tjake] [~carlyeks] actually, how is the ticket description different from 
what's already in trunk?

> Security for MaterializedViews
> --
>
> Key: CASSANDRA-9927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9927
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>  Labels: materializedviews
> Fix For: 3.0 beta 1
>
>
> We need to think about how to handle security with respect to materialized 
> views. Since they are based on a source table, they should possibly inherit 
> the same security model as that table.  
> However, I can see cases where users would want different security 
> auth for different views, especially once we have CASSANDRA-9664 and users 
> can filter out sensitive data.





[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659067#comment-14659067
 ] 

Tyler Hobbs commented on CASSANDRA-9302:


bq. Has there been discussion anywhere about implementing a loader on the Java 
driver, now that it's bundled with the server?

Yes, there was quite a bit on CASSANDRA-8225.  To summarize, we're making cqlsh 
"good enough" for most cases, and planning on using Spark for everything else.

bq. I hope nobody is surprised that the Python implementation is much slower 
than the C implementation. \[...\] Hopefully we can amortize this with batching 
by partition and/or giving it more processes.

If wide partitions are used, I think this will be okay.  We could perhaps take 
a quick sample of the file to determine if that's the case, and if not, skip 
using TAR with murmur3.
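
The "quick sample" idea could look roughly like the sketch below: read a 
bounded number of rows, count distinct partition keys, and only group rows by 
partition when the sample suggests wide partitions. Everything here is an 
assumption made for illustration: the class and method names, the naive 
comma split (no CSV quoting), the partition key being the first column, and 
the threshold. It is written in Java for illustration only, whereas COPY FROM 
lives in cqlsh.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sampler; assumes the partition key is the first, unquoted CSV column.
final class WidePartitionSampler
{
    // Average number of sampled rows per distinct partition key.
    static double rowsPerPartition(List<String> csvLines, int sampleSize)
    {
        Set<String> keys = new HashSet<>();
        int sampled = 0;
        for (String line : csvLines)
        {
            if (sampled == sampleSize)
                break;
            int comma = line.indexOf(',');
            keys.add(comma < 0 ? line : line.substring(0, comma));
            sampled++;
        }
        return keys.isEmpty() ? 0.0 : (double) sampled / keys.size();
    }

    // Group by partition only when the sample looks wide enough to amortize the cost.
    static boolean shouldGroupByPartition(List<String> csvLines, int sampleSize, double threshold)
    {
        return rowsPerPartition(csvLines, sampleSize) >= threshold;
    }
}
```

A sample taken from the head of the file is only representative if the rows 
are not sorted in a way that differs from the rest of the file, so a real 
implementation might sample from several offsets.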

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Updated] (CASSANDRA-9265) Add checksum to saved cache files

2015-08-05 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-9265:
--
Reviewer: Ariel Weisberg

> Add checksum to saved cache files
> -
>
> Key: CASSANDRA-9265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9265
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ariel Weisberg
> Fix For: 3.x
>
> Attachments: 
> 0001-Add-checksum-to-saved-cache-files-CASSANDRA-9265.patch, 
> 0002-trunk-CASSANDRA-9265.patch
>
>
> Saved caches are not covered by a checksum. We should at least emit a 
> checksum. My suggestion is a large checksum of the whole file (convenient 
> offline validation), and then smaller per record checksums after each record 
> is written (possibly a subset of the incrementally maintained larger 
> checksum).
> I wouldn't go for anything fancy to try to recover from corruption since it 
> is just a saved cache. If corruption is detected while reading I would just 
> have it bail out. I would rather have less code to review and test in this 
> instance.





[08/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2015-08-05 Thread yukim
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c3ed25b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c3ed25b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c3ed25b0

Branch: refs/heads/cassandra-3.0
Commit: c3ed25b0ad43aad0deaade1b915ff8310c9ca3fc
Parents: 90e0013 32bc8b0
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:33 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:33 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3ed25b0/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3ed25b0/src/java/org/apache/cassandra/tools/NodeProbe.java
--



[10/10] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2015-08-05 Thread yukim
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c1aff4fa
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c1aff4fa
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c1aff4fa

Branch: refs/heads/trunk
Commit: c1aff4fa61e09396de56cfa365c56dbe256393ee
Parents: 760dbd9 c3ed25b
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:42 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:42 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--




[07/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2015-08-05 Thread yukim
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/32bc8b0b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/32bc8b0b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/32bc8b0b

Branch: refs/heads/trunk
Commit: 32bc8b0b182176f0132522f821a1b13919efc63a
Parents: 5c59d5a 20f12e9
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:22 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:22 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/CHANGES.txt
--
diff --cc CHANGES.txt
index 66e5a0c,9a475ea..72ad3cd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,8 +1,28 @@@
 -2.1.9
 +2.2.1
 + * Log warning when using an aggregate without partition key (CASSANDRA-9737)
 + * Avoid grouping sstables for anticompaction with DTCS (CASSANDRA-9900)
 + * UDF / UDA execution time in trace (CASSANDRA-9723)
 + * Fix broken internode SSL (CASSANDRA-9884)
 +Merged from 2.1:
   * Cannot replace token does not exist - DN node removed as Fat Client (CASSANDRA-9871)
   * Fix handling of enable/disable autocompaction (CASSANDRA-9899)
 - * Commit log segment recycling is disabled by default (CASSANDRA-9896)
   * Add consistency level to tracing ouput (CASSANDRA-9827)
 + * Remove repair snapshot leftover on startup (CASSANDRA-7357)
 + * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
 + * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
++ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 +Merged from 2.0:
 + * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)
 + * Don't track hotness when opening from snapshot for validation (CASSANDRA-9382)
 +
 +
 +2.2.0
 + * Allow the selection of columns together with aggregates (CASSANDRA-9767)
 + * Fix cqlsh copy methods and other windows specific issues (CASSANDRA-9795)
 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797)
 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
 + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
 +Merged from 2.1:
   * Fix MarshalException when upgrading superColumn family (CASSANDRA-9582)
   * Fix broken logging for "empty" flushes in Memtable (CASSANDRA-9837)
   * Handle corrupt files on startup (CASSANDRA-9686)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/NodeProbe.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/nodetool/Info.java
--
diff --cc src/java/org/apache/cassandra/tools/nodetool/Info.java
index 5852fc7,000..0d9bd73
mode 100644,00..100644
--- a/src/java/org/apache/cassandra/tools/nodetool/Info.java
+++ b/src/java/org/apache/cassandra/tools/nodetool/Info.java
@@@ -1,153 -1,0 +1,162 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + * http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +package org.apache.cassandra.tools.nodetool;
 +
 +import io.airlift.command.Command;
 +import io.airlift.command.Option;
 +
 +import java.lang.management.MemoryUsage;
 +import java.util.Iterator;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.Map.Entry;
 +
 +import javax.management.InstanceNotFoundException;
 +
 +import org.apache.cassandra.db.ColumnFamilyStoreMBean;
 +import org.apache.cassandra.io.util.FileUtils;
 +import org.apache.cassandra.service.CacheServiceMBean;
 +import org.apache.cassandra.tools.NodeProbe;
 +import org.apache.cassandra.tools.NodeTool.NodeToolCmd;
 +
 +@Command(name =

[02/10] cassandra git commit: Fix nodetool info error when the node is not joined

2015-08-05 Thread yukim
Fix nodetool info error when the node is not joined

patch by yukim; reviewed by stefania for CASSANDRA-9031


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20f12e97
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20f12e97
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20f12e97

Branch: refs/heads/cassandra-2.2
Commit: 20f12e97446eee55461a8d3512a94389a67e79ee
Parents: 1a2c1bc
Author: Yuki Morishita 
Authored: Wed Aug 5 15:58:36 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:01:53 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++-
 .../org/apache/cassandra/tools/NodeTool.java| 21 ++--
 3 files changed, 18 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index de7cfa8..9a475ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,6 +12,7 @@
  * Remove repair snapshot leftover on startup (CASSANDRA-7357)
  * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
  * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
+ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 Merged from 2.0:
  * Don't cast expected bf size to an int (CASSANDRA-9959)
  * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java b/src/java/org/apache/cassandra/tools/NodeProbe.java
index d3bce4d..caa12c3 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -807,20 +807,8 @@ public class NodeProbe implements AutoCloseable
 
 public String getEndpoint()
 {
-// Try to find the endpoint using the local token, doing so in a crazy manner
-// to maintain backwards compatibility with the MBean interface
-String stringToken = ssProxy.getTokens().get(0);
-Map<String, String> tokenToEndpoint = ssProxy.getTokenToEndpointMap();
-
-for (Map.Entry<String, String> pair : tokenToEndpoint.entrySet())
-{
-if (pair.getKey().equals(stringToken))
-{
-return pair.getValue();
-}
-}
-
-throw new RuntimeException("Could not find myself in the endpoint list, something is very wrong!  Is the Cassandra node fully started?");
+Map<String, String> hostIdToEndpoint = ssProxy.getHostIdMap();
+return hostIdToEndpoint.get(ssProxy.getLocalHostId());
 }
 
 public String getDataCenter()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java
index a2d4ead..6a7a930 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -463,13 +463,22 @@ public class NodeTool
 probe.getCacheMetric("CounterCache", "HitRate"),
 cacheService.getCounterCacheSavePeriodInSeconds());
 
-// Tokens
-List<String> tokens = probe.getTokens();
-if (tokens.size() == 1 || this.tokens)
-for (String token : tokens)
-System.out.printf("%-23s: %s%n", "Token", token);
+// check if node is already joined, before getting tokens, since it throws exception if not.
+if (probe.isJoined())
+{
+// Tokens
+List<String> tokens = probe.getTokens();
+if (tokens.size() == 1 || this.tokens)
+for (String token : tokens)
+System.out.printf("%-23s: %s%n", "Token", token);
+else
+System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token",
+  tokens.size());
+}
 else
-System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token", tokens.size());
+{
+System.out.printf("%-23s: (node is not joined to the cluster)%n", "Token");
+}
 }
 
 /**
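
The behaviour this commit introduces can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the committed code: the `Probe` interface is a hypothetical stand-in for the handful of `NodeProbe` calls that appear in the diff, `TokenInfoSketch` and its method names are invented for the example, and the host ID to endpoint direction of the map is taken from the variable name `hostIdToEndpoint` in the diff.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only; Probe is a hypothetical stand-in for NodeProbe.
final class TokenInfoSketch
{
    interface Probe
    {
        boolean isJoined();
        List<String> getTokens();          // assumed to throw when the node is not joined
        String getLocalHostId();
        Map<String, String> getHostIdMap(); // assumed here to map host ID -> endpoint
    }

    // Looks the local endpoint up by host ID instead of by first token.
    // Returns null (rather than throwing) when the host ID is not in the map.
    static String endpoint(Probe probe)
    {
        return probe.getHostIdMap().get(probe.getLocalHostId());
    }

    // Builds the "Token" lines of the info output, asking for tokens
    // only after confirming that the node has joined.
    static String tokenLines(Probe probe, boolean showAllTokens)
    {
        if (!probe.isJoined())
            return String.format("%-23s: (node is not joined to the cluster)%n", "Token");

        List<String> tokens = probe.getTokens();
        if (tokens.size() == 1 || showAllTokens)
        {
            StringBuilder out = new StringBuilder();
            for (String token : tokens)
                out.append(String.format("%-23s: %s%n", "Token", token));
            return out.toString();
        }
        return String.format("%-23s: (invoke with -T/--tokens to see all %d tokens)%n",
                             "Token", tokens.size());
    }
}
```

Checking `isJoined()` first means the token call is never made on a node that has not joined, which is the situation the commit message describes.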



[01/10] cassandra git commit: Fix nodetool info error when the node is not joined

2015-08-05 Thread yukim
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 1a2c1bcdc -> 20f12e974
  refs/heads/cassandra-2.2 5c59d5af7 -> 32bc8b0b1
  refs/heads/cassandra-3.0 90e001312 -> c3ed25b0a
  refs/heads/trunk 760dbd957 -> c1aff4fa6


Fix nodetool info error when the node is not joined

patch by yukim; reviewed by stefania for CASSANDRA-9031


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20f12e97
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20f12e97
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20f12e97

Branch: refs/heads/cassandra-2.1
Commit: 20f12e97446eee55461a8d3512a94389a67e79ee
Parents: 1a2c1bc
Author: Yuki Morishita 
Authored: Wed Aug 5 15:58:36 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:01:53 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++-
 .../org/apache/cassandra/tools/NodeTool.java| 21 ++--
 3 files changed, 18 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index de7cfa8..9a475ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,6 +12,7 @@
  * Remove repair snapshot leftover on startup (CASSANDRA-7357)
  * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
  * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
+ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 Merged from 2.0:
  * Don't cast expected bf size to an int (CASSANDRA-9959)
  * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java b/src/java/org/apache/cassandra/tools/NodeProbe.java
index d3bce4d..caa12c3 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -807,20 +807,8 @@ public class NodeProbe implements AutoCloseable
 
 public String getEndpoint()
 {
-// Try to find the endpoint using the local token, doing so in a crazy manner
-// to maintain backwards compatibility with the MBean interface
-String stringToken = ssProxy.getTokens().get(0);
-Map<String, String> tokenToEndpoint = ssProxy.getTokenToEndpointMap();
-
-for (Map.Entry<String, String> pair : tokenToEndpoint.entrySet())
-{
-if (pair.getKey().equals(stringToken))
-{
-return pair.getValue();
-}
-}
-
-throw new RuntimeException("Could not find myself in the endpoint list, something is very wrong!  Is the Cassandra node fully started?");
+Map<String, String> hostIdToEndpoint = ssProxy.getHostIdMap();
+return hostIdToEndpoint.get(ssProxy.getLocalHostId());
 }
 
 public String getDataCenter()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java
index a2d4ead..6a7a930 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -463,13 +463,22 @@ public class NodeTool
 probe.getCacheMetric("CounterCache", "HitRate"),
 cacheService.getCounterCacheSavePeriodInSeconds());
 
-// Tokens
-List<String> tokens = probe.getTokens();
-if (tokens.size() == 1 || this.tokens)
-for (String token : tokens)
-System.out.printf("%-23s: %s%n", "Token", token);
+// check if node is already joined, before getting tokens, since it throws exception if not.
+if (probe.isJoined())
+{
+// Tokens
+List<String> tokens = probe.getTokens();
+if (tokens.size() == 1 || this.tokens)
+for (String token : tokens)
+System.out.printf("%-23s: %s%n", "Token", token);
+else
+System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token",
+  tokens.size());
+}
 else
-System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token", tokens.size());
+{
+System.out.printf("%-23s: (node is

[06/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2015-08-05 Thread yukim
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/32bc8b0b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/32bc8b0b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/32bc8b0b

Branch: refs/heads/cassandra-2.2
Commit: 32bc8b0b182176f0132522f821a1b13919efc63a
Parents: 5c59d5a 20f12e9
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:22 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:22 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/CHANGES.txt
--
diff --cc CHANGES.txt
index 66e5a0c,9a475ea..72ad3cd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,8 +1,28 @@@
 -2.1.9
 +2.2.1
 + * Log warning when using an aggregate without partition key (CASSANDRA-9737)
 + * Avoid grouping sstables for anticompaction with DTCS (CASSANDRA-9900)
 + * UDF / UDA execution time in trace (CASSANDRA-9723)
 + * Fix broken internode SSL (CASSANDRA-9884)
 +Merged from 2.1:
   * Cannot replace token does not exist - DN node removed as Fat Client (CASSANDRA-9871)
   * Fix handling of enable/disable autocompaction (CASSANDRA-9899)
 - * Commit log segment recycling is disabled by default (CASSANDRA-9896)
   * Add consistency level to tracing ouput (CASSANDRA-9827)
 + * Remove repair snapshot leftover on startup (CASSANDRA-7357)
 + * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
 + * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
++ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 +Merged from 2.0:
 + * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)
 + * Don't track hotness when opening from snapshot for validation (CASSANDRA-9382)
 +
 +
 +2.2.0
 + * Allow the selection of columns together with aggregates (CASSANDRA-9767)
 + * Fix cqlsh copy methods and other windows specific issues (CASSANDRA-9795)
 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797)
 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
 + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
 +Merged from 2.1:
   * Fix MarshalException when upgrading superColumn family (CASSANDRA-9582)
   * Fix broken logging for "empty" flushes in Memtable (CASSANDRA-9837)
   * Handle corrupt files on startup (CASSANDRA-9686)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/NodeProbe.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/nodetool/Info.java
--
diff --cc src/java/org/apache/cassandra/tools/nodetool/Info.java
index 5852fc7,000..0d9bd73
mode 100644,00..100644
--- a/src/java/org/apache/cassandra/tools/nodetool/Info.java
+++ b/src/java/org/apache/cassandra/tools/nodetool/Info.java
@@@ -1,153 -1,0 +1,162 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + * http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +package org.apache.cassandra.tools.nodetool;
 +
 +import io.airlift.command.Command;
 +import io.airlift.command.Option;
 +
 +import java.lang.management.MemoryUsage;
 +import java.util.Iterator;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.Map.Entry;
 +
 +import javax.management.InstanceNotFoundException;
 +
 +import org.apache.cassandra.db.ColumnFamilyStoreMBean;
 +import org.apache.cassandra.io.util.FileUtils;
 +import org.apache.cassandra.service.CacheServiceMBean;
 +import org.apache.cassandra.tools.NodeProbe;
 +import org.apache.cassandra.tools.NodeTool.NodeToolCmd;
 +
 +@Comman

[05/10] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2

2015-08-05 Thread yukim
Merge branch 'cassandra-2.1' into cassandra-2.2


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/32bc8b0b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/32bc8b0b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/32bc8b0b

Branch: refs/heads/cassandra-3.0
Commit: 32bc8b0b182176f0132522f821a1b13919efc63a
Parents: 5c59d5a 20f12e9
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:22 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:22 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/CHANGES.txt
--
diff --cc CHANGES.txt
index 66e5a0c,9a475ea..72ad3cd
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,8 +1,28 @@@
 -2.1.9
 +2.2.1
 + * Log warning when using an aggregate without partition key (CASSANDRA-9737)
 + * Avoid grouping sstables for anticompaction with DTCS (CASSANDRA-9900)
 + * UDF / UDA execution time in trace (CASSANDRA-9723)
 + * Fix broken internode SSL (CASSANDRA-9884)
 +Merged from 2.1:
   * Cannot replace token does not exist - DN node removed as Fat Client (CASSANDRA-9871)
   * Fix handling of enable/disable autocompaction (CASSANDRA-9899)
 - * Commit log segment recycling is disabled by default (CASSANDRA-9896)
   * Add consistency level to tracing ouput (CASSANDRA-9827)
 + * Remove repair snapshot leftover on startup (CASSANDRA-7357)
 + * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
 + * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
++ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 +Merged from 2.0:
 + * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)
 + * Don't track hotness when opening from snapshot for validation (CASSANDRA-9382)
 +
 +
 +2.2.0
 + * Allow the selection of columns together with aggregates (CASSANDRA-9767)
 + * Fix cqlsh copy methods and other windows specific issues (CASSANDRA-9795)
 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797)
 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
 + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
 +Merged from 2.1:
   * Fix MarshalException when upgrading superColumn family (CASSANDRA-9582)
   * Fix broken logging for "empty" flushes in Memtable (CASSANDRA-9837)
   * Handle corrupt files on startup (CASSANDRA-9686)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/NodeProbe.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/32bc8b0b/src/java/org/apache/cassandra/tools/nodetool/Info.java
--
diff --cc src/java/org/apache/cassandra/tools/nodetool/Info.java
index 5852fc7,000..0d9bd73
mode 100644,00..100644
--- a/src/java/org/apache/cassandra/tools/nodetool/Info.java
+++ b/src/java/org/apache/cassandra/tools/nodetool/Info.java
@@@ -1,153 -1,0 +1,162 @@@
 +/*
 + * Licensed to the Apache Software Foundation (ASF) under one
 + * or more contributor license agreements.  See the NOTICE file
 + * distributed with this work for additional information
 + * regarding copyright ownership.  The ASF licenses this file
 + * to you under the Apache License, Version 2.0 (the
 + * "License"); you may not use this file except in compliance
 + * with the License.  You may obtain a copy of the License at
 + *
 + * http://www.apache.org/licenses/LICENSE-2.0
 + *
 + * Unless required by applicable law or agreed to in writing, software
 + * distributed under the License is distributed on an "AS IS" BASIS,
 + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 + * See the License for the specific language governing permissions and
 + * limitations under the License.
 + */
 +package org.apache.cassandra.tools.nodetool;
 +
 +import io.airlift.command.Command;
 +import io.airlift.command.Option;
 +
 +import java.lang.management.MemoryUsage;
 +import java.util.Iterator;
 +import java.util.List;
 +import java.util.Map;
 +import java.util.Map.Entry;
 +
 +import javax.management.InstanceNotFoundException;
 +
 +import org.apache.cassandra.db.ColumnFamilyStoreMBean;
 +import org.apache.cassandra.io.util.FileUtils;
 +import org.apache.cassandra.service.CacheServiceMBean;
 +import org.apache.cassandra.tools.NodeProbe;
 +import org.apache.cassandra.tools.NodeTool.NodeToolCmd;
 +
 +@Comman

[04/10] cassandra git commit: Fix nodetool info error when the node is not joined

2015-08-05 Thread yukim
Fix nodetool info error when the node is not joined

patch by yukim; reviewed by stefania for CASSANDRA-9031


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20f12e97
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20f12e97
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20f12e97

Branch: refs/heads/trunk
Commit: 20f12e97446eee55461a8d3512a94389a67e79ee
Parents: 1a2c1bc
Author: Yuki Morishita 
Authored: Wed Aug 5 15:58:36 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:01:53 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++-
 .../org/apache/cassandra/tools/NodeTool.java| 21 ++--
 3 files changed, 18 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index de7cfa8..9a475ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,6 +12,7 @@
  * Remove repair snapshot leftover on startup (CASSANDRA-7357)
  * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
  * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
+ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 Merged from 2.0:
  * Don't cast expected bf size to an int (CASSANDRA-9959)
  * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java b/src/java/org/apache/cassandra/tools/NodeProbe.java
index d3bce4d..caa12c3 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -807,20 +807,8 @@ public class NodeProbe implements AutoCloseable
 
 public String getEndpoint()
 {
-// Try to find the endpoint using the local token, doing so in a crazy manner
-// to maintain backwards compatibility with the MBean interface
-String stringToken = ssProxy.getTokens().get(0);
-Map<String, String> tokenToEndpoint = ssProxy.getTokenToEndpointMap();
-
-for (Map.Entry<String, String> pair : tokenToEndpoint.entrySet())
-{
-if (pair.getKey().equals(stringToken))
-{
-return pair.getValue();
-}
-}
-
-throw new RuntimeException("Could not find myself in the endpoint list, something is very wrong!  Is the Cassandra node fully started?");
+Map<String, String> hostIdToEndpoint = ssProxy.getHostIdMap();
+return hostIdToEndpoint.get(ssProxy.getLocalHostId());
 }
 
 public String getDataCenter()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java
index a2d4ead..6a7a930 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -463,13 +463,22 @@ public class NodeTool
 probe.getCacheMetric("CounterCache", "HitRate"),
 cacheService.getCounterCacheSavePeriodInSeconds());
 
-// Tokens
-List<String> tokens = probe.getTokens();
-if (tokens.size() == 1 || this.tokens)
-for (String token : tokens)
-System.out.printf("%-23s: %s%n", "Token", token);
+// check if node is already joined, before getting tokens, since it throws exception if not.
+if (probe.isJoined())
+{
+// Tokens
+List<String> tokens = probe.getTokens();
+if (tokens.size() == 1 || this.tokens)
+for (String token : tokens)
+System.out.printf("%-23s: %s%n", "Token", token);
+else
+System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token",
+  tokens.size());
+}
 else
-System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token", tokens.size());
+{
+System.out.printf("%-23s: (node is not joined to the cluster)%n", "Token");
+}
 }
 
 /**



[03/10] cassandra git commit: Fix nodetool info error when the node is not joined

2015-08-05 Thread yukim
Fix nodetool info error when the node is not joined

patch by yukim; reviewed by stefania for CASSANDRA-9031


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/20f12e97
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/20f12e97
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/20f12e97

Branch: refs/heads/cassandra-3.0
Commit: 20f12e97446eee55461a8d3512a94389a67e79ee
Parents: 1a2c1bc
Author: Yuki Morishita 
Authored: Wed Aug 5 15:58:36 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:01:53 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++-
 .../org/apache/cassandra/tools/NodeTool.java| 21 ++--
 3 files changed, 18 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index de7cfa8..9a475ea 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,6 +12,7 @@
  * Remove repair snapshot leftover on startup (CASSANDRA-7357)
  * Use random nodes for batch log when only 2 racks (CASSANDRA-8735)
  * Ensure atomicity inside thrift and stream session (CASSANDRA-7757)
+ * Fix nodetool info error when the node is not joined (CASSANDRA-9031)
 Merged from 2.0:
  * Don't cast expected bf size to an int (CASSANDRA-9959)
  * Log when messages are dropped due to cross_node_timeout (CASSANDRA-9793)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java b/src/java/org/apache/cassandra/tools/NodeProbe.java
index d3bce4d..caa12c3 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -807,20 +807,8 @@ public class NodeProbe implements AutoCloseable
 
 public String getEndpoint()
 {
-// Try to find the endpoint using the local token, doing so in a crazy manner
-// to maintain backwards compatibility with the MBean interface
-String stringToken = ssProxy.getTokens().get(0);
-Map<String, String> tokenToEndpoint = ssProxy.getTokenToEndpointMap();
-
-for (Map.Entry<String, String> pair : tokenToEndpoint.entrySet())
-{
-if (pair.getKey().equals(stringToken))
-{
-return pair.getValue();
-}
-}
-
-throw new RuntimeException("Could not find myself in the endpoint list, something is very wrong!  Is the Cassandra node fully started?");
+Map<String, String> hostIdToEndpoint = ssProxy.getHostIdMap();
+return hostIdToEndpoint.get(ssProxy.getLocalHostId());
 }
 
 public String getDataCenter()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/20f12e97/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java
index a2d4ead..6a7a930 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -463,13 +463,22 @@ public class NodeTool
 probe.getCacheMetric("CounterCache", "HitRate"),
 cacheService.getCounterCacheSavePeriodInSeconds());
 
-// Tokens
-List<String> tokens = probe.getTokens();
-if (tokens.size() == 1 || this.tokens)
-for (String token : tokens)
-System.out.printf("%-23s: %s%n", "Token", token);
+// check if node is already joined, before getting tokens, since it throws exception if not.
+if (probe.isJoined())
+{
+// Tokens
+List<String> tokens = probe.getTokens();
+if (tokens.size() == 1 || this.tokens)
+for (String token : tokens)
+System.out.printf("%-23s: %s%n", "Token", token);
+else
+System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token",
+  tokens.size());
+}
 else
-System.out.printf("%-23s: (invoke with -T/--tokens to see all %d tokens)%n", "Token", tokens.size());
+{
+System.out.printf("%-23s: (node is not joined to the cluster)%n", "Token");
+}
 }
 
 /**



[09/10] cassandra git commit: Merge branch 'cassandra-2.2' into cassandra-3.0

2015-08-05 Thread yukim
Merge branch 'cassandra-2.2' into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c3ed25b0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c3ed25b0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c3ed25b0

Branch: refs/heads/trunk
Commit: c3ed25b0ad43aad0deaade1b915ff8310c9ca3fc
Parents: 90e0013 32bc8b0
Author: Yuki Morishita 
Authored: Wed Aug 5 16:10:33 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 16:10:33 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeProbe.java   | 16 ++
 .../apache/cassandra/tools/nodetool/Info.java   | 23 ++--
 3 files changed, 19 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3ed25b0/CHANGES.txt
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c3ed25b0/src/java/org/apache/cassandra/tools/NodeProbe.java
--



[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews

2015-08-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659028#comment-14659028
 ] 

Jonathan Ellis commented on CASSANDRA-9927:
---

Long term, we do want to support this.  But it's pretty late to start design 
for 3.0.

> Security for MaterializedViews
> --
>
> Key: CASSANDRA-9927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9927
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>  Labels: materializedviews
> Fix For: 3.0 beta 1
>
>
> We need to think about how to handle security wrt materialized views. Since 
> they are based on a source table we should possibly inherit the same security 
> model as that table.  
> However I can see cases where users would want to create different security 
> auth for different views.  esp once we have CASSANDRA-9664 and users can 
> filter out sensitive data.





[jira] [Commented] (CASSANDRA-9446) Failure detector should ignore local pauses per endpoint

2015-08-05 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659029#comment-14659029
 ] 

sankalp kohli commented on CASSANDRA-9446:
--

This is a simple approach; however, this patch won't mark anything down for 5 
seconds after a large pause. A much more granular approach would be to keep a 
lastInterpret time per endpoint and work from that. You can put this in 
ArrivalWindow. If you have not evaluated that endpoint for more than 5 seconds, 
you don't mark it down.
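
The per-endpoint idea above could look roughly like the sketch below. It is a minimal illustration, not the attached patch: `PerEndpointPauseGuard`, `mayConvict` and the string endpoint key are hypothetical names, the map stands in for a field that would live in ArrivalWindow, and treating a never-evaluated endpoint as not convictable is an assumption of the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remember when each endpoint was last interpreted and
// refuse to convict it if the gap since then exceeds the allowed pause.
final class PerEndpointPauseGuard
{
    private final long maxGapNanos;
    private final Map<String, Long> lastInterpret = new ConcurrentHashMap<>();

    PerEndpointPauseGuard(long maxGapNanos)
    {
        this.maxGapNanos = maxGapNanos;
    }

    // Records this evaluation and returns true only if the endpoint was
    // evaluated before and the gap since that evaluation is within the limit.
    boolean mayConvict(String endpoint, long nowNanos)
    {
        Long previous = lastInterpret.put(endpoint, nowNanos);
        return previous != null && nowNanos - previous <= maxGapNanos;
    }
}
```

With a 5 second limit, a long local pause suppresses only the first evaluation of each endpoint after the pause, and each endpoint recovers independently of the others.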

> Failure detector should ignore local pauses per endpoint
> 
>
> Key: CASSANDRA-9446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9446
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Brandon Williams
>Priority: Minor
> Attachments: 9446.txt, 9644-v2.txt
>
>
> In CASSANDRA-9183, we added a feature to ignore local pauses. But it will 
> only not mark 2 endpoints as down. 
> We should do this per endpoint as suggested by Brandon in CASSANDRA-9183. 





[jira] [Commented] (CASSANDRA-9777) If you have a ~/.cqlshrc and a ~/.cassandra/cqlshrc, cqlsh will overwrite the latter with the former

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658997#comment-14658997
 ] 

Aleksey Yeschenko commented on CASSANDRA-9777:
--

[~mishail] What Tyler said.

> If you have a ~/.cqlshrc and a ~/.cassandra/cqlshrc, cqlsh will overwrite the 
> latter with the former
> 
>
> Key: CASSANDRA-9777
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9777
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jon Moses
>Assignee: David Kua
>  Labels: cqlsh
> Fix For: 2.2.x
>
>
> If you have a .cqlshrc file, and a ~/.cassandra/cqlshrc file, when you run 
> `cqlsh`, it will overwrite the latter with the former.  
> https://github.com/apache/cassandra/blob/trunk/bin/cqlsh#L202
> If the 'new' path exists (~/.cassandra/cqlshrc), cqlsh should either WARN or 
> just leave the files alone.
> {noformat}
> ~$ cat .cqlshrc
> [authentication]
> ~$ cat .cassandra/cqlshrc
> [connection]
> ~$ cqlsh
> ~$ cat .cqlshrc
> cat: .cqlshrc: No such file or directory
> ~$ cat .cassandra/cqlshrc
> [authentication]
> ~$
> {noformat}





[jira] [Updated] (CASSANDRA-9965) ColumnFamilyStore.setCompactionStrategyClass() is (somewhat) broken

2015-08-05 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9965:
-
Reviewer: Aleksey Yeschenko

> ColumnFamilyStore.setCompactionStrategyClass() is (somewhat) broken
> ---
>
> Key: CASSANDRA-9965
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9965
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 2.1.x, 2.2.x, 3.0.0 rc1
>
>
> {{ColumnFamilyStore.setCompactionStrategyClass()}} should get the same 
> treatment wrt JMX/schema switches that {{enabled}} got in CASSANDRA-9899.
> It should also not alter the {{CFMetaData}} object directly, ever. Only DDL 
> statements should be allowed to do that.
> CASSANDRA-9712 will temporarily throw UOE for that call.





[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658990#comment-14658990
 ] 

Aleksey Yeschenko commented on CASSANDRA-9927:
--

bq. Why can't we just inherit base table permissions for 3.0?

That was the initial suggestion, still in CASSANDRA-6477 comments. Jake raised 
a valid point that a user might want to grant access to the view that is a 
subset of the columns in the base to more users (if a view doesn't have 
anything sensitive, but the base table does).

What do we want to do in the long term?

> Security for MaterializedViews
> --
>
> Key: CASSANDRA-9927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9927
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>  Labels: materializedviews
> Fix For: 3.0 beta 1
>
>
> We need to think about how to handle security wrt materialized views. Since 
> they are based on a source table we should possibly inherit the same security 
> model as that table.  
> However I can see cases where users would want to create different security 
> auth for different views.  esp once we have CASSANDRA-9664 and users can 
> filter out sensitive data.





[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658969#comment-14658969
 ] 

Aleksey Yeschenko commented on CASSANDRA-6717:
--

[~nutbunnies] Secondary indexes, UDFs, UDAs, at least. A much wider range of 
tables - including {{COMPACT STORAGE}}.

> Modernize schema tables
> ---
>
> Key: CASSANDRA-6717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Sylvain Lebresne
>Assignee: Aleksey Yeschenko
>  Labels: client-impacting, doc-impacting
> Fix For: 3.0 beta 1
>
>
> There are a few problems with, and possible improvements to, the way we store 
> schema:
> # CASSANDRA-4988: as explained on the ticket, storing the comparator is now 
> redundant (or almost: we'd need to store whether the table is COMPACT or not 
> too, which we don't currently do, but that is easy and probably a good idea 
> anyway), it can be entirely reconstructed from the information in 
> schema_columns (the same is true of key_validator and subcomparator, and 
> replacing default_validator by a COMPACT_VALUE column in all cases is 
> relatively simple). And storing the comparator as an opaque string broke 
> concurrent updates of sub-parts of said comparator (concurrent collection 
> addition or altering 2 separate clustering columns typically) so it's really 
> worth removing it.
> # CASSANDRA-4603: it's time to get rid of those ugly json maps. I'll note 
> that schema_keyspaces is a problem due to its use of COMPACT STORAGE, but I 
> think we should fix it once and for all nonetheless (see below).
> # For CASSANDRA-6382 and to allow indexing both map keys and values at the 
> same time, we'd need to be able to have more than one index definition for a 
> given column.
> # There are a few mismatches in table options between the ones stored in the 
> schema and the ones used when declaring/altering a table which would be nice 
> to fix. The compaction, compression and replication maps are ones already 
> mentioned from CASSANDRA-4603, but also for some reason 
> 'dclocal_read_repair_chance' in CQL is called just 'local_read_repair_chance' 
> in the schema table, and 'min/max_compaction_threshold' are column family 
> options in the schema but just compaction options for CQL (which makes more 
> sense).
> None of those issues are major, and we could probably deal with them 
> independently but it might be simpler to just fix them all in one shot so I 
> wanted to sum them all up here. In particular, the fact that 
> 'schema_keyspaces' uses COMPACT STORAGE is annoying (for the replication map, 
> but it may limit future stuff too) which suggests we should migrate it to a 
> new, non COMPACT table. And while that's arguably a detail, it wouldn't hurt 
> to rename schema_columnfamilies to schema_tables for the years to come since 
> that's the preferred vernacular for CQL.
> Overall, what I would suggest is to move all schema tables to a new keyspace, 
> named 'schema' for instance (or 'system_schema' but I prefer the shorter 
> version), and fix all the issues above at once. Since we currently don't 
> exchange schema between nodes of different versions, all we'd need to do that 
> is a one shot startup migration, and overall, I think it could be simpler for 
> clients to deal with one clear migration than to have to handle minor 
> individual changes all over the place. I also think it's somewhat cleaner 
> conceptually to have schema tables in their own keyspace since they are 
> replicated through a different mechanism than other system tables.
> If we do that, we could, for instance, migrate to the following schema tables 
> (details up for discussion of course):
> {noformat}
> CREATE TYPE user_type (
>   name text,
>   column_names list,
>   column_types list
> )
> CREATE TABLE keyspaces (
>   name text PRIMARY KEY,
>   durable_writes boolean,
>   replication map,
>   user_types map
> )
> CREATE TYPE trigger_definition (
>   name text,
>   options map
> )
> CREATE TABLE tables (
>   keyspace text,
>   name text,
>   id uuid,
>   table_type text, // COMPACT, CQL or SUPER
>   dropped_columns map,
>   triggers map,
>   // options
>   comment text,
>   compaction map,
>   compression map,
>   read_repair_chance double,
>   dclocal_read_repair_chance double,
>   gc_grace_seconds int,
>   caching text,
>   rows_per_partition_to_cache text,
>   default_time_to_live int,
>   min_index_interval int,
>   max_index_interval int,
>   speculative_retry text,
>   populate_io_cache_on_flush boolean,
>   bloom_filter_fp_chance double,
>   memtable_flush_period_in_ms int,
>   PRIMARY KEY (keyspace, name)
> )
> CREATE TYPE index_definition (
>   name text,
>   index_type text,
>   options map
> )
> CREATE TABLE columns (
>   keyspace text,
>   table text,
>   

[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Adam Holmberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658970#comment-14658970
 ] 

Adam Holmberg commented on CASSANDRA-9302:
--

Pure Python murmur3 is available on a branch 
[here|https://github.com/datastax/python-driver/tree/363]. Ready for Python 2. 
Still going to look at Python 3 and some possible optimizations.

I hope nobody is surprised that the Python implementation is much slower than 
the C implementation. There is a fair amount of bit twiddling while fighting 
the Python integer system to mimic fixed width types and logical shifts. 
Hopefully we can amortize this with batching by partition and/or giving it more 
processes.
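
To make the "fixed width types and logical shifts" point concrete, here is a rough sketch in Java, where those operations are native. It is not the driver's code: `Fmix64Demo` is a made-up name, and `BigInteger` is used only as a stand-in for Python's unbounded integers. It runs MurmurHash3's 64-bit finalizer both ways to show where the extra masking comes from.

```java
import java.math.BigInteger;

// Illustration only: BigInteger stands in for Python's unbounded int.
class Fmix64Demo
{
    private static final BigInteger MASK64 = BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
    private static final BigInteger C1 = new BigInteger("ff51afd7ed558ccd", 16);
    private static final BigInteger C2 = new BigInteger("c4ceb9fe1a85ec53", 16);

    // MurmurHash3's 64-bit finalizer using native fixed-width arithmetic.
    static long fmixNative(long k)
    {
        k ^= k >>> 33;              // logical (unsigned) right shift
        k *= 0xff51afd7ed558ccdL;   // multiplication wraps modulo 2^64
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // The same finalizer on unbounded integers: each multiply needs an explicit mask.
    static long fmixUnbounded(long input)
    {
        BigInteger k = BigInteger.valueOf(input).and(MASK64); // reinterpret as unsigned 64-bit
        k = k.xor(k.shiftRight(33));
        k = k.multiply(C1).and(MASK64);
        k = k.xor(k.shiftRight(33));
        k = k.multiply(C2).and(MASK64);
        k = k.xor(k.shiftRight(33));
        return k.longValue();       // keep the low 64 bits as a signed long
    }

    public static void main(String[] args)
    {
        long[] samples = { 0L, 1L, -1L, 42L, Long.MIN_VALUE, Long.MAX_VALUE };
        for (long s : samples)
            System.out.println(s + " -> same result: " + (fmixNative(s) == fmixUnbounded(s)));
    }
}
```

The unbounded version allocates a new object at nearly every step, which is roughly the kind of overhead a pure Python implementation has to pay per hashed block.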

Has there been discussion anywhere about implementing a loader on the Java 
driver, now that it's bundled with the server?

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion about moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Commented] (CASSANDRA-9533) Make batch commitlog mode easier to tune

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658966#comment-14658966
 ] 

Ariel Weisberg commented on CASSANDRA-9533:
---

OK, round #2. This just means that as long as there is work pending sync, it 
starts syncing immediately. So you get smart batching?

Sounds reasonable for a proper setup, but if you point this at a consumer SSD, 
is it just going to chew through erase blocks all day long as writes trickle in?

I am +1 on this. I am just probing for a corner case where it is undesirable.
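
For readers following along, a minimal sketch of what "smart batching" means here, assuming a single sync thread and a simple queue of pending records. `SmartBatcher`, `submit` and `syncPending` are hypothetical names, not the commitlog's API, and the fsync is a stand-in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of sync-as-soon-as-work-is-pending batching.
class SmartBatcher
{
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private int syncCount = 0;

    // Writer side: enqueue a record that is waiting to be synced.
    void submit(String record)
    {
        pending.add(record);
    }

    // Sync side: take everything queued so far and sync it in one go.
    // Records arriving during the sync form the next batch.
    // Returns the batch size (0 if nothing was pending, in which case no sync happens).
    int syncPending()
    {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty())
            fsync(batch);
        return batch.size();
    }

    int syncCount()
    {
        return syncCount;
    }

    // Stand-in for the real disk sync; here it only counts calls.
    private void fsync(List<String> batch)
    {
        syncCount++;
    }

    public static void main(String[] args)
    {
        SmartBatcher batcher = new SmartBatcher();
        batcher.submit("a");
        batcher.submit("b");
        batcher.submit("c");
        System.out.println("burst batch size: " + batcher.syncPending());
        batcher.submit("d");
        System.out.println("trickle batch size: " + batcher.syncPending());
        System.out.println("syncs issued: " + batcher.syncCount());
    }
}
```

Under a burst the batch grows with the time each sync takes, but a trickle of writes degenerates into one sync per write, which is the consumer-SSD corner case raised above.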

> Make batch commitlog mode easier to tune
> 
>
> Key: CASSANDRA-9533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Benedict
> Fix For: 3.x
>
>
> As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms 
> from a maximum time to wait between fsync to the minimum time, so one must be 
> very careful to keep it small enough that most writers aren't kept waiting.





[jira] [Created] (CASSANDRA-9996) Extra "keyspace updated" SchemaChange when creating/removing a table

2015-08-05 Thread Olivier Michallat (JIRA)
Olivier Michallat created CASSANDRA-9996:


 Summary: Extra "keyspace updated" SchemaChange when 
creating/removing a table
 Key: CASSANDRA-9996
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9996
 Project: Cassandra
  Issue Type: Bug
Reporter: Olivier Michallat
Priority: Minor


When a table gets created or removed, 2.2 sends an extra "keyspace updated" 
schema change event in addition to the normal "table created" event. 2.1 only 
sends table created.

In {{LegacySchemaTables#mergeKeyspaces}}, the keyspace is added to {{altered}} 
so it calls {{Schema#updateKeyspace}}, which triggers the event.





[jira] [Commented] (CASSANDRA-6717) Modernize schema tables

2015-08-05 Thread Andrew Hust (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658951#comment-14658951
 ] 

Andrew Hust commented on CASSANDRA-6717:


I added some 
[dtests|https://github.com/riptano/cassandra-dtest/commit/c346c6b87ec081956330e5b3cab2178e4c5b8a23]
 to check schema metadata -- anything additional that should be added?


> Modernize schema tables
> ---
>
> Key: CASSANDRA-6717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6717
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Sylvain Lebresne
>Assignee: Aleksey Yeschenko
>  Labels: client-impacting, doc-impacting
> Fix For: 3.0 beta 1
>
>

[jira] [Comment Edited] (CASSANDRA-9533) Make batch commitlog mode easier to tune

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658924#comment-14658924
 ] 

Ariel Weisberg edited comment on CASSANDRA-9533 at 8/5/15 9:14 PM:
---

-Is this updating a shared cache line for every single write that flows through 
C* even though the signaling only needs to occur infrequently? Can you 
reformulate this so a write only occurs if the thread being notified is 
actually asleep?-

-Looking at things it seems a little barn door, but I can attest to the big 
difference it makes. The CL I wrote was very sensitive to submission cost and 
topped out around 250k operations/second and that number did vary a lot 
depending on how much shared state had to be mutated as part of submission. It 
was the one big not split lock in the system.-

That's a stupid comment. The thread is going to go to sleep, which is going to 
dominate.


was (Author: aweisberg):
Is this updating a shared cache line for every single write that flows through 
C* even though the signaling only needs to occur infrequently? Can you 
reformulate this so a write only occurs if the thread being notified is 
actually asleep?

Looking at things it seems a little barn door, but I can attest to the big 
difference it makes. The CL I wrote was very sensitive to submission cost and 
topped out around 250k operations/second and that number did vary a lot 
depending on how much shared state had to be mutated as part of submission. It 
was the one big not split lock in the system.

> Make batch commitlog mode easier to tune
> 
>
> Key: CASSANDRA-9533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Benedict
> Fix For: 3.x
>
>
> As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms 
> from a maximum time to wait between fsync to the minimum time, so one must be 
> very careful to keep it small enough that most writers aren't kept waiting.





[jira] [Commented] (CASSANDRA-9533) Make batch commitlog mode easier to tune

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658924#comment-14658924
 ] 

Ariel Weisberg commented on CASSANDRA-9533:
---

Is this updating a shared cache line for every single write that flows through 
C* even though the signaling only needs to occur infrequently? Can you 
reformulate this so a write only occurs if the thread being notified is 
actually asleep?

Looking at things it seems a little barn door, but I can attest to the big 
difference it makes. The CL I wrote was very sensitive to submission cost and 
topped out around 250k operations/second and that number did vary a lot 
depending on how much shared state had to be mutated as part of submission. It 
was the one big not split lock in the system.
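
A rough sketch of the reformulation asked about above, assuming one sync thread that parks when idle. All names (`SleepAwareSignal`, `signalIfAsleep`, `awaitWork`) are hypothetical and not taken from the patch. The writer path does only a volatile read unless the sync thread has announced that it is going to sleep.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these names come from the patch under review.
class SleepAwareSignal
{
    // Non-null only while the sync thread is parked or about to park.
    private volatile Thread sleeper;

    boolean isAsleep()
    {
        return sleeper != null;
    }

    // Writer path: call after publishing work. Reads the field, never stores to it.
    void signalIfAsleep()
    {
        Thread t = sleeper;
        if (t != null)
            LockSupport.unpark(t);
    }

    // Sync-thread path: announce the intent to sleep, then re-check for work
    // before parking so that a concurrent writer is not missed.
    void awaitWork(BooleanSupplier hasWork)
    {
        sleeper = Thread.currentThread();
        try
        {
            while (!hasWork.getAsBoolean())
                LockSupport.park(this);
        }
        finally
        {
            sleeper = null;
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        SleepAwareSignal signal = new SleepAwareSignal();
        AtomicBoolean work = new AtomicBoolean(false);
        Thread syncer = new Thread(() -> signal.awaitWork(work::get));
        syncer.start();
        while (!signal.isAsleep())
            Thread.yield();
        work.set(true);            // publish work first
        signal.signalIfAsleep();   // then wake the sleeper if there is one
        syncer.join();
        System.out.println("sync thread woke up");
    }
}
```

The sketch relies on `hasWork` reading a volatile or atomic variable, so that the writer's "publish, then check sleeper" and the sync thread's "set sleeper, then check work" cannot both miss each other.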

> Make batch commitlog mode easier to tune
> 
>
> Key: CASSANDRA-9533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Benedict
> Fix For: 3.x
>
>
> As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms 
> from a maximum time to wait between fsync to the minimum time, so one must be 
> very careful to keep it small enough that most writers aren't kept waiting.





[jira] [Commented] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658893#comment-14658893
 ] 

Ariel Weisberg commented on CASSANDRA-9985:
---

Can you use the AbstractIterator test from Guava on this? It's a chance to 
check for any gotchas they may have come across and quietly engineered around. 

Can I bring up the icache bogeyman? Guava appears to have two AbstractIterator 
implementations: one in base and one in collect?

Maybe what we want is to replace Guava's abstract iterator(s) by putting it in 
their package?

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.
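
As an illustration of how small such a class can be, here is a minimal sketch of an AbstractIterator-style base class. It is not the class from the actual patch: `SimpleAbstractIterator` and `CountTo` are made-up names, and the `computeNext()`/`endOfData()` contract is modelled on Guava's.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical minimal replacement; not the class from the actual patch.
abstract class SimpleAbstractIterator<T> implements Iterator<T>
{
    private enum State { MUST_FETCH, HAS_NEXT, DONE, FAILED }

    private State state = State.MUST_FETCH;
    private T next;

    // Return the next element, or return endOfData() once exhausted.
    protected abstract T computeNext();

    protected final T endOfData()
    {
        state = State.DONE;
        return null;
    }

    public final boolean hasNext()
    {
        switch (state)
        {
            case HAS_NEXT:
                return true;
            case DONE:
                return false;
            case FAILED:
                throw new IllegalStateException("computeNext() failed previously");
            default:
                state = State.FAILED;   // remains FAILED if computeNext() throws
                next = computeNext();
                if (state != State.DONE)
                    state = State.HAS_NEXT;
                return state == State.HAS_NEXT;
        }
    }

    public final T next()
    {
        if (!hasNext())
            throw new NoSuchElementException();
        state = State.MUST_FETCH;
        T result = next;
        next = null;                    // do not retain the element after handing it out
        return result;
    }
}

// Example subclass: yields 0, 1, ..., limit - 1.
class CountTo extends SimpleAbstractIterator<Integer>
{
    private final int limit;
    private int i = 0;

    CountTo(int limit)
    {
        this.limit = limit;
    }

    protected Integer computeNext()
    {
        if (i < limit)
            return i++;
        return endOfData();
    }
}
```

Everything happens in `hasNext()` and `next()` with no further delegation, which addresses the call-depth and debugging complaints. Running Guava's own AbstractIterator tests against such a class, as suggested in the comment, would be a reasonable way to catch missed edge cases.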





[2/3] cassandra git commit: Use byte to serialize MT hash length

2015-08-05 Thread yukim
Use byte to serialize MT hash length

patch by Bharatendra Boddu; reviewed by yukim for CASSANDRA-9792


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/90e00131
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/90e00131
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/90e00131

Branch: refs/heads/trunk
Commit: 90e0013126e4875d696891c67d1b22fdb2b8ba7a
Parents: bd7d119
Author: Bharatendra Boddu 
Authored: Wed Aug 5 15:30:00 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 15:41:30 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/utils/MerkleTree.java  | 56 +---
 2 files changed, 39 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/90e00131/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e1f1757..2a5be78 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -2,6 +2,7 @@
  * Disable scripted UDFs by default (CASSANDRA-9889)
  * Add transparent data encryption core classes (CASSANDRA-9945)
  * Bytecode inspection for Java-UDFs (CASSANDRA-9890)
+ * Use byte to serialize MT hash length (CASSANDRA-9792)
 Merged from 2.2:
  * Log warning when using an aggregate without partition key (CASSANDRA-9737)
 Merged from 2.1:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/90e00131/src/java/org/apache/cassandra/utils/MerkleTree.java
--
diff --git a/src/java/org/apache/cassandra/utils/MerkleTree.java 
b/src/java/org/apache/cassandra/utils/MerkleTree.java
index 3840622..b4a782d 100644
--- a/src/java/org/apache/cassandra/utils/MerkleTree.java
+++ b/src/java/org/apache/cassandra/utils/MerkleTree.java
@@ -35,6 +35,7 @@ import org.apache.cassandra.exceptions.ConfigurationException;
 import org.apache.cassandra.io.IVersionedSerializer;
 import org.apache.cassandra.io.util.DataInputPlus;
 import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.net.MessagingService;
 
 /**
  * A MerkleTree implemented as a binary tree.
@@ -817,12 +818,15 @@ public class MerkleTree implements Serializable
 {
 public void serialize(Inner inner, DataOutputPlus out, int 
version) throws IOException
 {
-if (inner.hash == null)
-out.writeInt(-1);
-else
+if (version < MessagingService.VERSION_30)
 {
-out.writeInt(inner.hash.length);
-out.write(inner.hash);
+if (inner.hash == null)
+out.writeInt(-1);
+else
+{
+out.writeInt(inner.hash.length);
+out.write(inner.hash);
+}
 }
 Token.serializer.serialize(inner.token, out, version);
 Hashable.serializer.serialize(inner.lchild, out, version);
@@ -831,10 +835,13 @@ public class MerkleTree implements Serializable
 
 public Inner deserialize(DataInput in, IPartitioner p, int 
version) throws IOException
 {
-int hashLen = in.readInt();
-byte[] hash = hashLen >= 0 ? new byte[hashLen] : null;
-if (hash != null)
-in.readFully(hash);
+if (version < MessagingService.VERSION_30)
+{
+int hashLen = in.readInt();
+byte[] hash = hashLen >= 0 ? new byte[hashLen] : null;
+if (hash != null)
+in.readFully(hash);
+}
 Token token = Token.serializer.deserialize(in, p, version);
 Hashable lchild = Hashable.serializer.deserialize(in, p, 
version);
 Hashable rchild = Hashable.serializer.deserialize(in, p, 
version);
@@ -843,9 +850,13 @@ public class MerkleTree implements Serializable
 
 public long serializedSize(Inner inner, int version)
 {
-int size = inner.hash == null
-? TypeSizes.sizeof(-1)
-: TypeSizes.sizeof(inner.hash().length) + 
inner.hash().length;
+long size = 0;
+if (version < MessagingService.VERSION_30)
+{
+size += inner.hash == null
+   ? TypeSizes.sizeof(-1)
+   : TypeSizes.sizeof(inner.hash().length) 
+ inner.hash().length;
+}
 
 size += Token.serializer.serializedSize(inner.token, version)
 + Hashable.serializer.serializedSize(

[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2015-08-05 Thread yukim
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/760dbd95
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/760dbd95
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/760dbd95

Branch: refs/heads/trunk
Commit: 760dbd957c3c2cc69ea7d74a954891bbe3a156b4
Parents: de49ed8 90e0013
Author: Yuki Morishita 
Authored: Wed Aug 5 15:43:30 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 15:43:30 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/utils/MerkleTree.java  | 56 +---
 2 files changed, 39 insertions(+), 18 deletions(-)
--




[1/3] cassandra git commit: Use byte to serialize MT hash length

2015-08-05 Thread yukim
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 bd7d1198a -> 90e001312
  refs/heads/trunk de49ed84c -> 760dbd957


Use byte to serialize MT hash length

patch by Bharatendra Boddu; reviewed by yukim for CASSANDRA-9792


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/90e00131
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/90e00131
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/90e00131

Branch: refs/heads/cassandra-3.0
Commit: 90e0013126e4875d696891c67d1b22fdb2b8ba7a
Parents: bd7d119
Author: Bharatendra Boddu 
Authored: Wed Aug 5 15:30:00 2015 -0500
Committer: Yuki Morishita 
Committed: Wed Aug 5 15:41:30 2015 -0500

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/utils/MerkleTree.java  | 56 +---
 2 files changed, 39 insertions(+), 18 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/90e00131/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index e1f1757..2a5be78 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -2,6 +2,7 @@
  * Disable scripted UDFs by default (CASSANDRA-9889)
  * Add transparent data encryption core classes (CASSANDRA-9945)
  * Bytecode inspection for Java-UDFs (CASSANDRA-9890)
+ * Use byte to serialize MT hash length (CASSANDRA-9792)
 Merged from 2.2:
  * Log warning when using an aggregate without partition key (CASSANDRA-9737)
 Merged from 2.1:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/90e00131/src/java/org/apache/cassandra/utils/MerkleTree.java
--
diff --git a/src/java/org/apache/cassandra/utils/MerkleTree.java 
b/src/java/org/apache/cassandra/utils/MerkleTree.java
index 3840622..b4a782d 100644
--- a/src/java/org/apache/cassandra/utils/MerkleTree.java
+++ b/src/java/org/apache/cassandra/utils/MerkleTree.java
@@ -35,6 +35,7 @@ import org.apache.cassandra.exceptions.ConfigurationException;
 import org.apache.cassandra.io.IVersionedSerializer;
 import org.apache.cassandra.io.util.DataInputPlus;
 import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.net.MessagingService;
 
 /**
  * A MerkleTree implemented as a binary tree.
@@ -817,12 +818,15 @@ public class MerkleTree implements Serializable
 {
 public void serialize(Inner inner, DataOutputPlus out, int 
version) throws IOException
 {
-if (inner.hash == null)
-out.writeInt(-1);
-else
+if (version < MessagingService.VERSION_30)
 {
-out.writeInt(inner.hash.length);
-out.write(inner.hash);
+if (inner.hash == null)
+out.writeInt(-1);
+else
+{
+out.writeInt(inner.hash.length);
+out.write(inner.hash);
+}
 }
 Token.serializer.serialize(inner.token, out, version);
 Hashable.serializer.serialize(inner.lchild, out, version);
@@ -831,10 +835,13 @@ public class MerkleTree implements Serializable
 
 public Inner deserialize(DataInput in, IPartitioner p, int 
version) throws IOException
 {
-int hashLen = in.readInt();
-byte[] hash = hashLen >= 0 ? new byte[hashLen] : null;
-if (hash != null)
-in.readFully(hash);
+if (version < MessagingService.VERSION_30)
+{
+int hashLen = in.readInt();
+byte[] hash = hashLen >= 0 ? new byte[hashLen] : null;
+if (hash != null)
+in.readFully(hash);
+}
 Token token = Token.serializer.deserialize(in, p, version);
 Hashable lchild = Hashable.serializer.deserialize(in, p, 
version);
 Hashable rchild = Hashable.serializer.deserialize(in, p, 
version);
@@ -843,9 +850,13 @@ public class MerkleTree implements Serializable
 
 public long serializedSize(Inner inner, int version)
 {
-int size = inner.hash == null
-? TypeSizes.sizeof(-1)
-: TypeSizes.sizeof(inner.hash().length) + 
inner.hash().length;
+long size = 0;
+if (version < MessagingService.VERSION_30)
+{
+size += inner.hash == null
+   ? TypeSizes.sizeof(-1)
+   : TypeSizes.sizeof(inner.hash().length) 
+ inner.hash().length;
+   

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-05 Thread mlowicki (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658878#comment-14658878
 ] 

mlowicki commented on CASSANDRA-9935:
-

It didn't print anything to the console on any of the nodes. I can grep through 
system.log or attach logs from each box if that helps.

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading 
> to 2.1.8 repair runs faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-ca

[jira] [Commented] (CASSANDRA-9932) Make all partitions btree backed

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658873#comment-14658873
 ] 

Ariel Weisberg commented on CASSANDRA-9932:
---

This looks really nice, especially how it removes duplicated code. At a high 
level I have nothing to complain about. At a low level it's pretty hard to have 
confidence from just inspecting the code, so I want to focus on how convincing 
the tests are. I will do coverage for the utests now, and continue reviewing after.

Code coverage for utests, and dtests separately and together would be helpful 
in reviewing this. I could generate that myself if I knew the magic incantation 
for code coverage and dtests. This change to a large extent looks like a 
refactor and not new code so I am trying not to wade too deep into the existing 
code although as you will see I fail at that in my review.

CachedPartition still refers to ArrayBackedPartition in comments. Worth a pass 
to clean up comments pointing to the removed classes.

With default methods it really seems to me that we always want @Override 
annotations. When someone adds a default they can be careful and add the 
@Override, but when someone is implementing an interface, do they have to go and 
check which methods have defaults and just add it for those, or use it all the 
time? For me the refactor pain from missing annotations heavily outweighs the 
extra typing/text.

AbstractBTreePartition.Holder.with appears to be unused.

I ran code coverage for some tests and there seems to be untested stuff in 
BTree such as transformAndFilter. reverse(), which is new for this patch, also 
has no coverage. I didn't check coverage from other tests. I don't see unit 
tests for the various partition types in isolation. Coverage of 
AtomicBTreePartition from PartitionTest doesn't include stuff like waste 
tracking or pessimistic locking. Is that coverage supposed to come indirectly 
via Memtable tests? AbstractBTreePartition also has stuff that isn't tested.

> Make all partitions btree backed
> 
>
> Key: CASSANDRA-9932
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9932
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.0.0 rc1
>
>
> Following on from the other btree related refactors, this patch makes all 
> partition (and partition-like) objects backed by the same basic structure: 
> {{AbstractBTreePartition}}, with two main offshoots: 
> {{ImmutableBTreePartition}} and {{AtomicBTreePartition}}.
> The main upshot is a 30% net code reduction, meaning better exercise of btree 
> code paths and fewer new code paths to go wrong. A secondary upshot is that, 
> by funnelling all our comparisons through a btree, there is a higher 
> likelihood of icache occupancy and we have only one area to focus delivery of 
> improvements for their enjoyment by all.





[jira] [Commented] (CASSANDRA-8921) Experiment with a probabilistic tree of membership for maxPurgeableTimestamp

2015-08-05 Thread Daniel Chia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658868#comment-14658868
 ] 

Daniel Chia commented on CASSANDRA-8921:


[~benedict] if no one is working on this, I might give this a go.

> Experiment with a probabilistic tree of membership for maxPurgeableTimestamp
> 
>
> Key: CASSANDRA-8921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8921
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>
> maxPurgeableTimestamp appears to be a significant cost for some workloads, 
> the majority of which stems from the cost of membership tests across the 
> overlapping tables. It would be possible to construct a tree of bloom filters 
> from the existing filters that could answer queries for the set of possible 
> memberships of a given key with logarithmic performance, and it appears there 
> is a research paper (that I haven't dived into yet) that outlines something 
> like this: http://www.usna.edu/Users/cs/adina/research/Bloofi%20_CloudI2013.pdf
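
A minimal sketch of the tree-of-filters idea, for illustration only. It is neither the algorithm from the linked paper nor Cassandra code: the Bloom and Node classes, the m and k parameters and the sstable names are all hypothetical. Each inner node stores the bitwise OR of its children's filters, so a lookup only descends into subtrees whose combined filter might contain the key.

```python
import hashlib


class Bloom:
    """Toy bloom filter backed by a Python int used as a bit set."""

    def __init__(self, bits=0, m=1024, k=3):
        self.bits, self.m, self.k = bits, m, k

    def _positions(self, key):
        # Derive k bit positions from salted SHA-256 digests of the key.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

    def union(self, other):
        # The OR of two filters answers "might either side contain the key?"
        return Bloom(self.bits | other.bits, self.m, self.k)


class Node:
    def __init__(self, bloom, name=None, children=()):
        self.bloom, self.name, self.children = bloom, name, tuple(children)


def build_tree(leaves):
    """Pair nodes level by level until a single root remains."""
    level = list(leaves)
    while len(level) > 1:
        paired = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            bloom = group[0].bloom
            for node in group[1:]:
                bloom = bloom.union(node.bloom)
            paired.append(Node(bloom, children=group))
        level = paired
    return level[0]


def candidates(node, key):
    """Names of leaves that may contain key; prunes non-matching subtrees."""
    if not node.bloom.might_contain(key):
        return []
    if not node.children:
        return [node.name]
    return [name for child in node.children for name in candidates(child, key)]


# Eight hypothetical sstables, each with its own filter over ten keys.
leaves = []
for n in range(8):
    bloom = Bloom()
    for j in range(10):
        bloom.add(f"key-{n}-{j}")
    leaves.append(Node(bloom, name=f"sstable-{n}"))

root = build_tree(leaves)
hits = candidates(root, "key-5-3")
```

The lookup is only close to logarithmic when few filters match. False positives in the inner nodes cause extra descents, and the result can still contain sstables that do not hold the key.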





[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2015-08-05 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658865#comment-14658865
 ] 

Yuki Morishita commented on CASSANDRA-9935:
---

Thanks.
Hmm, do you still have the logs from running nodetool scrub?
Did it detect out-of-order rows in any SSTable?

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Attachments: db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702). After upgrading 
> to 2.1.8 it started to work faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-cassan

[jira] [Commented] (CASSANDRA-7918) Provide graphing tool along with cassandra-stress

2015-08-05 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658847#comment-14658847
 ] 

Jeremy Hanna commented on CASSANDRA-7918:
-

Can this be committed then?  Just didn't know if it was waiting on anything 
else.

> Provide graphing tool along with cassandra-stress
> -
>
> Key: CASSANDRA-7918
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7918
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Benedict
>Assignee: Ryan McGuire
>Priority: Minor
> Attachments: 7918.patch, reads.svg
>
>
> Whilst cstar makes some pretty graphs, they're a little limited and also 
> require you to run your tests through it. It would be useful to be able to 
> graph results from any stress run easily.





[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews

2015-08-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658812#comment-14658812
 ] 

Jonathan Ellis commented on CASSANDRA-9927:
---

Why can't we just inherit base table permissions for 3.0?

> Security for MaterializedViews
> --
>
> Key: CASSANDRA-9927
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9927
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>  Labels: materializedviews
> Fix For: 3.0 beta 1
>
>
> We need to think about how to handle security with respect to materialized 
> views. Since they are based on a source table, we should possibly inherit the 
> same security model as that table.  
> However, I can see cases where users would want to create different security 
> auth for different views, especially once we have CASSANDRA-9664 and users can 
> filter out sensitive data.





[jira] [Commented] (CASSANDRA-7237) Optimize batchlog manager to avoid full scans

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658790#comment-14658790
 ] 

Aleksey Yeschenko commented on CASSANDRA-7237:
--

Rebased and fixed the build 
[here|https://github.com/iamaleksey/cassandra/commits/7237]. Also fixed some 
very minor nits myself to not waste your time:
- naming of constants in {{SystemKeyspace}} being inconsistent with the rest of 
the defined tables there
- {{replayAllFailedBatches()}} doesn't actually throw any declared exceptions 
anymore; removed the use of {{WrappedRunnable}} with Java 8 method references
- made some static methods {{static}} there, while we are editing it anyway, to 
satisfy the annoying IDEA inspections
- copy-paste-ish code in {{calculatePageSize()}} was still referring to hints

There are two issues with the patch:
- batches created in 3.0 will not be understood by 2.1/2.2 nodes (new table), 
breaking upgrades
- batches created in 2.1/2.2 will be written to the deprecated table and not 
noticed/replayed until the next node restart, when conversion will happen again

However, I suggest not fixing these issues here, since that would duplicate 
[~Stefania]'s work on CASSANDRA-9673, which already has to deal with 
compatibility in both directions.

If you don't mind my (extremely minor) changes, and letting Stefania handle the 
upgrade issue, I'm going to commit as is as soon as cassci is happy 
([testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-7237-testall/],
 
[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-7237-dtest/]).

Two more things, for a follow up ticket, if reasonable:
1. We can remember the uuid of the last replayed batch and only scan from there 
to (now - timeout). Or maybe add some correction for error and start with (last 
- timeout).
2. If we only scan from (last - timeout) to (now - timeout), instead of the 
pre-3.0 scan (allthethings), then we might consider replaying more often than 
every 60 seconds (make it 10, or come up with some other number).
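
A rough sketch of the scan window described in the two points above. It uses plain millisecond timestamps in place of time-UUIDs, and the function names and values are hypothetical, not the BatchlogManager API.

```python
def replay_window(last_replayed_ms, now_ms, timeout_ms):
    """Return (start, end) bounds, in ms, of the batch ids worth scanning."""
    end = now_ms - timeout_ms
    if last_replayed_ms is None:
        # Nothing replayed yet: fall back to scanning everything up to end.
        return 0, end
    # Start one timeout before the last replayed batch as a margin of error.
    start = max(0, last_replayed_ms - timeout_ms)
    return start, end


def batches_to_replay(batch_times_ms, last_replayed_ms, now_ms, timeout_ms):
    """Select the batch timestamps that fall inside the replay window."""
    start, end = replay_window(last_replayed_ms, now_ms, timeout_ms)
    return [t for t in sorted(batch_times_ms) if start <= t <= end]


timeout = 10_000
batches = [1_000, 50_000, 58_000, 69_000, 75_000]
due = batches_to_replay(batches, last_replayed_ms=60_000,
                        now_ms=80_000, timeout_ms=timeout)
```

With these values the window is (50000, 70000): the batch at 1000 is skipped as long since handled, and the one at 75000 is still within its timeout. The margin means ids slightly older than the last replayed batch are scanned again.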

> Optimize batchlog manager to avoid full scans
> -
>
> Key: CASSANDRA-7237
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7237
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Aleksey Yeschenko
>Assignee: Branimir Lambov
>Priority: Minor
> Fix For: 3.0.0 rc1
>
>
> Now that we use time-UUIDs for batchlog ids, and given that w/ local strategy 
> the partitions are ordered in time-order here, we can optimize the scanning 
> by limiting the range to replay taking the last replayed batch's id as the 
> beginning of the range, and uuid(now+timeout) as its end.





[jira] [Commented] (CASSANDRA-9792) Reduce Merkle tree serialized size

2015-08-05 Thread Bharatendra Boddu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658597#comment-14658597
 ] 

Bharatendra Boddu commented on CASSANDRA-9792:
--

OK. We can discuss token serialization changes in a new ticket.

> Reduce Merkle tree serialized size
> --
>
> Key: CASSANDRA-9792
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9792
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Bharatendra Boddu
>Priority: Minor
> Fix For: 3.x
>
> Attachments: MerkleTree.java.patch, Token.java.patch
>
>
> This patch reduces the serialized size of a Merkle tree by 10%. With 
> num_tokens set to 256, a 10% reduction in Merkle tree serialized size for each 
> token range repair reduces the network bandwidth used during repair.
> This table describes the serialized sizes (in bytes) of Merkle trees with 
> different depths before and after the patch. 
> The serialized size of a Merkle tree with a certain depth doesn't depend on 
> the number of keys it represents.
> | Depth | Before patch | After patch |  Diff |
> |-------+--------------+-------------+-------|
> |     5 |         2060 |        1840 |   220 |
> |     6 |         4044 |        3600 |   444 |
> |     7 |         8012 |        7120 |   892 |
> |     8 |        15948 |       14160 |  1788 |
> |     9 |        31820 |       28240 |  3580 |
> |    10 |        63564 |       56400 |  7164 |
> |    11 |       127052 |      112720 | 14332 |
> |    12 |       254028 |      225360 | 28668 |
> |    13 |       507980 |      450640 | 57340 |
> A Merkle tree with depth 15 has a serialized size of ~2MB, and this patch 
> reduces that size by ~200KB. Repairing 256 token ranges will save 
> ~50MB in transfer.
> Also, if the token serialize() method uses a byte type to represent the token 
> size, then the serialized size can be reduced by 30 to 40%.
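
The arithmetic behind the depth-15 estimate can be checked with a short script. The sizes are copied from the table above. The fitted() formula is an assumption: it is a fit that happens to reproduce every row of the table, not something taken from the patch.

```python
# Serialized sizes in bytes from the table: depth -> (before, after).
SIZES = {
    5: (2060, 1840), 6: (4044, 3600), 7: (8012, 7120),
    8: (15948, 14160), 9: (31820, 28240), 10: (63564, 56400),
    11: (127052, 112720), 12: (254028, 225360), 13: (507980, 450640),
}


def fitted(depth):
    # Assumption: sizes keep growing as a * 2**depth + b beyond depth 13.
    before = 62 * 2 ** depth + 76
    after = 55 * 2 ** depth + 80
    return before, after


# Per-depth saving, in bytes and as a percentage of the original size.
for depth, (before, after) in sorted(SIZES.items()):
    saved = before - after
    print(f"depth {depth:2d}: saves {saved:6d} bytes ({100 * saved / before:.1f}%)")

# Extrapolate to depth 15 and to 256 token ranges.
before15, after15 = fitted(15)
saved15 = before15 - after15
print(f"depth 15 (extrapolated): {before15} -> {after15}, saves {saved15} bytes")
print(f"256 ranges: saves {256 * saved15} bytes")
```

Under this fit a depth-15 tree is about 2.03 MB before the patch and saves about 229 KB, so 256 ranges save roughly 58.7 MB. That is consistent with the rounded ~200KB and ~50MB figures quoted in the description.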





[jira] [Created] (CASSANDRA-9995) Add background consistency mode for MV

2015-08-05 Thread Carl Yeksigian (JIRA)
Carl Yeksigian created CASSANDRA-9995:
-

 Summary: Add background consistency mode for MV
 Key: CASSANDRA-9995
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9995
 Project: Cassandra
  Issue Type: New Feature
Reporter: Carl Yeksigian


Currently, we only support a fast refresh mode which slows down writes, but 
brings the base and view to consistency quickly. It would be possible to keep 
reads and writes close to the same performance they have now by sacrificing the 
time to consistency.

The way this mode would work is:
- When data is flushed, the sstable is marked as inconsistent for MV
- Compaction can only run on either the set of sstables which are consistent, 
or the set of sstables which are inconsistent, but cannot mix the two
- A background job would take the sstables which are inconsistent and compare 
them to the current set which are consistent and generate the appropriate 
updates for the index to bring it up to date
- Any newly streamed sstables would be marked as inconsistent and would be 
included the next time the job ran

The background consistency job could be configured to run whenever a new 
sstable is flushed, or at certain time intervals.

By switching to a job which only looked at the flushed sstables, we wouldn't 
have to worry about the memtable updates which generate updates to the view but 
aren't recorded anywhere. We also wouldn't have to do any coordination at write 
time, use the batchlog for these writes, or issue any new updates when applying 
the MV update.
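
The steps above could be modelled roughly as follows. This is a toy sketch under stated assumptions, not a design for the real compaction or flush code: the SSTable class, the mv_consistent flag and the generate_view_updates callback are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    name: str
    mv_consistent: bool = False  # freshly flushed or streamed: inconsistent


def compaction_groups(sstables):
    """Split candidates so consistent and inconsistent sstables never mix."""
    consistent = [s for s in sstables if s.mv_consistent]
    inconsistent = [s for s in sstables if not s.mv_consistent]
    return consistent, inconsistent


def run_consistency_job(sstables, generate_view_updates):
    """Compare inconsistent sstables to the consistent set, then mark them."""
    consistent, pending = compaction_groups(sstables)
    updates = generate_view_updates(pending, consistent) if pending else []
    for sstable in pending:
        sstable.mv_consistent = True
    return updates


tables = [SSTable("a", True), SSTable("b"), SSTable("c", True), SSTable("d")]
consistent, inconsistent = compaction_groups(tables)

# The callback stands in for the real comparison against the consistent set.
updates = run_consistency_job(
    tables, lambda new, old: [f"update-from-{s.name}" for s in new])
```

A compaction strategy would pick candidates from only one of the two groups at a time, and the job could be triggered on flush or on a timer as described.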





[jira] [Commented] (CASSANDRA-9426) Provide a per-table text->text map for storing extra metadata

2015-08-05 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658492#comment-14658492
 ] 

Blake Eggleston commented on CASSANDRA-9426:


What about something like {code}... WITH EXTENSION anything = goes AND 
EXTENSION something_else = {'a':'b', 'c': 'd'}{code}

I don't think you'd need to change the way they're stored in the schema table, 
but this would make ddl statements easier to read/write when working with more 
than one extension. It would also let you do alter statements on individual 
extensions without having to include the entire extensions map.

Also, would this add hooks for validating the extension key value pairs?

> Provide a per-table text->text map for storing extra metadata
> -
>
> Key: CASSANDRA-9426
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9426
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>  Labels: client-impacting
> Fix For: 3.0 beta 1
>
>
> For some applications that build on Cassandra it's important to be able to 
> attach extra metadata to tables, and have it be distributed via regular 
> Cassandra schema paths.
> I propose a new {{extensions map}} table param for just that.





[jira] [Created] (CASSANDRA-9994) super_counter_test.py failing on Windows

2015-08-05 Thread Philip Thompson (JIRA)
Philip Thompson created CASSANDRA-9994:
--

 Summary: super_counter_test.py failing on Windows
 Key: CASSANDRA-9994
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9994
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Philip Thompson
Assignee: Paulo Motta
 Fix For: 2.2.x


The dtest 
{{super_counter_test.py:TestSuperCounterClusterRestart.functional_test}}, which 
tests for regressions of CASSANDRA-3821, is failing on Windows.

The test creates a column family with supercolumns and counters, then inserts 
data. After a cluster restart, the check for lost data now fails on Windows 
2.2-HEAD at sha 5c59d5af.





[jira] [Commented] (CASSANDRA-9894) Serialize the header only once per message

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658428#comment-14658428
 ] 

Ariel Weisberg commented on CASSANDRA-9894:
---

I ran code coverage on ColumnsTest. serializeLargeSubset, 
deserializeLargeSubset, serializeLargeSubsetSize all have no coverage. In 
deserialize, the getDroppedColumnDefinition path never runs, although if we 
unit test getDroppedColumnDefinition then maybe we don't care.

With respect to the test and what is tested: can you use a random with a known 
or logged seed for the test? It looks like randomHuge() accidentally returns an 
empty ArrayList. I changed it to return the assembled list and then the test 
didn't pass.
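
To illustrate the two points above (a logged seed, and returning the list that was actually built), here is a small sketch in Python. ColumnsTest itself is Java, so seeded_random and random_huge are only stand-ins with hypothetical names and sizes.

```python
import logging
import random
import time

log = logging.getLogger("columns_test")


def seeded_random(seed=None):
    """Return (seed, rng); the seed is logged so a failure can be replayed."""
    if seed is None:
        seed = time.time_ns()
    log.info("test seed: %d", seed)
    return seed, random.Random(seed)


def random_huge(rng, count=500, universe=10_000):
    """Build a sorted list of distinct column ids and return that list."""
    columns = sorted(rng.sample(range(universe), count))
    return columns  # return what was assembled, not a fresh empty list


seed, rng = seeded_random()
first = random_huge(rng)

# Re-running with the logged seed reproduces exactly the same input.
_, replay = seeded_random(seed)
second = random_huge(replay)
```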

There are a couple of other methods like digest, selectOrderIterator, 
getComplex, and getSimple that don't run. I get surprised pretty regularly so I 
would test those as well.

> Serialize the header only once per message
> --
>
> Key: CASSANDRA-9894
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9894
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Sylvain Lebresne
>Assignee: Benedict
> Fix For: 3.0 beta 1
>
>
> One last improvement I'd like to do on the serialization side is that we 
> currently serialize the {{SerializationHeader}} for each partition. That 
> header contains the serialized columns in particular and for range queries, 
> serializing that for every partition is wasted (note that it's only a problem 
> for the messaging protocol as for sstable we only write the header once per 
> sstable).





[jira] [Updated] (CASSANDRA-9992) Sending batchlog verb to previous versions

2015-08-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9992:
--
Fix Version/s: (was: 3.0 beta 1)
   3.0.0 rc1

> Sending batchlog verb to previous versions
> --
>
> Key: CASSANDRA-9992
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9992
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Carl Yeksigian
>Assignee: Carl Yeksigian
> Fix For: 3.0.0 rc1
>
>
> We are currently sending {{Verb.BATCHLOG_MUTATION}} in 
> {{StorageProxy.syncWriteToBatchlog}} and 
> {{StorageProxy.asyncRemoveFromBatchlog}} to previous versions which do not 
> have that Verb. We should be sending them {{Verb.MUTATION}} instead.





[jira] [Created] (CASSANDRA-9993) Unused verb for MV

2015-08-05 Thread Carl Yeksigian (JIRA)
Carl Yeksigian created CASSANDRA-9993:
-

 Summary: Unused verb for MV
 Key: CASSANDRA-9993
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9993
 Project: Cassandra
  Issue Type: Bug
Reporter: Carl Yeksigian
Assignee: Carl Yeksigian
 Fix For: 3.0 beta 1


In CASSANDRA-6477, we added {{Verb.MATERIALIZEDVIEW_MUTATION}}, which is now 
unused. We should remove this verb.





[jira] [Commented] (CASSANDRA-9426) Provide a per-table text->text map for storing extra metadata

2015-08-05 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658364#comment-14658364
 ] 

Aleksey Yeschenko commented on CASSANDRA-9426:
--

{{... WITH extensions = \{'anything': 'goes', 'here': 'too'\};}}

> Provide a per-table text->text map for storing extra metadata
> -
>
> Key: CASSANDRA-9426
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9426
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>  Labels: client-impacting
> Fix For: 3.0 beta 1
>
>
> For some applications that build on Cassandra it's important to be able to 
> attach extra metadata to tables, and have it be distributed via regular 
> Cassandra schema paths.
> I propose a new {{extensions map}} table param for just that.





[jira] [Commented] (CASSANDRA-9426) Provide a per-table text->text map for storing extra metadata

2015-08-05 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658359#comment-14658359
 ] 

Jeremiah Jordan commented on CASSANDRA-9426:


Do you have a proposed syntax for setting values into the map from CQL?

> Provide a per-table text->text map for storing extra metadata
> -
>
> Key: CASSANDRA-9426
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9426
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>  Labels: client-impacting
> Fix For: 3.0 beta 1
>
>
> For some applications that build on Cassandra it's important to be able to 
> attach extra metadata to tables, and have it be distributed via regular 
> Cassandra schema paths.
> I propose a new {{extensions map}} table param for just that.





[jira] [Updated] (CASSANDRA-9975) Flatten RowIterator call hierarchy with a shared RowTransformer

2015-08-05 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9975:
-
Reviewer: Aleksey Yeschenko

> Flatten RowIterator call hierarchy with a shared RowTransformer
> ---
>
> Key: CASSANDRA-9975
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9975
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 3.0.0 rc1
>
>
> Stepping through a read response is made exceedingly difficult by the sheer 
> depth of the call hierarchy, and how rapidly your context jumps around. This 
> ticket intends to partially address that, by flattening one of the main causes 
> of this: iterator transformations.
> I have a patch that attempts to mitigate (but not entirely eliminate) this, 
> through the introduction of a {{RowTransformer}} class that all 
> transformations are applied through. If a transformation has already been 
> applied, the {{RowTransformer}} class does not wrap a new iterator, but 
> instead returns a new {{RowTransformer}} that wraps the original underlying 
> (untransformed) iterator and both transformations. This can accumulate an 
> arbitrary number of transformations and, quite importantly, can apply the 
> filtration step {{Unfiltered -> Row}} in the same instance as well. The 
> intention is that a majority of control flow happens inside this 
> {{RowTransformer}}, so there is far less context jumping to cope with.
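
The flattening described above can be sketched in a few lines. This is a Python analogy written for illustration, not the Java class from the patch: the apply() method and the convention that a function returning None drops a row are assumptions.

```python
class RowTransformer:
    """Iterator that applies a flat list of functions to each source row."""

    def __init__(self, source, functions):
        self.source = source          # the untransformed iterator
        self.functions = list(functions)

    @staticmethod
    def apply(iterator, function):
        if isinstance(iterator, RowTransformer):
            # Do not nest: reuse the original source and extend the list.
            return RowTransformer(iterator.source,
                                  iterator.functions + [function])
        return RowTransformer(iter(iterator), [function])

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            row = next(self.source)   # StopIteration ends the iteration
            for function in self.functions:
                row = function(row)
                if row is None:       # a function returning None drops the row
                    break
            else:
                return row


rows = iter([1, 2, 3, 4, 5, 6])
doubled = RowTransformer.apply(rows, lambda r: r * 2)
filtered = RowTransformer.apply(doubled, lambda r: r if r % 3 else None)
result = list(filtered)
```

Here filtered holds both functions and points directly at rows, so stepping through next() stays inside one object regardless of how many transformations were stacked. Because doubled and filtered share the same source, only one of them should be consumed.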





[jira] [Commented] (CASSANDRA-9961) cqlsh should have DESCRIBE MATERIALIZED VIEW

2015-08-05 Thread Adam Holmberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658353#comment-14658353
 ] 

Adam Holmberg commented on CASSANDRA-9961:
--

https://datastax-oss.atlassian.net/browse/PYTHON-371

> cqlsh should have DESCRIBE MATERIALIZED VIEW
> 
>
> Key: CASSANDRA-9961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9961
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>Assignee: Stefania
>  Labels: client-impacting, materializedviews
> Fix For: 3.0 beta 1
>
>
> cqlsh doesn't currently produce describe output that can be used to recreate 
> a MV. We need to add a new {{DESCRIBE MATERIALIZED VIEW}} command, and also 
> add MVs to the {{DESCRIBE KEYSPACE}} output.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658349#comment-14658349
 ] 

Marcus Olsson commented on CASSANDRA-5220:
--

Created a pull request [here|https://github.com/stef1927/cassandra/pull/2] to 
your branch.

Most comments should've been fixed but there was one in particular I wasn't 
100% sure about. In _RepairJobDesc.java_ in the _deserialize()_ method:
{quote}
// CR-TODO is it safe to use the MS.globalPartitioner() here?
range = (Range) AbstractBounds.tokenSerializer.deserialize(in,
MessagingService.globalPartitioner(), version);
{quote}
Not sure what to use instead, but I guess it should be safe since the trunk 
version uses it.

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9992) Sending batchlog verb to previous versions

2015-08-05 Thread Carl Yeksigian (JIRA)
Carl Yeksigian created CASSANDRA-9992:
-

 Summary: Sending batchlog verb to previous versions
 Key: CASSANDRA-9992
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9992
 Project: Cassandra
  Issue Type: Bug
Reporter: Carl Yeksigian
Assignee: Carl Yeksigian
 Fix For: 3.0 beta 1


We are currently sending {{Verb.BATCHLOG_MUTATION}} in 
{{StorageProxy.syncWriteToBatchlog}} and 
{{StorageProxy.asyncRemoveFromBatchlog}} to previous versions which do not 
have that Verb. We should be sending them {{Verb.MUTATION}} instead.
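
The intended behaviour amounts to a version check before choosing the verb. A minimal sketch follows, assuming the peer's messaging version is known. The Verb enum, the threshold constant and its value are hypothetical stand-ins, not the real MessagingService definitions.

```python
from enum import Enum


class Verb(Enum):
    MUTATION = "MUTATION"
    BATCHLOG_MUTATION = "BATCHLOG_MUTATION"


# Hypothetical messaging version in which BATCHLOG_MUTATION first exists.
FIRST_VERSION_WITH_BATCHLOG_VERB = 10


def batchlog_verb_for(peer_version):
    """Pick a verb the peer can decode, based on its messaging version."""
    if peer_version >= FIRST_VERSION_WITH_BATCHLOG_VERB:
        return Verb.BATCHLOG_MUTATION
    return Verb.MUTATION


old_peer = batchlog_verb_for(9)
new_peer = batchlog_verb_for(10)
```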





[jira] [Assigned] (CASSANDRA-9980) test_eat_glass in cqlsh_tests fails on windows

2015-08-05 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta reassigned CASSANDRA-9980:
--

Assignee: Paulo Motta

> test_eat_glass in cqlsh_tests fails on windows
> --
>
> Key: CASSANDRA-9980
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9980
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Philip Thompson
>Assignee: Paulo Motta
>  Labels: cqlsh, windows
> Fix For: 2.2.x
>
>
> The cqlsh dtest {{cqlsh_tests.TestCqlsh.test_eat_glass}} is failing on 
> windows with 2.2-head. It has been failing for a very long time. We've looked 
> into it before, but haven't figured it out. Cqlsh does not return anything on 
> the following query:
> {code}
> self.assertEquals(output.count('Можам да јадам стакло, а не ме штета.'), 16)
> {code}





[jira] [Commented] (CASSANDRA-9915) IndexError('list index out of range') when trying to connect to Cassandra cluster with cqlsh

2015-08-05 Thread Adam Holmberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658331#comment-14658331
 ] 

Adam Holmberg commented on CASSANDRA-9915:
--

The driver is definitely having problems with the {{healthcheck.testdyna}} 
table, which has a comparator of {{DynamicCompositeType()}} (with no subtypes). 
The driver was written assuming composite types should always have subtypes. 
*[~prabir_apache] do you know how this table was created?* I'm trying to figure 
out if this is valid metadata that should be handled, or if something has gone 
wrong on the server side.

In any case, I created this ticket to make the driver more robust against these 
unexpected metadata states.
https://datastax-oss.atlassian.net/browse/PYTHON-370
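
To illustrate the kind of robustness meant here, the sketch below parses the 
argument list of a type string and returns an empty list for 
{{DynamicCompositeType()}} instead of assuming at least one subtype. It is 
written in Java purely for illustration; {{TypeArgsSketch}} and {{subtypes}} are 
hypothetical names, and the Python driver's actual change under PYTHON-370 may 
look quite different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tolerate composite types that declare no subtypes.
final class TypeArgsSketch {
    // Returns the top-level, comma-separated arguments inside the outermost
    // parentheses; an empty list when there are none.
    static List<String> subtypes(String typeString) {
        List<String> out = new ArrayList<>();
        int open = typeString.indexOf('(');
        int close = typeString.lastIndexOf(')');
        if (open < 0 || close <= open)
            return out;
        int depth = 0;
        int start = open + 1;
        for (int i = open + 1; i < close; i++) {
            char c = typeString.charAt(i);
            if (c == '(') depth++;
            else if (c == ')') depth--;
            else if (c == ',' && depth == 0) {
                addIfPresent(out, typeString.substring(start, i));
                start = i + 1;
            }
        }
        addIfPresent(out, typeString.substring(start, close));
        return out;
    }

    private static void addIfPresent(List<String> out, String s) {
        String trimmed = s.trim();
        if (!trimmed.isEmpty())
            out.add(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(subtypes("DynamicCompositeType()"));
        System.out.println(subtypes("CompositeType(UTF8Type, Int32Type)"));
    }
}
```

Callers would then check the list size before indexing into it.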

In the meantime, if you are not using that table you can resolve this by 
dropping it via JDBC.

> IndexError('list index out of range') when trying to connect to Cassandra 
> cluster with cqlsh
> 
>
> Key: CASSANDRA-9915
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9915
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu, Cassandra 2.5.1, Python 2.7.3
>Reporter: Prabir Kr Sarkar
>Assignee: Adam Holmberg
>Priority: Critical
>  Labels: cqlsh
> Fix For: 2.1.x
>
> Attachments: schema_columnfamilies.xls
>
>
> Cassandra by default uses a Python driver to connect
> {code}
> >>> cluster = Cluster(['IP'], protocol_version=3)
> >>> session = cluster.connect()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.7/dist-packages/cassandra/cluster.py", line 
> 839, in connect
> self.control_connection.connect()
>   File "/usr/local/lib/python2.7/dist-packages/cassandra/cluster.py", line 
> 2075, in connect
> self._set_new_connection(self._reconnect_internal())
>   File "/usr/local/lib/python2.7/dist-packages/cassandra/cluster.py", line 
> 2110, in _reconnect_internal
> raise NoHostAvailable("Unable to connect to any servers", errors)
> cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', 
> {'IP': IndexError('list index out of range',)})
> {code}





[jira] [Commented] (CASSANDRA-9962) WaitQueueTest is flakey

2015-08-05 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658318#comment-14658318
 ] 

Ariel Weisberg commented on CASSANDRA-9962:
---

My bad. I looked at the bottom part of the diff and it doesn't show how wait 
queue separates registration from waiting.

+1

> WaitQueueTest is flakey
> ---
>
> Key: CASSANDRA-9962
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9962
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Minor
> Fix For: 3.x
>
>
> While the test is a little noddy, and superfluous, it shouldn't fail even 
> vanishingly infrequently. [~aweisberg] has spotted it doing so, and I have 
> also encountered it once, so I suspect that a change in hardware/OS may have 
> made vanishingly unlikely just pretty unlikely, which is even less good. 
> Right now it depends on {{Thread.start()}} completing before the new thread 
> starts; this isn't guaranteed. This patch fixes that.
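
One common way to avoid depending on when a started thread actually begins 
running is an explicit handshake. The sketch below uses two 
{{CountDownLatch}} instances; it is a generic illustration with hypothetical 
names ({{StartOrderingSketch}}, {{runOnce}}), not the patch attached to this 
ticket.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the actual WaitQueueTest change.
final class StartOrderingSketch {
    // Returns true if the worker saw the flag that the main thread set only
    // after the worker had reported that it was running.
    static boolean runOnce() {
        CountDownLatch workerRunning = new CountDownLatch(1);
        CountDownLatch mayProceed = new CountDownLatch(1);
        AtomicBoolean flag = new AtomicBoolean(false);
        AtomicBoolean observed = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            workerRunning.countDown();            // tell the main thread we run
            try {
                if (mayProceed.await(10, TimeUnit.SECONDS))
                    observed.set(flag.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            worker.start();
            // Do not assume the worker has started just because start() returned.
            if (!workerRunning.await(10, TimeUnit.SECONDS))
                return false;
            flag.set(true);
            mayProceed.countDown();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

The timeouts are arbitrary; they only keep a broken run from hanging forever.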





[jira] [Commented] (CASSANDRA-9920) Consider making Materialized Views a schema target in events

2015-08-05 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658303#comment-14658303
 ] 

Carl Yeksigian commented on CASSANDRA-9920:
---

I've been thinking about this for CASSANDRA-9921; moving MV's to be a part of a 
Keyspace and not a Table makes sense. 2i can only be queried through the table, 
but MV's can be queried directly at the Keyspace level, and with CASSANDRA-9921 
we'll remove MV from the Tables schema altogether. 

Since it will no longer be a Table and will be modeled as a separate schema 
object altogether, it should also be a schema event target.

> Consider making Materialized Views a schema target in events
> 
>
> Key: CASSANDRA-9920
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9920
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Carl Yeksigian
>  Labels: client-impacting, materializedviews
> Fix For: 3.0 beta 1
>
>
> Make views be a schema target like Tables and Types in protocol events, 
> rather than a property of Tables like 2i.





[jira] [Commented] (CASSANDRA-9312) Provide a way to retrieve the write time of a CQL row

2015-08-05 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658295#comment-14658295
 ] 

Tyler Hobbs commented on CASSANDRA-9312:


In addition to the writetime, we need to support fetching the ttl of a row with 
only primary key columns.

> Provide a way to retrieve the write time of a CQL row
> -
>
> Key: CASSANDRA-9312
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9312
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API
>Reporter: Nicolas Favre-Felix
>
> There is currently no way to retrieve the "writetime" of a CQL row. This is 
> an issue for tables in which all dimensions are part of the primary key.
> Since Cassandra already stores a cell for the CQL row, it would make sense to 
> provide a way to read its timestamp. This feature would be consistent with 
> the concept of a row as an entity containing a number of optional columns, 
> but able to exist on its own.





[jira] [Commented] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658310#comment-14658310
 ] 

Benedict commented on CASSANDRA-9985:
-

FTR, I've had to update the patch to be closer to Guava because we do in fact 
use null returns (I had thought we did not).

To be honest, I would be totally comfortable just porting the Guava code 
wholesale (or almost wholesale; the removal of tryComputeNext is still nice).

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.
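
For readers unfamiliar with the pattern, a home-grown abstract iterator can be 
sketched as below. This is an assumption-laden illustration, not the code in 
the patch: {{SimpleAbstractIterator}} and {{CountdownDemo}} are hypothetical 
names, and the sketch keeps Guava's convention that {{computeNext()}} signals 
exhaustion via {{endOfData()}} so that null remains a legal element.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of a minimal AbstractIterator-style base class.
abstract class SimpleAbstractIterator<T> implements Iterator<T> {
    private enum State { NOT_READY, READY, DONE, FAILED }

    private State state = State.NOT_READY;
    private T next;

    // Subclasses return the next element, or call endOfData() when exhausted.
    protected abstract T computeNext();

    protected final T endOfData() {
        state = State.DONE;
        return null;
    }

    @Override
    public final boolean hasNext() {
        switch (state) {
            case READY: return true;
            case DONE: return false;
            case FAILED: throw new IllegalStateException();
            default: break;
        }
        state = State.FAILED;      // stays FAILED if computeNext() throws
        next = computeNext();
        if (state == State.DONE)
            return false;
        state = State.READY;
        return true;
    }

    @Override
    public final T next() {
        if (!hasNext())
            throw new NoSuchElementException();
        state = State.NOT_READY;
        T result = next;
        next = null;
        return result;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

final class CountdownDemo {
    // Iterates from, from - 1, ..., 1.
    static Iterator<Integer> countdown(final int from) {
        return new SimpleAbstractIterator<Integer>() {
            private int n = from;

            @Override
            protected Integer computeNext() {
                if (n <= 0)
                    return endOfData();
                return n--;
            }
        };
    }

    // Drains an iterator into a comma-separated string.
    static String joined(Iterator<?> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(it.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joined(countdown(3)));
    }
}
```

Here {{remove()}} throws rather than being a no-op, which matches the behaviour 
of an unmodifiable iterator.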





[jira] [Commented] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658273#comment-14658273
 ] 

Benedict commented on CASSANDRA-9985:
-

I wouldn't care if it weren't for the sheer number of these calls that we chain 
together now, so that 75% of your code stepping is through a linked jar. With 
multiple versions it is hard to link the source, and so the majority of your 
debug stepping is through a morass of unknown. This patch also removes one 
method call from that chain, which shrinks the height of the call tree 
significantly when they're added together.

But, as I say, I figured this little patch might split opinion so I only 
propose it for inclusion at the consensus of the community.

It's worth noting that CASSANDRA-9975 also mitigates the problem; however, it 
leaves a lot of these iterators in the call tree.

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Comment Edited] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658264#comment-14658264
 ] 

Jason Brown edited comment on CASSANDRA-9985 at 8/5/15 2:15 PM:


Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
simple enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, it 
throws an UnsupportedOperationException as it derives from UnmodifiableIterator


was (Author: jasobrown):
Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
simple enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, it 
throws a UnsupportedOperationException as it derives from UnmodifiableIterator

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Comment Edited] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658262#comment-14658262
 ] 

Marcus Olsson edited comment on CASSANDRA-5220 at 8/5/15 2:15 PM:
--

While looking at CASSANDRA-5839 I realized that this might break something 
during upgrade from 2.2->3.0, with this patch the table _repair_history_ 
changes to have a set of ranges instead of a start and end range.

Should I change the table back and do one insert per range instead?


was (Author: molsson):
While looking at CASSANDRA-5839 I realized that this might break something 
during upgrade from 2.2->3.0, with this patch the table _repair_history_ 
changes to have a set of ranges instead of a start and end range. (This patch 
was first done when

Should I change the table back and do one insert per range instead?

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Commented] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658264#comment-14658264
 ] 

Jason Brown commented on CASSANDRA-9985:


Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
easy enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, they 
throw a UnsupportedOperationException as it derives from UnmodifiableIterator

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Comment Edited] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658264#comment-14658264
 ] 

Jason Brown edited comment on CASSANDRA-9985 at 8/5/15 2:14 PM:


Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
simple enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, they 
throw a UnsupportedOperationException as it derives from UnmodifiableIterator


was (Author: jasobrown):
Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
easy enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, they 
throw a UnsupportedOperationException as it derives from UnmodifiableIterator

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Comment Edited] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658264#comment-14658264
 ] 

Jason Brown edited comment on CASSANDRA-9985 at 8/5/15 2:14 PM:


Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
simple enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, it 
throws a UnsupportedOperationException as it derives from UnmodifiableIterator


was (Author: jasobrown):
Rereading through the guava AbstractIterator code, I didn't feel it to be too 
overbearing (there's the state machine and a Precondition call). TBH, I'm kinda 
mixed on this - I appreciate one less external dependency, and the patch is 
simple enough. If others are fine with the change, I will be, as well.

Nit:
- in your AbstractIterator.remove(), it's a no-op. In the guava version, they 
throw a UnsupportedOperationException as it derives from UnmodifiableIterator

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658262#comment-14658262
 ] 

Marcus Olsson commented on CASSANDRA-5220:
--

While looking at CASSANDRA-5839 I realized that this might break something 
during upgrade from 2.2->3.0, with this patch the table _repair_history_ 
changes to have a set of ranges instead of a start and end range. (This patch 
was first done when

Should I change the table back and do one insert per range instead?

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Updated] (CASSANDRA-9533) Make batch commitlog mode easier to tune

2015-08-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9533:
--
Reviewer: Ariel Weisberg

[~aweisberg] to review

> Make batch commitlog mode easier to tune
> 
>
> Key: CASSANDRA-9533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Benedict
> Fix For: 3.x
>
>
> As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms 
> from a maximum time to wait between fsync to the minimum time, so one must be 
> very careful to keep it small enough that most writers aren't kept waiting.





[jira] [Updated] (CASSANDRA-9985) Introduce our own AbstractIterator

2015-08-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9985:
--
Reviewer: Ariel Weisberg

> Introduce our own AbstractIterator
> --
>
> Key: CASSANDRA-9985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9985
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Trivial
> Fix For: 3.0.0 rc1
>
>
> The Guava AbstractIterator not only has unnecessary method call depth, it is 
> difficult to debug without attaching source. Since it's absolutely trivial to 
> write our own, and it's used widely within the codebase, I think we should do 
> so.





[jira] [Updated] (CASSANDRA-9913) Select * is only returning the first page of data on trunk

2015-08-05 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9913:
-
Reviewer: Aleksey Yeschenko

Sure.

> Select * is only returning the first page of data on trunk
> --
>
> Key: CASSANDRA-9913
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9913
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Benjamin Lerer
> Fix For: 3.0 beta 1
>
> Attachments: 9913.txt
>
>
> While doing some testing on the validation harness, I have run into a pretty 
> trivially reproducible problem.
> {code}
> ccm create test -v git:trunk -n 1 -s
> ccm node1 stress write n=2M
> ccm node1 cqlsh
> {code}
> {code}
> Use keyspace1;
> Select * From standard1; (100 rows)
> Select count(*) from standard1; (300 rows)
> {code}
> Despite two million rows being written, I have found that {{select * from 
> standard1}} only returns one page's worth of data. I have used both the java 
> and the python driver to test this. I have also found that {{select count(*) 
> from standard1}} gives a multiple of one page's worth of data, that appears 
> to correspond to page size * RF.
> I have already tried with the patch for CASSANDRA-9775, and that did not 
> resolve this issue.





[jira] [Updated] (CASSANDRA-9891) AggregationTest.testAggregateWithWithWriteTimeOrTTL is fragile

2015-08-05 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-9891:
-
Reviewer: Aleksey Yeschenko

> AggregationTest.testAggregateWithWithWriteTimeOrTTL is fragile
> --
>
> Key: CASSANDRA-9891
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9891
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 2.2.1
>
> Attachments: 9891.txt
>
>
> I've seen {{AggregationTest.testAggregateWithWithWriteTimeOrTTL}} fail on 
> cassci on the line
> {noformat}
> assertTrue(row.getInt("ttl(b)") > 4
> {noformat}
> Given that the ttl is set to 5 a few lines above and the CQL {{ttl()}} method 
> returns the actual time-to-live (that is, it decreases with time), it feels 
> safe to assume this test is overly fragile.
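
One way to make such an assertion less timing-sensitive is to accept a range 
rather than an exact lower bound. The helper below is a hypothetical sketch 
({{TtlAssertions}} and {{isPlausibleTtl}} are not part of the test suite), and 
it is not necessarily what the attached patch does.

```java
// Hypothetical sketch: accept any TTL between (configured - tolerance) and
// configured, since the remaining TTL shrinks while the test is running.
final class TtlAssertions {
    static boolean isPlausibleTtl(int observedTtl, int configuredTtl, int toleranceSeconds) {
        return observedTtl <= configuredTtl
            && observedTtl >= configuredTtl - toleranceSeconds;
    }

    public static void main(String[] args) {
        // With a configured TTL of 5 and 2 seconds of slack, 4 is acceptable.
        System.out.println(isPlausibleTtl(4, 5, 2));
    }
}
```

The tolerance is a trade-off: too small and the test stays flaky, too large 
and it stops checking anything useful.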





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658225#comment-14658225
 ] 

Stefania commented on CASSANDRA-5220:
-

Sounds great, thanks! :)

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Updated] (CASSANDRA-9891) AggregationTest.testAggregateWithWithWriteTimeOrTTL is fragile

2015-08-05 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-9891:
--
Attachment: 9891.txt

The patch extends the time allowed for the test to execute to 10 minutes.

> AggregationTest.testAggregateWithWithWriteTimeOrTTL is fragile
> --
>
> Key: CASSANDRA-9891
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9891
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sylvain Lebresne
>Assignee: Benjamin Lerer
>Priority: Minor
> Fix For: 2.2.1
>
> Attachments: 9891.txt
>
>
> I've seen {{AggregationTest.testAggregateWithWithWriteTimeOrTTL}} fail on 
> cassci on the line
> {noformat}
> assertTrue(row.getInt("ttl(b)") > 4
> {noformat}
> Given that the ttl is set to 5 a few lines above and the CQL {{ttl()}} method 
> returns the actual time-to-live (that is, it decreases with time), it feels 
> safe to assume this test is overly fragile.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655293#comment-14655293
 ] 

Marcus Olsson commented on CASSANDRA-5220:
--

I'm happy to implement it!

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Updated] (CASSANDRA-9913) Select * is only returning the first page of data on trunk

2015-08-05 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-9913:
--
Attachment: 9913.txt

The patch fixes 2 problems:
# For the {{SELECT * FROM Standard1}} query, the problem was that the query was 
wrongly considered a range query. It was due to the fake clustering column 
{{column1}} that is part of the static compact tables schema. The patch makes 
sure that the column is ignored for static compact tables.
# For the {{SELECT count(*) FROM Standard1}} query, the problem was caused by 
the fact that the {{RowIterator}} was not closed.
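
As a general illustration of the second point, a closeable iterator is 
usually consumed inside try-with-resources so that it is closed even when 
counting exits early. The class below is a hypothetical stand-in 
({{CloseableRows}} is not Cassandra's {{RowIterator}}), shown only to make the 
pattern concrete.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for an iterator that must be closed after use.
final class CloseableRows implements Iterator<String>, AutoCloseable {
    private final Iterator<String> delegate;
    private boolean closed;

    CloseableRows(String... rows) {
        this.delegate = Arrays.asList(rows).iterator();
    }

    @Override
    public boolean hasNext() {
        return !closed && delegate.hasNext();
    }

    @Override
    public String next() {
        return delegate.next();
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Counts the remaining rows and always closes the iterator afterwards.
    static int countAndClose(CloseableRows rows) {
        try (CloseableRows r = rows) {
            int n = 0;
            while (r.hasNext()) {
                r.next();
                n++;
            }
            return n;
        }
    }

    public static void main(String[] args) {
        CloseableRows rows = new CloseableRows("a", "b", "c");
        System.out.println(countAndClose(rows) + " rows, closed=" + rows.isClosed());
    }
}
```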

> Select * is only returning the first page of data on trunk
> --
>
> Key: CASSANDRA-9913
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9913
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Philip Thompson
>Assignee: Benjamin Lerer
> Fix For: 3.0 beta 1
>
> Attachments: 9913.txt
>
>
> While doing some testing on the validation harness, I have run into a pretty 
> trivially reproducible problem.
> {code}
> ccm create test -v git:trunk -n 1 -s
> ccm node1 stress write n=2M
> ccm node1 cqlsh
> {code}
> {code}
> Use keyspace1;
> Select * From standard1; (100 rows)
> Select count(*) from standard1; (300 rows)
> {code}
> Despite two million rows being written, I have found that {{select * from 
> standard1}} only returns one page's worth of data. I have used both the java 
> and the python driver to test this. I have also found that {{select count(*) 
> from standard1}} gives a multiple of one page's worth of data, that appears 
> to correspond to page size * RF.
> I have already tried with the patch for CASSANDRA-9775, and that did not 
> resolve this issue.





[jira] [Commented] (CASSANDRA-9775) some paging dtests fail/flap on trunk

2015-08-05 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655256#comment-14655256
 ] 

Benjamin Lerer commented on CASSANDRA-9775:
---

I have committed the 3 fixes separately as they are unrelated:
* Fix paging with static columns: 6aa83990530dbfe5e8a2c3a194c4dcbb3ffd4b59
* Fix unecessary and broken (when reversed) condition:  
028df729b7bc0f63359990d9cc7ebb7d653232d5
* Fix serialization of AbstractBounds: bd7d1198ac1e02785e912c7cfbb504ddaab6bb93

> some paging dtests fail/flap on trunk
> -
>
> Key: CASSANDRA-9775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9775
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jim Witschey
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 3.0 beta 1
>
>
> Several paging dtests fail on trunk:
> [static_columns_paging_test|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingData/static_columns_paging_test/history/]
> [test_undefined_page_size_default|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingSize/test_undefined_page_size_default/history/]
> [test_failure_threshold_deletions|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/]
> I'm not sure if these are all rooted in the same underlying problem, so I 
> defer to whoever takes this ticket on.
> [~thobbs] I'm assigning you because this is about paging, but reassign as you 
> see fit. Thanks!





[2/4] cassandra git commit: Fix unecessary and broken (when reversed) condition

2015-08-05 Thread blerer
Fix unecessary and broken (when reversed) condition

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/028df729
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/028df729
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/028df729

Branch: refs/heads/trunk
Commit: 028df729b7bc0f63359990d9cc7ebb7d653232d5
Parents: 6aa8399
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:19:32 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:19:32 2015 +0200

--
 .../apache/cassandra/db/filter/ClusteringIndexNamesFilter.java  | 5 -
 1 file changed, 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/028df729/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
--
diff --git a/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java b/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
index a6f2179..f2c81a7 100644
--- a/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
+++ b/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
@@ -80,11 +80,6 @@ public class ClusteringIndexNamesFilter extends AbstractClusteringIndexFilter
 
     public ClusteringIndexNamesFilter forPaging(ClusteringComparator comparator, Clustering lastReturned, boolean inclusive)
     {
-        // TODO: Consider removal of the initial check.
-        int cmp = comparator.compare(lastReturned, clusteringsInQueryOrder.first());
-        if (cmp < 0 || (inclusive && cmp == 0))
-            return this;
-
         NavigableSet newClusterings = reversed ?
                                       clusterings.headSet(lastReturned, inclusive) :
                                       clusterings.tailSet(lastReturned, inclusive);
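
The remaining code in forPaging narrows the set of requested names with headSet/tailSet. A minimal sketch of that narrowing follows, assuming plain Integers in place of Clustering values; the class and method names are hypothetical and are not part of the Cassandra code base.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-in for the names filter: Integers replace Clustering values.
final class NamesPagingSketch
{
    // Returns the names still to be fetched after lastReturned.
    // Forward queries continue with the tail of the set, reversed queries with the head.
    static NavigableSet<Integer> forPaging(NavigableSet<Integer> names, int lastReturned,
                                           boolean inclusive, boolean reversed)
    {
        return reversed ? names.headSet(lastReturned, inclusive)
                        : names.tailSet(lastReturned, inclusive);
    }

    public static void main(String[] args)
    {
        NavigableSet<Integer> names = new TreeSet<>(Arrays.asList(1, 3, 5, 7));
        System.out.println(forPaging(names, 3, false, false)); // [5, 7]
        System.out.println(forPaging(names, 5, false, true));  // [1, 3]
    }
}
```

Both calls return views backed by the original set, so nothing is copied. In this sketch, a lastReturned value that precedes every name simply yields the whole set for a forward query, without any separate early-return check.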



[4/4] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2015-08-05 Thread blerer
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/de49ed84
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/de49ed84
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/de49ed84

Branch: refs/heads/trunk
Commit: de49ed84cf5fdd21e3215940b6b375ae77cc1774
Parents: 3ab86d0 bd7d119
Author: blerer 
Authored: Wed Aug 5 13:57:55 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 13:58:29 2015 +0200

--
 src/java/org/apache/cassandra/db/DataRange.java | 14 -
 .../db/filter/ClusteringIndexNamesFilter.java   |  5 --
 .../apache/cassandra/dht/AbstractBounds.java| 58 +---
 .../service/pager/AbstractQueryPager.java   | 54 ++
 .../service/pager/RangeNamesQueryPager.java |  6 ++
 .../service/pager/RangeSliceQueryPager.java |  6 ++
 .../service/pager/SinglePartitionPager.java |  6 ++
 7 files changed, 122 insertions(+), 27 deletions(-)
--




[1/4] cassandra git commit: Fix paging with static

2015-08-05 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/trunk 3ab86d043 -> de49ed84c


Fix paging with static

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6aa83990
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6aa83990
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6aa83990

Branch: refs/heads/trunk
Commit: 6aa83990530dbfe5e8a2c3a194c4dcbb3ffd4b59
Parents: e58b7df
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:14:26 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:18:12 2015 +0200

--
 .../service/pager/AbstractQueryPager.java   | 54 
 .../service/pager/RangeNamesQueryPager.java |  6 +++
 .../service/pager/RangeSliceQueryPager.java |  6 +++
 .../service/pager/SinglePartitionPager.java |  6 +++
 4 files changed, 61 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6aa83990/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
--
diff --git a/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java b/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
index 2c16ace..9991277 100644
--- a/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
+++ b/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
@@ -17,6 +17,8 @@
  */
 package org.apache.cassandra.service.pager;
 
+import java.util.NoSuchElementException;
+
 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.*;
 import org.apache.cassandra.db.rows.*;
@@ -79,6 +81,9 @@ abstract class AbstractQueryPager implements QueryPager
 
         private Row lastRow;
 
+        private boolean isFirstPartition = true;
+        private RowIterator nextPartition;
+
         private PagerIterator(PartitionIterator iter, DataLimits pageLimits, int nowInSec)
         {
             super(iter, pageLimits, nowInSec);
@@ -86,30 +91,56 @@ abstract class AbstractQueryPager implements QueryPager
         }
 
         @Override
-        @SuppressWarnings("resource") // iter is closed by closing the result
-        public RowIterator next()
+        @SuppressWarnings("resource") // iter is closed by closing the result or in close()
+        public boolean hasNext()
         {
-            RowIterator iter = super.next();
-            try
+            while (nextPartition == null && super.hasNext())
             {
-                DecoratedKey key = iter.partitionKey();
+                if (nextPartition == null)
+                    nextPartition = super.next();
+
+                DecoratedKey key = nextPartition.partitionKey();
                 if (lastKey == null || !lastKey.equals(key))
                     remainingInPartition = limits.perPartitionCount();
 
                 lastKey = key;
-                return new RowPagerIterator(iter);
-            }
-            catch (RuntimeException e)
-            {
-                iter.close();
-                throw e;
+
+                // If this is the first partition of this page, this could be the continuation of a partition we've started
+                // on the previous page. In which case, we could have the problem that the partition has no more "regular"
+                // rows (but the page size is such we didn't knew before) but it does has a static row. We should then skip
+                // the partition as returning it would means to the upper layer that the partition has "only" static columns,
+                // which is not the case (and we know the static results have been sent on the previous page).
+                if (isFirstPartition && isPreviouslyReturnedPartition(key) && !nextPartition.hasNext())
+                {
+                    nextPartition.close();
+                    nextPartition = null;
+                }
+
+                isFirstPartition = false;
             }
+            return nextPartition != null;
+        }
+
+        @Override
+        @SuppressWarnings("resource") // iter is closed by closing the result
+        public RowIterator next()
+        {
+            if (!hasNext())
+                throw new NoSuchElementException();
+
+            RowIterator toReturn = nextPartition;
+            nextPartition = null;
+
+            return new RowPagerIterator(toReturn);
         }
 
         @Override
         public void close()
         {
             super.close();
+            if (nextPartition != null)
+                nextPartition.close();
+
             recordLast(lastKey, lastRow);
 
             int counted = counter.counted();
@@ -158,4 +189,5 @@ abstract class AbstractQueryPager implements QueryPager
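
The hasNext()/next() pair in the hunk above follows a look-ahead pattern: hasNext() buffers the next partition and may drop an empty first one, and next() hands out the buffered element. A minimal sketch of that pattern follows, assuming a "partition" is just a List of row names and that the check for a partition already returned on the previous page is reduced to a constructor flag. All names here are hypothetical; this is not the Cassandra implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical simplification of the pager iterator in the diff above.
final class SkipEmptyFirstIterator implements Iterator<List<String>>
{
    private final Iterator<List<String>> source;
    private final boolean firstMayBeContinuation; // stands in for isPreviouslyReturnedPartition(key)
    private boolean isFirst = true;
    private List<String> next; // buffered by hasNext(), handed out by next()

    SkipEmptyFirstIterator(Iterator<List<String>> source, boolean firstMayBeContinuation)
    {
        this.source = source;
        this.firstMayBeContinuation = firstMayBeContinuation;
    }

    @Override
    public boolean hasNext()
    {
        while (next == null && source.hasNext())
        {
            next = source.next();
            // Drop the first partition if it continues the previous page and has no rows left.
            if (isFirst && firstMayBeContinuation && next.isEmpty())
                next = null;
            isFirst = false;
        }
        return next != null;
    }

    @Override
    public List<String> next()
    {
        if (!hasNext())
            throw new NoSuchElementException();
        List<String> toReturn = next;
        next = null;
        return toReturn;
    }

    public static void main(String[] args)
    {
        List<List<String>> page = Arrays.asList(Collections.<String>emptyList(), Arrays.asList("r1", "r2"));
        Iterator<List<String>> it = new SkipEmptyFirstIterator(page.iterator(), true);
        while (it.hasNext())
            System.out.println(it.next()); // prints [r1, r2] only
    }
}
```

Because hasNext() may consume an element from the underlying iterator, the real code also has to close a buffered partition in close(), which is what the last hunk of the diff adds. The sketch omits that part because plain lists hold no resources.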

[3/4] cassandra git commit: Fix serialization of AbstractBounds

2015-08-05 Thread blerer
Fix serialization of AbstractBounds

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bd7d1198
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bd7d1198
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bd7d1198

Branch: refs/heads/trunk
Commit: bd7d1198ac1e02785e912c7cfbb504ddaab6bb93
Parents: 028df72
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:21:24 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:21:24 2015 +0200

--
 src/java/org/apache/cassandra/db/DataRange.java | 14 -
 .../apache/cassandra/dht/AbstractBounds.java| 58 +---
 2 files changed, 61 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bd7d1198/src/java/org/apache/cassandra/db/DataRange.java
--
diff --git a/src/java/org/apache/cassandra/db/DataRange.java b/src/java/org/apache/cassandra/db/DataRange.java
index 023f572..79b2448 100644
--- a/src/java/org/apache/cassandra/db/DataRange.java
+++ b/src/java/org/apache/cassandra/db/DataRange.java
@@ -42,7 +42,7 @@ public class DataRange
 {
     public static final Serializer serializer = new Serializer();
 
-    private final AbstractBounds keyRange;
+    protected final AbstractBounds keyRange;
     protected final ClusteringIndexFilter clusteringIndexFilter;
 
     /**
@@ -201,7 +201,7 @@ public class DataRange
      * @param range the range of partition keys to query.
      * @param comparator the comparator for the table queried.
      * @param lastReturned the clustering for the last result returned by the previous page, i.e. the result we want to start our new page
-     * from. This last returned must must correspond to left bound of {@code range} (in other words, {@code range.left} must be the
+     * from. This last returned must correspond to left bound of {@code range} (in other words, {@code range.left} must be the
      * partition key for that {@code lastReturned} result).
      * @param inclusive whether or not we want to include the {@code lastReturned} in the newly returned page of results.
      *
@@ -354,6 +354,16 @@ public class DataRange
         {
             return false;
         }
+
+        @Override
+        public String toString(CFMetaData metadata)
+        {
+            return String.format("range=%s pfilter=%s lastReturned=%s (%s)",
+                                 keyRange.getString(metadata.getKeyValidator()),
+                                 clusteringIndexFilter.toString(metadata),
+                                 lastReturned.toString(metadata),
+                                 inclusive ? "included" : "excluded");
+        }
     }
 
     public static class Serializer

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bd7d1198/src/java/org/apache/cassandra/dht/AbstractBounds.java
--
diff --git a/src/java/org/apache/cassandra/dht/AbstractBounds.java b/src/java/org/apache/cassandra/dht/AbstractBounds.java
index d9a0c62..9e74eb8 100644
--- a/src/java/org/apache/cassandra/dht/AbstractBounds.java
+++ b/src/java/org/apache/cassandra/dht/AbstractBounds.java
@@ -27,6 +27,7 @@ import org.apache.cassandra.db.PartitionPosition;
 import org.apache.cassandra.db.TypeSizes;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.utils.Pair;
 
 public abstract class AbstractBounds> implements Serializable
@@ -119,8 +120,13 @@ public abstract class AbstractBounds> implements Seria
 
     public static class AbstractBoundsSerializer> implements IPartitionerDependentSerializer>
     {
+        private static final int IS_TOKEN_FLAG        = 0x01;
+        private static final int START_INCLUSIVE_FLAG = 0x02;
+        private static final int END_INCLUSIVE_FLAG   = 0x04;
+
         IPartitionerDependentSerializer serializer;
 
+        // Use for pre-3.0 protocol
         private static int kindInt(AbstractBounds ab)
         {
             int kind = ab instanceof Range ? Type.RANGE.ordinal() : Type.BOUNDS.ordinal();
@@ -129,6 +135,19 @@ public abstract class AbstractBounds> implements Seria
             return kind;
         }
 
+        // For from 3.0 onwards
+        private static int kindFlags(AbstractBounds ab)
+        {
+            int flags = 0;
+            if (ab.left instanceof Token)
+                flags |= IS_TOKEN_FLAG;
+            if (ab.isStartInclusive())
+                flags |= START_INCLUSIVE_FLAG;
+            if (ab.isEndInclusive())
+                flags |= END_INCLUSIVE_FLAG;
+ret
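
The kindFlags method above packs three booleans into one int using bit flags. The diff is truncated before the matching read side, so the following is only a sketch of how such flags can be written and read back. The three flag values are copied from the diff; the encode and isSet helpers are hypothetical.

```java
// Flag values are taken from the diff; the helper methods are assumptions for illustration.
final class BoundsFlagsSketch
{
    static final int IS_TOKEN_FLAG        = 0x01;
    static final int START_INCLUSIVE_FLAG = 0x02;
    static final int END_INCLUSIVE_FLAG   = 0x04;

    // Packs the three properties into a single int, one bit each.
    static int encode(boolean isToken, boolean startInclusive, boolean endInclusive)
    {
        int flags = 0;
        if (isToken)
            flags |= IS_TOKEN_FLAG;
        if (startInclusive)
            flags |= START_INCLUSIVE_FLAG;
        if (endInclusive)
            flags |= END_INCLUSIVE_FLAG;
        return flags;
    }

    // Tests a single bit of the packed value.
    static boolean isSet(int flags, int flag)
    {
        return (flags & flag) != 0;
    }

    public static void main(String[] args)
    {
        // A token-based bound that excludes its start and includes its end.
        int flags = encode(true, false, true);
        System.out.println(flags);                              // 5
        System.out.println(isSet(flags, START_INCLUSIVE_FLAG)); // false
        System.out.println(isSet(flags, END_INCLUSIVE_FLAG));   // true
    }
}
```

Unlike an ordinal-based kind value, each property occupies its own bit, so any combination of start and end inclusiveness can be represented without adding a new enum constant.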

[1/3] cassandra git commit: Fix paging with static

2015-08-05 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 e58b7df93 -> bd7d1198a


Fix paging with static

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6aa83990
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6aa83990
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6aa83990

Branch: refs/heads/cassandra-3.0
Commit: 6aa83990530dbfe5e8a2c3a194c4dcbb3ffd4b59
Parents: e58b7df
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:14:26 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:18:12 2015 +0200

--
 .../service/pager/AbstractQueryPager.java   | 54 
 .../service/pager/RangeNamesQueryPager.java |  6 +++
 .../service/pager/RangeSliceQueryPager.java |  6 +++
 .../service/pager/SinglePartitionPager.java |  6 +++
 4 files changed, 61 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6aa83990/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
--
diff --git a/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java b/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
index 2c16ace..9991277 100644
--- a/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
+++ b/src/java/org/apache/cassandra/service/pager/AbstractQueryPager.java
@@ -17,6 +17,8 @@
  */
 package org.apache.cassandra.service.pager;
 
+import java.util.NoSuchElementException;
+
 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.*;
 import org.apache.cassandra.db.rows.*;
@@ -79,6 +81,9 @@ abstract class AbstractQueryPager implements QueryPager
 
         private Row lastRow;
 
+        private boolean isFirstPartition = true;
+        private RowIterator nextPartition;
+
         private PagerIterator(PartitionIterator iter, DataLimits pageLimits, int nowInSec)
         {
             super(iter, pageLimits, nowInSec);
@@ -86,30 +91,56 @@ abstract class AbstractQueryPager implements QueryPager
         }
 
         @Override
-        @SuppressWarnings("resource") // iter is closed by closing the result
-        public RowIterator next()
+        @SuppressWarnings("resource") // iter is closed by closing the result or in close()
+        public boolean hasNext()
         {
-            RowIterator iter = super.next();
-            try
+            while (nextPartition == null && super.hasNext())
             {
-                DecoratedKey key = iter.partitionKey();
+                if (nextPartition == null)
+                    nextPartition = super.next();
+
+                DecoratedKey key = nextPartition.partitionKey();
                 if (lastKey == null || !lastKey.equals(key))
                     remainingInPartition = limits.perPartitionCount();
 
                 lastKey = key;
-                return new RowPagerIterator(iter);
-            }
-            catch (RuntimeException e)
-            {
-                iter.close();
-                throw e;
+
+                // If this is the first partition of this page, this could be the continuation of a partition we've started
+                // on the previous page. In which case, we could have the problem that the partition has no more "regular"
+                // rows (but the page size is such we didn't knew before) but it does has a static row. We should then skip
+                // the partition as returning it would means to the upper layer that the partition has "only" static columns,
+                // which is not the case (and we know the static results have been sent on the previous page).
+                if (isFirstPartition && isPreviouslyReturnedPartition(key) && !nextPartition.hasNext())
+                {
+                    nextPartition.close();
+                    nextPartition = null;
+                }
+
+                isFirstPartition = false;
             }
+            return nextPartition != null;
+        }
+
+        @Override
+        @SuppressWarnings("resource") // iter is closed by closing the result
+        public RowIterator next()
+        {
+            if (!hasNext())
+                throw new NoSuchElementException();
+
+            RowIterator toReturn = nextPartition;
+            nextPartition = null;
+
+            return new RowPagerIterator(toReturn);
         }
 
         @Override
         public void close()
         {
             super.close();
+            if (nextPartition != null)
+                nextPartition.close();
+
             recordLast(lastKey, lastRow);
 
             int counted = counter.counted();
@@ -158,4 +189,5 @@ abstract class AbstractQueryPager implem

[2/3] cassandra git commit: Fix unecessary and broken (when reversed) condition

2015-08-05 Thread blerer
Fix unecessary and broken (when reversed) condition

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/028df729
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/028df729
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/028df729

Branch: refs/heads/cassandra-3.0
Commit: 028df729b7bc0f63359990d9cc7ebb7d653232d5
Parents: 6aa8399
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:19:32 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:19:32 2015 +0200

--
 .../apache/cassandra/db/filter/ClusteringIndexNamesFilter.java  | 5 -
 1 file changed, 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/028df729/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
--
diff --git a/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java b/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
index a6f2179..f2c81a7 100644
--- a/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
+++ b/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java
@@ -80,11 +80,6 @@ public class ClusteringIndexNamesFilter extends AbstractClusteringIndexFilter
 
     public ClusteringIndexNamesFilter forPaging(ClusteringComparator comparator, Clustering lastReturned, boolean inclusive)
     {
-        // TODO: Consider removal of the initial check.
-        int cmp = comparator.compare(lastReturned, clusteringsInQueryOrder.first());
-        if (cmp < 0 || (inclusive && cmp == 0))
-            return this;
-
         NavigableSet newClusterings = reversed ?
                                       clusterings.headSet(lastReturned, inclusive) :
                                       clusterings.tailSet(lastReturned, inclusive);



[3/3] cassandra git commit: Fix serialization of AbstractBounds

2015-08-05 Thread blerer
Fix serialization of AbstractBounds

patch by Sylvain Lebresne; reviewed by Benjamin Lerer for CASSANDRA-9775


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bd7d1198
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bd7d1198
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bd7d1198

Branch: refs/heads/cassandra-3.0
Commit: bd7d1198ac1e02785e912c7cfbb504ddaab6bb93
Parents: 028df72
Author: Sylvain Lebresne 
Authored: Wed Aug 5 12:21:24 2015 +0200
Committer: blerer 
Committed: Wed Aug 5 12:21:24 2015 +0200

--
 src/java/org/apache/cassandra/db/DataRange.java | 14 -
 .../apache/cassandra/dht/AbstractBounds.java| 58 +---
 2 files changed, 61 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/bd7d1198/src/java/org/apache/cassandra/db/DataRange.java
--
diff --git a/src/java/org/apache/cassandra/db/DataRange.java b/src/java/org/apache/cassandra/db/DataRange.java
index 023f572..79b2448 100644
--- a/src/java/org/apache/cassandra/db/DataRange.java
+++ b/src/java/org/apache/cassandra/db/DataRange.java
@@ -42,7 +42,7 @@ public class DataRange
 {
     public static final Serializer serializer = new Serializer();
 
-    private final AbstractBounds keyRange;
+    protected final AbstractBounds keyRange;
     protected final ClusteringIndexFilter clusteringIndexFilter;
 
     /**
@@ -201,7 +201,7 @@ public class DataRange
      * @param range the range of partition keys to query.
      * @param comparator the comparator for the table queried.
      * @param lastReturned the clustering for the last result returned by the previous page, i.e. the result we want to start our new page
-     * from. This last returned must must correspond to left bound of {@code range} (in other words, {@code range.left} must be the
+     * from. This last returned must correspond to left bound of {@code range} (in other words, {@code range.left} must be the
      * partition key for that {@code lastReturned} result).
      * @param inclusive whether or not we want to include the {@code lastReturned} in the newly returned page of results.
      *
@@ -354,6 +354,16 @@ public class DataRange
         {
             return false;
         }
+
+        @Override
+        public String toString(CFMetaData metadata)
+        {
+            return String.format("range=%s pfilter=%s lastReturned=%s (%s)",
+                                 keyRange.getString(metadata.getKeyValidator()),
+                                 clusteringIndexFilter.toString(metadata),
+                                 lastReturned.toString(metadata),
+                                 inclusive ? "included" : "excluded");
+        }
     }
 
     public static class Serializer

http://git-wip-us.apache.org/repos/asf/cassandra/blob/bd7d1198/src/java/org/apache/cassandra/dht/AbstractBounds.java
--
diff --git a/src/java/org/apache/cassandra/dht/AbstractBounds.java b/src/java/org/apache/cassandra/dht/AbstractBounds.java
index d9a0c62..9e74eb8 100644
--- a/src/java/org/apache/cassandra/dht/AbstractBounds.java
+++ b/src/java/org/apache/cassandra/dht/AbstractBounds.java
@@ -27,6 +27,7 @@ import org.apache.cassandra.db.PartitionPosition;
 import org.apache.cassandra.db.TypeSizes;
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.utils.Pair;
 
 public abstract class AbstractBounds> implements Serializable
@@ -119,8 +120,13 @@ public abstract class AbstractBounds> implements Seria
 
     public static class AbstractBoundsSerializer> implements IPartitionerDependentSerializer>
     {
+        private static final int IS_TOKEN_FLAG        = 0x01;
+        private static final int START_INCLUSIVE_FLAG = 0x02;
+        private static final int END_INCLUSIVE_FLAG   = 0x04;
+
         IPartitionerDependentSerializer serializer;
 
+        // Use for pre-3.0 protocol
         private static int kindInt(AbstractBounds ab)
         {
             int kind = ab instanceof Range ? Type.RANGE.ordinal() : Type.BOUNDS.ordinal();
@@ -129,6 +135,19 @@ public abstract class AbstractBounds> implements Seria
             return kind;
         }
 
+        // For from 3.0 onwards
+        private static int kindFlags(AbstractBounds ab)
+        {
+            int flags = 0;
+            if (ab.left instanceof Token)
+                flags |= IS_TOKEN_FLAG;
+            if (ab.isStartInclusive())
+                flags |= START_INCLUSIVE_FLAG;
+            if (ab.isEndInclusive())
+                flags |= END_INCLUSIVE_FLAG;
+   

[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655230#comment-14655230
 ] 

Stefania commented on CASSANDRA-5220:
-

As you prefer. Ideally you would implement them, but if you are busy I can 
implement them myself and you can review afterwards. Just let me know.




> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-05 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655181#comment-14655181
 ] 

Marcus Olsson commented on CASSANDRA-5220:
--

Nice, would you like me to take care of the main points/nits/comments as well 
or would you rather fix them yourself?

Regarding the main points:

#2 For MerkleTrees serialization, I guess we could drop the range, serialize 
just the MerkleTree instances, and use the fullRange.

#3 I guess I missed that option, it should probably be possible to use TreeMap 
instead.

#4 I don't think the token ranges should overlap, so a few assertions could be 
useful.
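
One possible shape for the assertion mentioned in #4, sketched under loud assumptions: tokens are plain longs, each range is read as (left, right], wrap-around ranges are ignored, and the ranges are kept in a TreeMap keyed by their left bound as suggested in #3. The class and method names are hypothetical and do not come from the patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: ranges are stored as left bound -> right bound, read as (left, right].
final class RangeOverlapSketch
{
    // Because the TreeMap iterates in order of left bound, it is enough to compare
    // each range's left bound with the right bound of the range just before it.
    static boolean hasOverlap(TreeMap<Long, Long> ranges)
    {
        Long previousRight = null;
        for (Map.Entry<Long, Long> range : ranges.entrySet())
        {
            if (previousRight != null && range.getKey() < previousRight)
                return true;
            previousRight = range.getValue();
        }
        return false;
    }

    public static void main(String[] args)
    {
        TreeMap<Long, Long> adjacent = new TreeMap<>();
        adjacent.put(0L, 10L);
        adjacent.put(10L, 20L);
        System.out.println(hasOverlap(adjacent)); // false

        TreeMap<Long, Long> overlapping = new TreeMap<>();
        overlapping.put(0L, 10L);
        overlapping.put(5L, 20L);
        System.out.println(hasOverlap(overlapping)); // true
    }
}
```

A call such as `assert !hasOverlap(ranges);` could then guard the place where the trees are built.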

> Repair improvements when using vnodes
> -
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.2.0 beta 1
>Reporter: Brandon Williams
>Assignee: Marcus Olsson
>  Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
> cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
> cassandra-3.0-5220.patch
>
>
> Currently when using vnodes, repair takes much longer to complete than 
> without them.  This appears at least in part because it's using a session per 
> range and processing them sequentially.  This generates a lot of log spam 
> with vnodes, and while being gentler and lighter on hard disk deployments, 
> ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Resolved] (CASSANDRA-9600) DescriptorTypeTidy and GlobalTypeTidy do not benefit from being full fledged Ref instances

2015-08-05 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-9600.
-
Resolution: Won't Fix

We've already removed DescriptorTypeTidy, and I don't think it's worth worrying 
about GlobalTypeTidy any time soon.

> DescriptorTypeTidy and GlobalTypeTidy do not benefit from being full fledged 
> Ref instances
> --
>
> Key: CASSANDRA-9600
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9600
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Minor
>
> These inner SSTableReader tidying classes do not benefit from being a full 
> fledged Ref because they are managed in such a small scope. This increases 
> our surface area to problems such as CASSANDRA-9549 (these were the affected 
> instances, ftr).
> .





[jira] [Resolved] (CASSANDRA-9925) Sorted + Unique BTree.Builder usage can use a thread-local TreeBuilder directly

2015-08-05 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-9925.
-
Resolution: Duplicate

> Sorted + Unique BTree.Builder usage can use a thread-local TreeBuilder 
> directly
> ---
>
> Key: CASSANDRA-9925
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9925
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
>Priority: Minor
> Fix For: 3.0.x
>
>
> There are now a number of situations where we use a BTree.Builder that could 
> be made to go directly to the TreeBuilder, since they only perform in-order 
> unique additions to build an initial tree. This would potentially avoid a 
> number of array allocations/copies.





[jira] [Comment Edited] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-05 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655149#comment-14655149
 ] 

Benedict edited comment on CASSANDRA-8630 at 8/5/15 10:34 AM:
--

bq. For the fast path, the built-in BB methods should still be faster, right?

Right.

bq. readByte() would result in one unsafe get call per byte.

The unsafe calls here are all intrinsics, but still - even for fully inlined 
method calls and unrolled loop we're talking something like 24x the work, but 
then we have the virtual invocation costs involved (the behaviour of which for 
a sequence of 8 identical calls I'm not certain - I would hope there is some 
sharing of the method call burden through loop unrolling, but I don't count on 
it), and we are probably 100x+ using rigorous finger-in-air maths.

bq. Do we care about little-endian ordering as well

Good point. I think we may actually depend on this in MemoryInputStream for 
some classes that were persisted in weird byte order. However I'm tempted to 
say we should start pushing this upstream to the places that care, as there are 
few, and we basically consider them all broken. It's not the first time this 
has caused annoyance. I'd be tempted to just forbid it, and patch up the few 
places that need it. (My opinion, only, and if it looks like a hassle we can 
just add support here)


was (Author: benedict):
bq. For the fast path, the built-in BB methods should still be faster, right?

Right.

bq. readByte() would result in one unsafe get call per byte.

The unsafe calls here are all intrinsics, but still - even for fully inlined 
method calls and unrolled loop we're talking something like 24x the work, but 
then we have the virtual invocation costs involved (the behaviour of which for 
a sequence of 8 identical calls I'm not certain - I would hope there is some 
sharing of the method call burden through loop unrolling, but I don't count on 
it), and we are probably 100x+ using rigorous finger-in-air maths.

bq. Do we care about little-endian ordering as well

Good point. I think we may actually depend on this in MemoryInputStream for 
some classes that were persisted in weird byte order. However I'm tempted to 
say we should start pushing this upstream to the places that care, as there are 
few, and we basically consider them all broken. It's not the first time this 
has caused annoyance. I'd be tempted to just forbid it, and patch up the few 
places that need it.

> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented with numerous 
> byte-by-byte read and write calls. 
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods in 
> either way gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequencialWriter.write methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from production; orange is niced CPU 
> load, i.e. compaction; yellow is user, i.e. tasks not related to compaction)




[jira] [Comment Edited] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-05 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655149#comment-14655149
 ] 

Benedict edited comment on CASSANDRA-8630 at 8/5/15 10:33 AM:
--

bq. For the fast path, the built-in BB methods should still be faster, right?

Right.

bq. readByte() would result in one unsafe get call per byte.

The unsafe calls here are all intrinsics, but still - even for fully inlined 
method calls and unrolled loop we're talking something like 24x the work, but 
then we have the virtual invocation costs involved (the behaviour of which for 
a sequence of 8 identical calls I'm not certain - I would hope there is some 
sharing of the method call burden through loop unrolling, but I don't count on 
it), and we are probably 100x+ using rigorous finger-in-air maths.

bq. Do we care about little-endian ordering as well

Good point. I think we may actually depend on this in MemoryInputStream for 
some classes that were persisted in weird byte order. However I'm tempted to 
say we should start pushing this upstream to the places that care, as there are 
few, and we basically consider them all broken. It's not the first time this 
has caused annoyance. I'd be tempted to just forbid it, and patch up the few 
places that need it.



> Faster sequential IO (on compaction, streaming, etc)
> 
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.), a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented with numerous 
> byte-by-byte read and write calls. 
> This makes a lot of syscalls as well.
> A quick microbenchmark shows that just reimplementing these methods in 
> either way gives an 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster for compressed ones.
> A deployment to production shows much less CPU load for compaction. 
> (I attached a CPU load graph from one of our production systems; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)
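
The byte-by-byte pattern described in the issue can be avoided by fetching all 
eight bytes with one bulk request and decoding them in memory. The following is 
only a sketch of that idea under stated assumptions, not the attached patch: 
the class and method names are hypothetical, and it uses plain 
java.io.RandomAccessFile rather than Cassandra's RandomAccessReader.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

// Hypothetical illustration; names are not from Cassandra or the patch.
final class BulkReadSketch {
    // Ask for all eight bytes in a single readFully() call (typically one
    // underlying read), then decode the big-endian value in memory.
    static long readLongBulk(RandomAccessFile raf) throws IOException {
        byte[] scratch = new byte[8];
        raf.readFully(scratch);
        return ByteBuffer.wrap(scratch).getLong();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("bulk-read-sketch", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.writeLong(123456789L);
            raf.seek(0);
            long viaDefault = raf.readLong();   // built-in DataInput path
            raf.seek(0);
            long viaBulk = readLongBulk(raf);   // bulk path from this sketch
            System.out.println(viaDefault == viaBulk);
        }
    }
}
```

A real reader would reuse the scratch buffer, or read from a larger buffered 
region, rather than allocate a new array per call.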





[jira] [Commented] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-08-05 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14655149#comment-14655149
 ] 

Benedict commented on CASSANDRA-8630:
-

bq. For the fast path, the built-in BB methods should still be faster, right?

Right.

bq. readByte() would result in one unsafe get call per byte.

The unsafe calls here are all intrinsics, but still: even with fully inlined 
method calls and an unrolled loop we're talking something like 24x the work. On 
top of that come the virtual invocation costs (I'm not certain how these behave 
for a sequence of 8 identical calls; I would hope loop unrolling lets them share 
some of the method call burden, but I don't count on it), so we are probably at 
100x+ using rigorous finger-in-air maths.






[jira] [Created] (CASSANDRA-9991) Implement efficient btree removal

2015-08-05 Thread Benedict (JIRA)
Benedict created CASSANDRA-9991:
---

 Summary: Implement efficient btree removal
 Key: CASSANDRA-9991
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9991
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Benedict
 Fix For: 3.x


Currently removal is implemented as a reconstruction by filtering an iterator 
over the original btree. This could be much more efficient if it edited just 
the necessary nodes. 
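
As a rough sketch of what "editing just the necessary nodes" could look like, 
the toy example below removes a key from a two-level tree by rewriting only the 
affected leaf and the root, and sharing every other leaf by reference. It is an 
assumption-laden illustration, not Cassandra's BTree: the class name and layout 
are hypothetical, keys are assumed unique and sorted, and rebalancing and empty 
leaves are ignored.

```java
import java.util.Arrays;

// Hypothetical illustration; not Cassandra's BTree implementation.
final class PathCopySketch {
    // The root is an array of sorted leaves; leaves are in ascending order.
    static int[][] remove(int[][] root, int key) {
        for (int n = 0; n < root.length; n++) {
            int[] leaf = root[n];
            // Skip leaves whose largest key is still below the target.
            if (leaf.length == 0 || leaf[leaf.length - 1] < key) continue;
            int i = Arrays.binarySearch(leaf, key);
            if (i < 0) return root; // key absent: nothing to rewrite
            // Copy the leaf without slot i.
            int[] edited = new int[leaf.length - 1];
            System.arraycopy(leaf, 0, edited, 0, i);
            System.arraycopy(leaf, i + 1, edited, i, leaf.length - i - 1);
            // Shallow copy of the root: untouched leaves are shared.
            int[][] newRoot = root.clone();
            newRoot[n] = edited;
            return newRoot;
        }
        return root; // key larger than every stored key
    }

    public static void main(String[] args) {
        int[][] root = { {1, 2, 3}, {5, 6}, {8, 9} };
        int[][] out = remove(root, 6);
        System.out.println(Arrays.deepToString(out));
    }
}
```

Only one leaf and the root array are allocated per removal; the original tree 
is left unmodified, which keeps it safe to share with concurrent readers.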





[jira] [Created] (CASSANDRA-9990) Use TreeBuilder directly where possible

2015-08-05 Thread Benedict (JIRA)
Benedict created CASSANDRA-9990:
---

 Summary: Use TreeBuilder directly where possible
 Key: CASSANDRA-9990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9990
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.x


In cases where we iteratively build a btree with sorted and unique values, we 
can use a TreeBuilder directly and avoid unnecessary copies.




