[jira] [Resolved] (CASSANDRA-8567) Functions like length() and trim() on cqlsh table fields

2015-01-06 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-8567.
-
Resolution: Won't Fix

We might add some {{length}} and {{trim}} methods at some point, though even if 
we don't, UDFs in 3.0 will allow you to define them. But what you're really 
asking here is to be able to use functions in {{ORDER BY}}, and that is not 
going to happen. The reason is that we have no better strategy server side than 
to read everything and sort it in memory before returning to the client. Which 
1) is no faster than letting the client do the sorting itself and 2) doesn't 
work with paging at all, and thus is likely to OOM as soon as the amount of 
data to order is not small.

For those reasons, we prefer letting clients do the sorting themselves 
post-query if they want to (we understand that for cqlsh it's not as nice 
as you'd like, though if you really need quick and dirty post-query sorting, 
piping the output of cqlsh into sort (the unix utility) isn't terribly hard). 
Alternatively, if you need results in that sort order often, you should store 
the data in a table whose clustering key is {{length(trim(field))}}.

> Functions like length() and trim() on cqlsh table fields
> 
>
> Key: CASSANDRA-8567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8567
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
>Reporter: Rekha Joshi
>
> It would be nice to be able to order by length of field values.
> A function like length(field) and trim(field) on cqlsh
> To enable do something like say - select * from   where 
> = order by length(trim()) desc;
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2015-01-06 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265852#comment-14265852
 ] 

Sylvain Lebresne commented on CASSANDRA-8325:
-

For the sake of clarification, being from DataStax doesn't entitle anyone to 
make calls regarding Apache Cassandra (having a good track record of 
contributions, which usually means you're a committer, is what entitles you to 
make calls, though decisions are ideally agreed upon collectively).

> Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
> -
>
> Key: CASSANDRA-8325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
> Project: Cassandra
>  Issue Type: Bug
> Environment: FreeBSD 10.0 with openjdk version "1.7.0_71", 64-Bit 
> Server VM
>Reporter: Leonid Shalupov
> Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt, 
> untested_8325.patch
>
>
> See attached error file after JVM crash
> {quote}
> FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
> Jan 16 22:34:59 UTC 2014 
> r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
> {quote}
> {quote}
>  % java -version
> openjdk version "1.7.0_71"
> OpenJDK Runtime Environment (build 1.7.0_71-b14)
> OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-01-06 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-8547:

Reviewer: Sylvain Lebresne  (was: Benedict)

> Make RangeTombstone.Tracker.isDeleted() faster
> --
>
> Key: CASSANDRA-8547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: 2.0.11
>Reporter: Dominic Letz
>Assignee: Dominic Letz
> Fix For: 2.1.3
>
> Attachments: Selection_044.png, cassandra-2.0.11-8547.txt, 
> cassandra-2.1-8547.txt, rangetombstone.tracker.txt
>
>
> During compactions and repairs with many tombstones, an exorbitant amount of 
> time is spent in RangeTombstone.Tracker.isDeleted().
> The amount of time spent there can be so large that compactions and repairs 
> look "stalled", with the estimated time remaining frozen at the same value 
> for days.
> Using VisualVM I have been sample-profiling the code during execution, both 
> during compaction and during repairs, and found this (point-in-time 
> backtraces attached).
> Looking at the code the problem is obviously the linear scanning:
> {code}
> public boolean isDeleted(Column column)
> {
> for (RangeTombstone tombstone : ranges)
> {
> if (comparator.compare(column.name(), tombstone.min) >= 0
> && comparator.compare(column.name(), tombstone.max) <= 0
> && tombstone.maxTimestamp() >= column.timestamp())
> {
> return true;
> }
> }
> return false;
> }
> {code}
> I would like to propose changing this to use a sorted list (e.g. 
> RangeTombstoneList) here instead.
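
(Purely illustrative, not from this ticket or its attached patches; the names 
below are hypothetical.) Assuming the ranges are kept sorted by their start and 
non-overlapping, which is the point of moving to a structure like 
RangeTombstoneList, the linear scan above could become a binary search for the 
last range starting at or before the column:

{code}
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a tombstone range with inclusive bounds and a deletion timestamp.
final class TombstoneRange
{
    final long min, max, maxTimestamp;

    TombstoneRange(long min, long max, long maxTimestamp)
    {
        this.min = min;
        this.max = max;
        this.maxTimestamp = maxTimestamp;
    }
}

final class SortedTombstones
{
    // Assumption: ranges are kept sorted by min and non-overlapping.
    private final List<TombstoneRange> ranges = new ArrayList<>();

    boolean isDeleted(long name, long columnTimestamp)
    {
        // Binary search for the last range whose min <= name: O(log N) instead of O(N).
        int lo = 0, hi = ranges.size() - 1, candidate = -1;
        while (lo <= hi)
        {
            int mid = (lo + hi) >>> 1;
            if (ranges.get(mid).min <= name)
            {
                candidate = mid;
                lo = mid + 1;
            }
            else
            {
                hi = mid - 1;
            }
        }
        if (candidate < 0)
            return false;
        TombstoneRange r = ranges.get(candidate);
        return name <= r.max && r.maxTimestamp >= columnTimestamp;
    }
}
{code}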



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8547) Make RangeTombstone.Tracker.isDeleted() faster

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265891#comment-14265891
 ] 

Benedict commented on CASSANDRA-8547:
-

This looks like something [~slebresne] should take a look at.

> Make RangeTombstone.Tracker.isDeleted() faster
> --
>
> Key: CASSANDRA-8547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: 2.0.11
>Reporter: Dominic Letz
>Assignee: Dominic Letz
> Fix For: 2.1.3
>
> Attachments: Selection_044.png, cassandra-2.0.11-8547.txt, 
> cassandra-2.1-8547.txt, rangetombstone.tracker.txt
>
>
> During compactions and repairs with many tombstones, an exorbitant amount of 
> time is spent in RangeTombstone.Tracker.isDeleted().
> The amount of time spent there can be so large that compactions and repairs 
> look "stalled", with the estimated time remaining frozen at the same value 
> for days.
> Using VisualVM I have been sample-profiling the code during execution, both 
> during compaction and during repairs, and found this (point-in-time 
> backtraces attached).
> Looking at the code the problem is obviously the linear scanning:
> {code}
> public boolean isDeleted(Column column)
> {
> for (RangeTombstone tombstone : ranges)
> {
> if (comparator.compare(column.name(), tombstone.min) >= 0
> && comparator.compare(column.name(), tombstone.max) <= 0
> && tombstone.maxTimestamp() >= column.timestamp())
> {
> return true;
> }
> }
> return false;
> }
> {code}
> I would like to propose changing this to use a sorted list (e.g. 
> RangeTombstoneList) here instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265898#comment-14265898
 ] 

Benedict commented on CASSANDRA-8552:
-

[~philipthompson] could we try to reproduce this ourselves, since Brent has 
navigated around the problem for the moment? It would be great to figure this 
out and fix it for 2.1.3.

> Large compactions run out of off-heap RAM
> -
>
> Key: CASSANDRA-8552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8552
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu 14.4 
> AWS EC2
> 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)]
> Java build 1.7.0_55-b13 and build 1.8.0_25-b17
>Reporter: Brent Haines
>Assignee: Benedict
>Priority: Blocker
> Fix For: 2.1.3
>
> Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, fhandles.log, 
> freelog.log, lsof.txt, meminfo.txt, sysctl.txt, system.log
>
>
> We have a large table storing, effectively, event logs, and a pair of 
> denormalized tables for indexing.
> When updating from 2.0 to 2.1 we saw performance improvements, but also some 
> random and silent crashes during nightly repairs. We lost a node (totally 
> corrupted) and replaced it. That node has never stabilized -- it simply can't 
> finish the compactions. 
> Smaller compactions finish. Larger compactions, like these two, never finish: 
> {code}
> pending tasks: 48
>compaction type   keyspace table completed total   
>  unit   progress
> Compaction   data   stories   16532973358   75977993784   
> bytes 21.76%
> Compaction   data   stories_by_text   10593780658   38555048812   
> bytes 27.48%
> Active compaction remaining time :   0h10m51s
> {code}
> We are not getting exceptions and are not running out of heap space. The 
> Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
> We watch memory in the opscenter console and it will grow. If we turn off the 
> OOM killer for the process, it will run until everything else is killed 
> instead and then the kernel panics.
> We have the following settings configured: 
> 2G Heap
> 512M New
> {code}
> memtable_heap_space_in_mb: 1024
> memtable_offheap_space_in_mb: 1024
> memtable_allocation_type: heap_buffers
> commitlog_total_space_in_mb: 2048
> concurrent_compactors: 1
> compaction_throughput_mb_per_sec: 128
> {code}
> The compaction strategy is leveled (these are read-intensive tables that are 
> rarely updated).
> I have tried every setting and every option, and I have the system to where 
> the MTBF is about an hour now, but we never finish compacting because there 
> are some large compactions pending. None of the GC tools or settings help 
> because it is not a GC problem. It is an off-heap memory problem.
> We are getting these messages in our syslog 
> {code}
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in 
> process java  pte:0320 pmd:2d6fa5067
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 
> vm_flags:0870 anon_vma:  (null) mapping:  (null) 
> index:7fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 
> Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559]  880028510e40 
> 88020d43da98 81715ac4 7fb820be3000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565]  88020d43dae0 
> 81174183 0320 0007fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568]  8802d6fa5f18 
> 0320 7fb820be3000 7fb820be4000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584]  [] 
> dump_stack+0x45/0x56
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591]  [] 
> print_bad_pte+0x1a3/0x250
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594]  [] 
> vm_normal_page+0x69/0x80
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598]  [] 
> unmap_page_range+0x3bb/0x7f0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602]  [] 
> unmap_single_vma+0x81/0xf0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605]  [] 
> unmap_vmas+0x49/0x90
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610]  [] 
> exit_mmap+0x9c/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617]  [] 
> ? __delayacct_add_tsk+0x153/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621]  [] 
> mmput+0x5c/0x120
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625]  [] 
> do_exit+0x26c/0xa50
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219631]  [] 
> ? __unqueue_futex+0x31/0x60
> Jan  2 07:06:00 ip-1

[jira] [Commented] (CASSANDRA-6246) EPaxos

2015-01-06 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265909#comment-14265909
 ] 

sankalp kohli commented on CASSANDRA-6246:
--

I am a little confused as to how you will use the epoch to make sure instances 
are executed on all replicas when incrementing it.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-Paxos is 
> that Multi-Paxos requires leader election and hence a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that (1) requires fewer messages than Multi-Paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265911#comment-14265911
 ] 

Benedict commented on CASSANDRA-8325:
-

[~graham sanderson] thanks for the New Year cheer and putting together another 
great patch.

My slight concern with this approach is that it pollutes the instruction cache, 
so we do need to run some performance tests. These codepaths _are_ pretty 
sensitive to these effects.

Ultimately, I think a follow-up ticket that performs all of this work natively 
might be sensible, since we should not need such code duplication. 

> Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
> -
>
> Key: CASSANDRA-8325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
> Project: Cassandra
>  Issue Type: Bug
> Environment: FreeBSD 10.0 with openjdk version "1.7.0_71", 64-Bit 
> Server VM
>Reporter: Leonid Shalupov
> Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt, 
> untested_8325.patch
>
>
> See attached error file after JVM crash
> {quote}
> FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
> Jan 16 22:34:59 UTC 2014 
> r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
> {quote}
> {quote}
>  % java -version
> openjdk version "1.7.0_71"
> OpenJDK Runtime Environment (build 1.7.0_71-b14)
> OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8546) RangeTombstoneList becoming bottleneck on tombstone heavy tasks

2015-01-06 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-8546:

Assignee: (was: Benedict)

> RangeTombstoneList becoming bottleneck on tombstone heavy tasks
> ---
>
> Key: CASSANDRA-8546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8546
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: 2.0.11 / 2.1
>Reporter: Dominic Letz
> Fix For: 2.1.3
>
> Attachments: cassandra-2.0.11-8546.txt, cassandra-2.1-8546.txt, 
> rangetombstonelist_compaction.png, rangetombstonelist_mutation.png, 
> rangetombstonelist_read.png, tombstone_test.tgz
>
>
> I would like to propose changing the data structure used in 
> RangeTombstoneList to store and insert tombstone ranges to something with at 
> least O(log N) insert in the middle and near O(1) insert at the start AND 
> end. Here is why:
> With tombstone-heavy workloads the current implementation of 
> RangeTombstoneList becomes a bottleneck for slice queries.
> Scanning up to the default maximum number of tombstones (100k) can take up 
> to 3 minutes because of how addInternal() scales on insertion of middle and 
> start elements.
> The attached test shows this with 50k deletes from both sides of a range:
> INSERT 1...11
> flush()
> DELETE 1...5
> DELETE 11...6
> While one direction performs ok (~400ms on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp DESC LIMIT 1
> {code}
> The other direction underperforms (~7 seconds on my notebook):
> {code}
> SELECT * FROM timeseries WHERE name = 'a' ORDER BY timestamp ASC LIMIT 1
> {code}
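
(Again purely illustrative, not code from the attached patches; the names are 
made up.) One possible shape for the proposed structure is a NavigableMap keyed 
by range start, which gives O(log N) insertion anywhere and cheap access at 
both ends; a real replacement would still have to merge and split overlapping 
ranges the way addInternal() does:

{code}
import java.util.Map;
import java.util.TreeMap;

final class RangeTombstoneMapSketch
{
    static final class Range
    {
        final long min, max, timestamp;

        Range(long min, long max, long timestamp)
        {
            this.min = min;
            this.max = max;
            this.timestamp = timestamp;
        }
    }

    // Keyed by range start: TreeMap gives O(log N) insert in the middle and
    // firstEntry()/lastEntry() at the ends, instead of shifting array elements.
    private final TreeMap<Long, Range> ranges = new TreeMap<>();

    void add(long min, long max, long timestamp)
    {
        // Placeholder: a real implementation must merge/split overlapping
        // ranges as RangeTombstoneList.addInternal() does; here we just insert.
        ranges.put(min, new Range(min, max, timestamp));
    }

    boolean isDeleted(long name, long columnTimestamp)
    {
        // The covering range, if any, is the one with the greatest start <= name.
        Map.Entry<Long, Range> e = ranges.floorEntry(name);
        return e != null && name <= e.getValue().max && e.getValue().timestamp >= columnTimestamp;
    }
}
{code}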



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8568) Impose new API on data tracker modifications that makes correct usage obvious and imposes safety

2015-01-06 Thread Benedict (JIRA)
Benedict created CASSANDRA-8568:
---

 Summary: Impose new API on data tracker modifications that makes 
correct usage obvious and imposes safety
 Key: CASSANDRA-8568
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8568
 Project: Cassandra
  Issue Type: Bug
Reporter: Benedict


DataTracker has become a bit of a quagmire, and not at all obvious to interface 
with, with many subtly different modifiers. I suspect it is still subtly 
broken, especially around error recovery.

I propose piggy-backing on CASSANDRA-7705 to offer RAII (and GC-enforced, for 
those situations where a try/finally block isn't possible) objects that have 
transactional behaviour, with a few simple declarative methods that can be 
composed to provide all of the functionality we currently need.

See CASSANDRA-8399 for context
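
A rough sketch of the kind of object this is describing (hypothetical names, 
not the API that will actually land): a try-with-resources handle where 
rollback on failure is the default and success must be declared explicitly.

{code}
// Hypothetical sketch: an RAII-style transactional handle. If commit() is never
// called, close() (via try-with-resources, or a GC-enforced fallback) rolls the
// modification back.
final class TrackerTransaction implements AutoCloseable
{
    private boolean committed;

    void markCompacting()
    {
        // register the affected sstables as compacting
    }

    void commit()
    {
        // atomically publish the new state
        committed = true;
    }

    @Override
    public void close()
    {
        if (!committed)
        {
            // undo any partial changes, e.g. unmark the compacting sstables
        }
    }
}

// Usage: cleanup is the default, success is explicit.
// try (TrackerTransaction txn = new TrackerTransaction())
// {
//     txn.markCompacting();
//     // ... do the work ...
//     txn.commit();
// }
{code}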



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-8506) Improve management of DataTracker, esp. compacting

2015-01-06 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict resolved CASSANDRA-8506.
-
Resolution: Duplicate

> Improve management of DataTracker, esp. compacting
> --
>
> Key: CASSANDRA-8506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
> Fix For: 3.0
>
>
> Building on CASSANDRA-7705, we can use debuggable ref counting to manage the 
> marking and unmarking of compaction state, so that we can quickly track down 
> errors.
> We should also simplify the logic wrt rewriters, by ignoring the descriptor 
> type, perhaps for all sets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257721#comment-14257721
 ] 

Robert Stupp edited comment on CASSANDRA-7438 at 1/6/15 10:08 AM:
--

I had the opportunity to test OHC on a big machine.
First: it works - very happy about that :)

Some things I want to notice:
* a high number of segments does not have any really measurable influence (the 
default of 2 * # of cores is fine)
* throughput heavily depends on serialization (hash entry size) - Java 8 gave 
about 10% to 15% improvement in some tests (either on {{Unsafe.copyMemory}} or 
something related like the JNI barrier)
* the number of entries per bucket stays pretty low with the default load 
factor of .75 - the vast majority have 0 or 1 entries, some 2 or 3, and a few 
up to 8

Issue (not solvable yet):
It works great for hash entries up to approx. 64kB, with good to great 
throughput. Above that barrier it works well at first, but after some time the 
system spends a huge amount of CPU time (~95%) in {{malloc()}} / {{free()}} 
(with jemalloc; Unsafe.allocate is not worth discussing at all on Linux).
I tried to add a "memory buffer cache" that caches freed hash entries for 
reuse, but it turned out that in the end it would be too complex if done right. 
The current implementation is still in the code, but must be explicitly enabled 
with a system property. Workloads with small entries and a high number of 
threads easily trigger the Linux OOM protection (which kills the process). 
Please note that it does work with large hash entries - but throughput drops 
dramatically to just a few thousand writes per second.

Some numbers (value sizes have a gaussian distribution). I had to do these 
tests in a hurry because I had to give back the machine. The code used during 
these tests is tagged as {{0.1-SNAP-Bench}} in git. Throughput is limited by 
{{malloc()}} / {{free()}}, and most tests only used 50% of the available CPU 
capacity (on _c3.8xlarge_ - 32 cores, Intel Xeon E5-2680v2 @2.8GHz, 64GB).
* -1k..200k value size, 32 threads, 1M keys, 90% read ratio, 32GB: 22k 
writes/sec, 200k reads/sec, ~8k evictions/sec, write: 8ms (99perc), read: 
3ms(99perc)-
* -1k..64k value size, 500 threads, 1M keys, 90% read ratio, 32GB: 55k 
writes/sec, 499k reads/sec, ~2k evictions/sec, write: .1ms (99perc), read: 
.03ms(99perc)-
* -1k..64k value size, 500 threads, 1M keys, 50% read ratio, 32GB: 195k 
writes/sec, 195k reads/sec, ~9k evictions/sec, write: .2ms (99perc), read: 
.1ms(99perc)-
* -1k..64k value size, 500 threads, 1M keys, 10% read ratio, 32GB: 185k 
writes/sec, 20k reads/sec, ~7k evictions/sec, write: 4ms (99perc), read: 
.07ms(99perc)-
* -1k..16k value size, 500 threads, 5M keys, 90% read ratio, 32GB: 110k 
writes/sec, 1M reads/sec, 30k evictions/sec, write: .04ms (99perc), read: 
.01ms(99perc)-
* -1k..16k value size, 500 threads, 5M keys, 50% read ratio, 32GB: 420k 
writes/sec, 420k reads/sec, 125k evictions/sec, write: .06ms (99perc), read: 
.01ms(99perc)-
* -1k..16k value size, 500 threads, 5M keys, 10% read ratio, 32GB: 435k 
writes/sec, 48k reads/sec, 130k evictions/sec, write: .06ms (99perc), read: 
.01ms(99perc)-
* -1k..4k value size, 500 threads, 20M keys, 90% read ratio, 32GB: 140k 
writes/sec, 1.25M reads/sec, 50k evictions/sec, write: .02ms (99perc), read: 
.005ms(99perc)-
* -1k..4k value size, 500 threads, 20M keys, 50% read ratio, 32GB: 530k 
writes/sec, 530k reads/sec, 220k evictions/sec, write: .04ms (99perc), read: 
.005ms(99perc)-
* -1k..4k value size, 500 threads, 20M keys, 10% read ratio, 32GB: 665k 
writes/sec, 74k reads/sec, 250k evictions/sec, write: .04ms (99perc), read: 
.005ms(99perc)-

Command line to execute the benchmark:
{code}
java -jar ohc-benchmark/target/ohc-benchmark-0.1-SNAPSHOT.jar -rkd 
'uniform(1..2000)' -wkd 'uniform(1..2000)' -vs 'gaussian(1024..4096,2)' 
-r .1 -cap 320 -d 86400 -t 500 -dr 8

-r = read rate
-d = duration
-t = # of threads
-dr = # of driver threads that feed the worker threads
-rkd = read key distribution
-wkd = write key distribution
-vs = value size
-cap = capacity
{code}

Sample bucket histogram from 20M test:
{code}
[0..0]: 8118604
[1..1]: 5892298
[2..2]: 2138308
[3..3]: 518089
[4..4]: 94441
[5..5]: 13672
[6..6]: 1599
[7..7]: 189
[8..9]: 16
{code}

After running into that memory management issue with varying allocation sizes 
of a few kB to several MB, I think it's still worth working on our own 
off-heap memory management. Maybe some block-based approach (fixed or 
variable). But that's out of the scope of this ticket.
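
To make that idea slightly more concrete (purely a sketch, nothing from OHC): a 
fixed-block approach could recycle direct buffers of a single size instead of 
going through {{malloc()}} / {{free()}} for every entry.

{code}
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of a fixed-size block pool: allocate blocks lazily, recycle them on
// release, and only fall back to the system allocator when the pool is empty.
final class FixedBlockPool
{
    private final int blockSize;
    private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();

    FixedBlockPool(int blockSize)
    {
        this.blockSize = blockSize;
    }

    ByteBuffer allocate()
    {
        ByteBuffer block = free.poll();
        return block != null ? block : ByteBuffer.allocateDirect(blockSize);
    }

    void release(ByteBuffer block)
    {
        block.clear();
        free.offer(block); // recycled instead of handed back to malloc()/free()
    }
}
{code}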

EDIT: The problem with high system-CPU usage only persists on systems with 
multiple CPU sockets. A cross check with the second CPU socket disabled - 
calling the benchmark with {{taskset 0x3ff java -jar ...}} - does not show 95% 
system CPU usage.

EDIT2: Marked benchmark values as invalid (see my comment on 01/

[jira] [Updated] (CASSANDRA-8194) Reading from Auth table should not be in the request path

2015-01-06 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli updated CASSANDRA-8194:
-
Assignee: Vishy Kasar

> Reading from Auth table should not be in the request path
> -
>
> Key: CASSANDRA-8194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8194
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Assignee: Vishy Kasar
>Priority: Minor
> Fix For: 2.0.12, 3.0
>
> Attachments: 8194-V2.patch, 8194.patch, CacheTest2.java
>
>
> We use PasswordAuthenticator and PasswordAuthorizer. The system_auth keyspace 
> has an RF of 10 per DC over 2 DCs. The permissions_validity_in_ms is 5 
> minutes. 
> We still have a few thousand requests failing each day with the trace below. 
> The reason for this is the read cache request realizing that the cached entry 
> has expired and doing a blocking request to refresh the cache. 
> We should have the cache refreshed periodically, only in the background. The 
> user request should simply look at the cache and not try to refresh it. 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:75)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
>   at 
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:256)
>   at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
>   at 
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
>   at 
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:68)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:278)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:275)
>   at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
>   at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
>   at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
>   ... 19 more
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation 
> timed out - received only 0 responses.
>   at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:943)
>   at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:828)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:140)
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:245)
>   ... 28 more
> ERROR [Thrift:17232] 2014-10-24 05:06:51,004 CustomTThreadPoolServer.java 
> (line 224) Error occurred during pro

[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265947#comment-14265947
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

The latest (just checked in) benchmark implementation gives much better 
results. Using 
{{com.codahale.metrics.Timer#time(java.util.concurrent.Callable)}} 
eliminates the use of {{System.nanoTime()}} or 
{{ThreadMXBean.getCurrentThreadCpuTime()}} - the timer can directly use its 
internal clock.
The benchmark {{java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar 
-rkd 'gaussian(1..2000,2)' -wkd 'gaussian(1..2000,2)' -vs 
'gaussian(1024..4096,2)' -r .9 -cap 16 -d 30 -t 30}} improved from 800k 
reads to 3.3M reads per second (w/ 8 cores). So yes - the benchmark was 
measuring its own mad code. Because of that I edited my previous comment, 
since those benchmark results are now invalid.
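
For reference, the measurement pattern in question, sketched with a 
hypothetical cache call; {{Timer#time(Callable)}} records the elapsed time 
against the timer's own clock, so the benchmark loop itself never calls 
{{System.nanoTime()}}:

{code}
import java.util.concurrent.Callable;

import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.Timer;

final class ReadTimingSketch
{
    private final MetricRegistry registry = new MetricRegistry();
    private final Timer readTimer = registry.timer("reads");

    // Timer#time(Callable) times the call using the timer's internal clock;
    // no explicit System.nanoTime() / ThreadMXBean calls in the benchmark loop.
    byte[] timedRead(Callable<byte[]> cacheGet) throws Exception
    {
        return readTimer.time(cacheGet);
    }

    double read99thPercentileNanos()
    {
        return readTimer.getSnapshot().get99thPercentile();
    }
}
{code}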

I've added a (still simple) JMH benchmark as a separate module. This one can 
cause high system CPU usage - at operation rates of 2M per second or more (8 
cores). I think these rates are really fine.

Note: these rates cannot be achieved in production since then you'll obviously 
have to pay for (de)serialization, too.

So we want to address these topics as follow-up:
* own off-heap allocator
* C* ability to access off-heap cached rows
* C* ability to serialize hot keys directly from off-heap (might be a minor win 
since it's not triggered that often)
* per-table knob to control whether to add to the row-cache on writes -- I 
strongly believe that this is a useful feature (maybe LHF) on workloads where 
read and written data work on different (row) keys.
* investigate if counter-cache can benefit
* investigate if key-cache can benefit

bq. You could start with it outside and publish to Maven Central, and if there 
is an issue getting patches applied quickly we can always fork it in C*.
OK

bq. pluggable row cache
Then I'll start with that - just make row-cache pluggable and the 
implementation configurable.

Note: JNA has a synchronized block that's executed at every call - version 
4.2.0 fixes this (don't know when it will be released).

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Vijay
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap; keys are still stored in 
> the JVM heap as BBs: 
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing APIs (ICache), and the implementation 
> needs to have safe memory access, low memory overhead and fewer memcpys (as 
> much as possible).
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7438:

Reviewer: Ariel Weisberg  (was: Robert Stupp)
Assignee: Robert Stupp  (was: Vijay)

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Robert Stupp
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is partially off heap; keys are still stored in 
> the JVM heap as BBs: 
> * There are higher GC costs for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing APIs (ICache), and the implementation 
> needs to have safe memory access, low memory overhead and fewer memcpys (as 
> much as possible).
> We might also want to make this cache configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8163) Re-introduce DESCRIBE permission

2015-01-06 Thread Ben Laplanche (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265958#comment-14265958
 ] 

Ben Laplanche commented on CASSANDRA-8163:
--

The ability of users to see the table names of other users' keyspaces is 
blocking us from using a cluster in a multi-tenant setup, which for some 
clients, given the nature of their work, can be prohibitive to using Cassandra. 

> Re-introduce DESCRIBE permission
> 
>
> Key: CASSANDRA-8163
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8163
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Priority: Minor
>
> We have a cluster like this:
> project1_keyspace
> table101
> table102
> project2_keyspace
> table201
> table202
> We have set up the following users and grants:
> project1_user has all access to project1_keyspace 
> project2_user has all access to project2_keyspace
> However, project1_user can still do a 'describe schema' and get the schema 
> for project2_keyspace as well. We do not want project1_user to have any 
> knowledge of project2 in any way (cqlsh/java-driver etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265965#comment-14265965
 ] 

Benedict commented on CASSANDRA-8558:
-

This has nothing to do with either change as far as I can tell. Somewhere in 
between those two changes something else was presumably broken that is 
unrelated to either of them. It's possible that it was broken before either, in 
fact. Somewhere amongst them we changed drop behaviour to introduce flushing of 
dirty tables, and this flushing causes the problem. In fact we probably have a 
much worse problem than it appears.

I can elicit this behaviour with a simple call to nodetool flush. The deletion 
records are flushed to disk, and appear in their respective sstables, but are 
not being returned by the IndexedSliceReader that queries them. A new test to 
find this behaviour would be simpler: 

CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
CREATE  TABLE space1.table1(a int, b int, c text,primary key(a,b));
INSERT INTO space1.table1(a,b,c) VALUES(1,1,'1');
// nodetool flush
DELETE FROM space1.table3 where a=1 and b=1;
// nodetool flush
SELECT * FROM space1.table3 where a=1 and b=1;

This is much more fundamentally broken than this ticket suggests, but I'm 
probably not the best person to investigate, since it looks to be a problem 
with IndexedSliceReader. Hopefully a git bisect with the updated test will 
blame a suitable candidate next time around :)

(assuming it isn't somehow still my fault)

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Benedict
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-8558:

Assignee: Philip Thompson  (was: Benedict)

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Philip Thompson
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265965#comment-14265965
 ] 

Benedict edited comment on CASSANDRA-8558 at 1/6/15 12:12 PM:
--

This has nothing to do with either change as far as I can tell. Somewhere in 
between those two changes something else was presumably broken that is 
unrelated to either of them. It's possible that it was broken before either, in 
fact. Somewhere amongst them we changed drop behaviour to introduce flushing of 
dirty tables, and this flushing causes the problem. In fact we probably have a 
much worse problem than it appears.

I can elicit this behaviour with a simple call to nodetool flush. The deletion 
records are flushed to disk, and appear in their respective sstables, but are 
not being returned by the IndexedSliceReader that queries them. A new test to 
find this behaviour would be simpler: 

CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
CREATE  TABLE space1.table1(a int, b int, c text,primary key(a,b));
INSERT INTO space1.table1(a,b,c) VALUES(1,1,'1');
// nodetool flush
DELETE FROM space1.table1 where a=1 and b=1;
// nodetool flush
SELECT * FROM space1.table1 where a=1 and b=1;

This is much more fundamentally broken than this ticket suggests, but I'm 
probably not the best person to investigate, since it looks to be a problem 
with IndexedSliceReader. Hopefully a git bisect with the updated test will 
blame a suitable candidate next time around :)

(assuming it isn't somehow still my fault)


was (Author: benedict):
This has nothing to do with either change as far as I can tell. Somewhere in 
between those two changes something else was presumably broken that is 
unrelated to either of them. It's possible that it was broken before either, in 
fact. Somewhere amongst them we changed drop behaviour to introduce flushing of 
dirty tables, and this flushing causes the problem. In fact we probably have a 
much worse problem than it appears.

I can elicit this behaviour with a simple call to nodetool flush. The deletion 
records are flushed to disk, and appear in their respective sstables, but are 
not being returned by the IndexedSliceReader that queries them. A new test to 
find this behaviour would be simpler: 

CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': 1};
CREATE  TABLE space1.table1(a int, b int, c text,primary key(a,b));
INSERT INTO space1.table1(a,b,c) VALUES(1,1,'1');
// nodetool flush
DELETE FROM space1.table3 where a=1 and b=1;
// nodetool flush
SELECT * FROM space1.table3 where a=1 and b=1;

This is much more fundamentally broken than this ticket suggests, but I'm 
probably not the best person to investigate, since it looks to be a problem 
with IndexedSliceReader. Hopefully a git bisect with the updated test will 
blame a suitable candidate next time around :)

(assuming it isn't somehow still my fault)

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Philip Thompson
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7124) Use JMX Notifications to Indicate Success/Failure of Long-Running Operations

2015-01-06 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266033#comment-14266033
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-7124:
-

[~yukim], Did you get a chance to look at the changes made? Any comments? Thanks

> Use JMX Notifications to Indicate Success/Failure of Long-Running Operations
> 
>
> Key: CASSANDRA-7124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7124
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Tyler Hobbs
>Assignee: Rajanarayanan Thottuvaikkatumana
>Priority: Minor
>  Labels: lhf
> Fix For: 3.0
>
> Attachments: 7124-wip.txt, cassandra-trunk-compact-7124.txt, 
> cassandra-trunk-decommission-7124.txt
>
>
> If {{nodetool cleanup}} or some other long-running operation takes too long 
> to complete, you'll see an error like the one in CASSANDRA-2126, so you can't 
> tell if the operation completed successfully or not.  CASSANDRA-4767 fixed 
> this for repairs with JMX notifications.  We should do something similar for 
> nodetool cleanup, compact, decommission, move, relocate, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8194) Reading from Auth table should not be in the request path

2015-01-06 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266056#comment-14266056
 ] 

Sam Tunnicliffe commented on CASSANDRA-8194:


While there is a window during which a stale set of permissions is used, under 
normal operation I don't think this *should* present too many practical 
problems. 

Refresh is triggered by the first lookup after permissions_validity_in_ms, so 
we'll continue to use the stale set between that point and when that refresh 
actually completes. Outside of tests though, clients have no 
visibility/expectation about the precise load or expiry timings, so this 
shouldn't usually matter. My concern would be that performing every 
IAuthorizer.authorize call on a single thread using StorageService.tasks, 
instead of distributing them across client request threads, could cause a 
backlog and allow the window to grow unacceptably (plus, these tasks will also 
be contending with other users of the shared executor). 

The point about the proliferation of threads and executors is valid, but maybe 
there's a case for a dedicated executor here. We could make it a TPE with a 
default pool size of 1 but allow that to be increased via a system property if 
necessary.

What may be more of an issue is that we'll continue to serve the stale perms as 
long as the refresh fails completely due to IAuthorizer.authorize throwing some 
exception. This shouldn't really happen with CassandraAuthorizer, but other 
IAuthorizer impls could well encounter errors when fetching perms. To guard 
against that, we can force an invalidation if the ListenableFutureTask 
encounters an exception. That would pretty much maintain current behaviour, 
with the client receiving an error response while the refresh fails (actually, 
the authorize calls after an error would serve stale perms until the exception 
is thrown & caught, but all subsequent calls would fail as per current 
behaviour).
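
To illustrate the shape of that (a sketch only, not the attached patch; the 
names and Guava wiring here are illustrative): refreshes run off the request 
path on a dedicated executor, the stale value keeps being served while a 
refresh is in flight, and a failure callback on the reload task could 
invalidate the entry so subsequent calls fail as they do today.

{code}
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListenableFutureTask;

final class PermissionsCacheSketch
{
    // Dedicated executor for background refreshes; the pool size could be made
    // configurable via a system property as suggested above.
    private static final ExecutorService refreshExecutor = Executors.newSingleThreadExecutor();

    static LoadingCache<String, Set<String>> build(long validityMillis)
    {
        return CacheBuilder.newBuilder()
                           .refreshAfterWrite(validityMillis, TimeUnit.MILLISECONDS)
                           .build(new CacheLoader<String, Set<String>>()
                           {
                               public Set<String> load(String user) throws Exception
                               {
                                   return fetchPermissions(user); // blocking read of the auth table
                               }

                               public ListenableFuture<Set<String>> reload(final String user, Set<String> oldValue)
                               {
                                   // Refresh off the request path; the stale value keeps being
                                   // served until this task completes. A failure callback here
                                   // could invalidate the entry to preserve current behaviour.
                                   ListenableFutureTask<Set<String>> task =
                                       ListenableFutureTask.create(new Callable<Set<String>>()
                                       {
                                           public Set<String> call() throws Exception
                                           {
                                               return fetchPermissions(user);
                                           }
                                       });
                                   refreshExecutor.execute(task);
                                   return task;
                               }
                           });
    }

    // Stand-in for the actual IAuthorizer.authorize / auth table read.
    private static Set<String> fetchPermissions(String user) throws Exception
    {
        throw new UnsupportedOperationException("illustrative stub");
    }
}
{code}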

I've attached a v3 with this second change; what are your thoughts on reverting 
to a dedicated executor for cache refresh?

Also, as I mentioned, tests do have concrete expectations about expiry of 
permissions and so this breaks auth_test.py:TestAuth.permissions_caching_test. 
I've pushed a fix [here|https://github.com/beobal/cassandra-dtest/tree/8194] 
and I'll open a PR shortly.

> Reading from Auth table should not be in the request path
> -
>
> Key: CASSANDRA-8194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8194
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Assignee: Vishy Kasar
>Priority: Minor
> Fix For: 2.0.12, 3.0
>
> Attachments: 8194-V2.patch, 8194.patch, CacheTest2.java
>
>
> We use PasswordAuthenticator and PasswordAuthorizer. The system_auth keyspace 
> has an RF of 10 per DC over 2 DCs. The permissions_validity_in_ms is 5 
> minutes. 
> We still have a few thousand requests failing each day with the trace below. 
> The reason for this is the read cache request realizing that the cached entry 
> has expired and doing a blocking request to refresh the cache. 
> We should have the cache refreshed periodically, only in the background. The 
> user request should simply look at the cache and not try to refresh it. 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:75)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
>   at 
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
>   at org.ap

[jira] [Updated] (CASSANDRA-8194) Reading from Auth table should not be in the request path

2015-01-06 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-8194:
---
Attachment: 8194-V3.txt

> Reading from Auth table should not be in the request path
> -
>
> Key: CASSANDRA-8194
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8194
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Vishy Kasar
>Assignee: Vishy Kasar
>Priority: Minor
> Fix For: 2.0.12, 3.0
>
> Attachments: 8194-V2.patch, 8194-V3.txt, 8194.patch, CacheTest2.java
>
>
> We use PasswordAuthenticator and PasswordAuthorizer. The system_auth keyspace 
> has an RF of 10 per DC over 2 DCs. The permissions_validity_in_ms is 5 
> minutes. 
> We still have a few thousand requests failing each day with the trace below. 
> The reason for this is the read cache request realizing that the cached entry 
> has expired and doing a blocking request to refresh the cache. 
> We should have the cache refreshed periodically, only in the background. The 
> user request should simply look at the cache and not try to refresh it. 
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:292)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:172)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:165)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:149)
>   at 
> org.apache.cassandra.cql3.statements.ModificationStatement.checkAccess(ModificationStatement.java:75)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:102)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:113)
>   at 
> org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1735)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4162)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4150)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - 
> received only 0 responses.
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:256)
>   at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
>   at 
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
>   at 
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:68)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:278)
>   at org.apache.cassandra.service.ClientState$1.load(ClientState.java:275)
>   at 
> com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
>   at 
> com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
>   at 
> com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
>   ... 19 more
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation 
> timed out - received only 0 responses.
>   at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:105)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:943)
>   at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:828)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:140)
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:245)
>   ... 28 more
> ERROR [Thrift:17232] 2014-10-24 05:06:51,004 CustomTThreadPoolServer.java 
> (line 224) Error 

[jira] [Updated] (CASSANDRA-6983) DirectoriesTest fails when run as root

2015-01-06 Thread Alan Boudreault (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Boudreault updated CASSANDRA-6983:
---
Tester: Alan Boudreault

> DirectoriesTest fails when run as root
> --
>
> Key: CASSANDRA-6983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Brandon Williams
>Assignee: Ryan McGuire
>Priority: Minor
> Fix For: 2.0.12
>
>
> When you run the DirectoriesTest as a normal user, it passes because it fails 
> to create the 'bad' directory:
> {noformat}
> [junit] - Standard Error -
> [junit] ERROR 16:16:18,111 Failed to create 
> /tmp/cassandra4119802552776680052unittest/ks/bad directory
> [junit]  WARN 16:16:18,112 Blacklisting 
> /tmp/cassandra4119802552776680052unittest/ks/bad for writes
> [junit] -  ---
> {noformat}
> But when you run the test as root, it succeeds in making the directory, 
> causing an assertion failure that it's unwritable:
> {noformat}
> [junit] Testcase: 
> testDiskFailurePolicy_best_effort(org.apache.cassandra.db.DirectoriesTest):   
> FAILED
> [junit] 
> [junit] junit.framework.AssertionFailedError: 
> [junit] at 
> org.apache.cassandra.db.DirectoriesTest.testDiskFailurePolicy_best_effort(DirectoriesTest.java:199)
> {noformat}
> It seems to me that we shouldn't be relying on failing to make the 
> directory.  If we're just going to test a nonexistent dir, why try to make 
> one at all?  And if that is supposed to succeed, then we have a problem with 
> either the test or blacklisting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-6983) DirectoriesTest fails when run as root

2015-01-06 Thread Alan Boudreault (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Boudreault updated CASSANDRA-6983:
---
Attachment: 6983-v1.patch

From a non-root user perspective, the test is OK. We have to call 
Directories.create() if we want the internal Cassandra disk policy mechanism to 
be triggered. If we remove that fake directory creation, the test will fail 
since the directory is not marked as an unwritable BlacklistedDirectory. 

The real issue with running that test as the root user is that we cannot make 
the directory unwritable using Java. Internally, Java will simply re-set the 
write permission for the operation, since the root user is allowed to do 
everything. Please correct me if I'm wrong here about that Java behavior. 

At this point, we have two options to fix this test:

1.  Do not call Directories.create(), but rather set the directory unwritable 
manually via the BlacklistedDirectories class. I don't especially like this 
option since it doesn't really trigger the best_effort internal mechanism.  

2. Simulate a failed directory creation by throwing an exception. This will 
trigger the internal handleFSError path and the best_effort mechanism, and it 
works for the root user since we no longer touch the filesystem at all. I've 
attached a tentative patch with this solution; a rough sketch of the idea 
follows below.
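
Roughly, the shape of the idea (untested sketch, not the attached patch; the 
class and method names here are made up for illustration):

{code}
// Sketch only: instead of actually creating (or failing to create) the
// directory, fabricate the failure and feed it to the normal error-handling
// path, so the configured disk_failure_policy (best_effort) is exercised the
// same way for root and non-root users.
import java.io.File;
import java.io.IOException;

import org.apache.cassandra.io.FSWriteError;
import org.apache.cassandra.io.util.FileUtils;

public class SimulatedDiskFailure
{
    public static void markAsFailed(File dir)
    {
        // Pretend the directory creation failed with an I/O error...
        FSWriteError error = new FSWriteError(new IOException("simulated directory creation failure"), dir);
        // ...and let the standard handler apply the disk failure policy, which
        // for best_effort blacklists the directory for writes.
        FileUtils.handleFSError(error);
    }
}
{code}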

Let me know what you think or if you have any suggestions.

> DirectoriesTest fails when run as root
> --
>
> Key: CASSANDRA-6983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Brandon Williams
>Assignee: Ryan McGuire
>Priority: Minor
> Fix For: 2.0.12
>
> Attachments: 6983-v1.patch
>
>
> When you run the DirectoriesTest as a normal user, it passes because it fails 
> to create the 'bad' directory:
> {noformat}
> [junit] - Standard Error -
> [junit] ERROR 16:16:18,111 Failed to create 
> /tmp/cassandra4119802552776680052unittest/ks/bad directory
> [junit]  WARN 16:16:18,112 Blacklisting 
> /tmp/cassandra4119802552776680052unittest/ks/bad for writes
> [junit] -  ---
> {noformat}
> But when you run the test as root, it succeeds in making the directory, 
> causing an assertion failure that it's unwritable:
> {noformat}
> [junit] Testcase: 
> testDiskFailurePolicy_best_effort(org.apache.cassandra.db.DirectoriesTest):   
> FAILED
> [junit] 
> [junit] junit.framework.AssertionFailedError: 
> [junit] at 
> org.apache.cassandra.db.DirectoriesTest.testDiskFailurePolicy_best_effort(DirectoriesTest.java:199)
> {noformat}
> It seems to me that we shouldn't be relying on failing to make the 
> directory.  If we're just going to test a nonexistent dir, why try to make 
> one at all?  And if that is supposed to succeed, then we have a problem with 
> either the test or blacklisting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8303) Provide "strict mode" for CQL Queries

2015-01-06 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266101#comment-14266101
 ] 

Sam Tunnicliffe commented on CASSANDRA-8303:


It seems to me that this could be a reasonably natural fit as an extension to 
the existing authz system. If we do go down that route, there are a couple of 
things to consider.

What we're really talking about here are restrictions on the permissions a 
particular user may have been granted. For instance, you could grant SELECT on 
ks1.cf1 to user foo, but add a restriction such that multi-partition queries 
are not allowed. My first question, then, is: are there other restrictions that 
would be useful but which don't fit this definition?

For restrictions which do fit that definition, it seems sensible to use a 
resource hierarchy similar to the one for permissions, i.e. a restriction can 
be applied at the table level, at the keyspace level (which trickles down to 
apply to all tables in the ks), or at the root level. The alternative implied 
by the syntax in the previous comments is that a restriction should always be 
applied at the root level (i.e. for all keyspaces & tables).

Inheritance in that hierarchy of resources should be handled from the bottom 
up, working from the most specific permission/restriction. For instance, where 
user x is granted SELECT without any restriction on ks1, and the same user is 
also granted SELECT with the MULTI_PARTITION restriction on ks1.cf1, it seems 
clear that MP queries from user x should only be disallowed on that one table, 
not on all tables in ks1. Also, the unrestricted grant should not trump the 
more specific, restricted one and allow MP queries on that table.
In the reverse scenario, where the restriction is at the keyspace level but 
there is also an unrestricted grant for the table, the behaviour should be 
reversed: user x can make MP queries on that one table, but on no others in the 
keyspace.

As noted, managing all this at the user level could be overly burdensome but 
CASSANDRA-7653 should relieve most of that extra administrative load. However, 
it would involve some additional complexity in handling resolution of 
permissions and restrictions.

During the authz process, when we resolve permissions for a user on a 
particular resource, the most specific permission, whether granted directly or 
inherited through role membership, should apply, along with any of its 
restrictions. If there are multiple inherited or direct grants for the exact 
same resource (i.e. at the same level in the resource hierarchy) then we would 
merge them, unioning any restrictions, e.g. (a rough sketch of this merge rule 
follows the examples below):

{noformat}
role x is granted SELECT on ks1 with MULTI_PARTITION restriction
role y is granted SELECT on ks1.cf1, unrestricted
role y should be allowed to SELECT from any table in ks1, but perform MP 
queries on ks1.cf1 only 

role z is granted both role x and role y
role z should be allowed to SELECT from any table in ks1, but perform MP 
queries on ks1.cf1 only 

role a is granted SELECT on ks1, unrestricted
role z is granted roles x, y, a
role z should be allowed to SELECT from any table in ks1, but perform MP 
queries on ks1.cf1 only 
{noformat}
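
To make the merge rule concrete, a minimal sketch (all names here are 
hypothetical; nothing like this exists in the codebase yet):

{code}
// Hypothetical sketch of the rule above: grants for the exact same resource
// (same level of the hierarchy) merge by unioning their restrictions, while a
// grant on a more specific resource (e.g. ks1.cf1 vs ks1) wins outright.
import java.util.EnumSet;
import java.util.Set;

enum Restriction { MULTI_PARTITION, ALLOW_FILTERING, SECONDARY_INDEX }

final class ResolvedGrant
{
    final Set<Restriction> restrictions;

    ResolvedGrant(Set<Restriction> restrictions)
    {
        this.restrictions = restrictions.isEmpty()
                          ? EnumSet.noneOf(Restriction.class)
                          : EnumSet.copyOf(restrictions);
    }

    static ResolvedGrant unrestricted()
    {
        return new ResolvedGrant(EnumSet.noneOf(Restriction.class));
    }

    // Same resource, same level in the hierarchy: union the restrictions, so a
    // restricted grant is not silently widened by an unrestricted one.
    ResolvedGrant mergeSameResource(ResolvedGrant other)
    {
        Set<Restriction> merged = EnumSet.noneOf(Restriction.class);
        merged.addAll(this.restrictions);
        merged.addAll(other.restrictions);
        return new ResolvedGrant(merged);
    }
}
{code}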

So, I think this could fit nicely into authz for 3.0 but we may also want to 
think about adding options to cassandra.yaml to enforce the same restrictions 
without requiring authz (& authn). Obviously, that would be much more of a 
blunt instrument - enforcing a given restriction for all queries when enabled, 
though it would mean that it could be turned on on a per-node/per-DC basis. I 
don't see a problem with supporting both cql and yaml based approaches so long 
as we define an order of precedence.

> Provide "strict mode" for CQL Queries
> -
>
> Key: CASSANDRA-8303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8303
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Anupam Arora
> Fix For: 3.0
>
>
> Please provide a "strict mode" option in cassandra that will kick out any CQL 
> queries that are expensive, e.g. any query with ALLOWS FILTERING, 
> multi-partition queries, secondary index queries, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-6983) DirectoriesTest fails when run as root

2015-01-06 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-6983:

Reviewer: Yuki Morishita
Assignee: Alan Boudreault  (was: Ryan McGuire)

> DirectoriesTest fails when run as root
> --
>
> Key: CASSANDRA-6983
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6983
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Brandon Williams
>Assignee: Alan Boudreault
>Priority: Minor
> Fix For: 2.0.12
>
> Attachments: 6983-v1.patch
>
>
> When you run the DirectoriesTest as a normal user, it passes because it fails 
> to create the 'bad' directory:
> {noformat}
> [junit] - Standard Error -
> [junit] ERROR 16:16:18,111 Failed to create 
> /tmp/cassandra4119802552776680052unittest/ks/bad directory
> [junit]  WARN 16:16:18,112 Blacklisting 
> /tmp/cassandra4119802552776680052unittest/ks/bad for writes
> [junit] -  ---
> {noformat}
> But when you run the test as root, it succeeds in making the directory, 
> causing an assertion failure that it's unwritable:
> {noformat}
> [junit] Testcase: 
> testDiskFailurePolicy_best_effort(org.apache.cassandra.db.DirectoriesTest):   
> FAILED
> [junit] 
> [junit] junit.framework.AssertionFailedError: 
> [junit] at 
> org.apache.cassandra.db.DirectoriesTest.testDiskFailurePolicy_best_effort(DirectoriesTest.java:199)
> {noformat}
> It seems to me that we shouldn't be relying on failing to make the 
> directory.  If we're just going to test a nonexistent dir, why try to make 
> one at all?  And if that is supposed to succeed, then we have a problem with 
> either the test or blacklisting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-8558:

Priority: Blocker  (was: Major)

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Philip Thompson
>Priority: Blocker
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/3] cassandra git commit: Use lexical comparison for java revisions

2015-01-06 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 1af8ea5ef -> e83304ddf
  refs/heads/trunk dc0102a8e -> 0f024a619


Use lexical comparison for java revisions

Patch by Michael Shuler, reviewed by brandonwilliams for CASSANDRA-8315


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e83304dd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e83304dd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e83304dd

Branch: refs/heads/cassandra-2.1
Commit: e83304ddfee1377417b334d393c61671895d8ed5
Parents: 1af8ea5
Author: Brandon Williams 
Authored: Tue Jan 6 08:24:48 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 08:24:48 2015 -0600

--
 conf/cassandra-env.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e83304dd/conf/cassandra-env.sh
--
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index 191fb7e..755f962 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -99,7 +99,7 @@ if [ "$JVM_VERSION" \< "1.7" ] ; then
 exit 1;
 fi
 
-if [ "$JVM_VERSION" \< "1.8" ] && [ "$JVM_PATCH_VERSION" -lt "25" ] ; then
+if [ "$JVM_VERSION" \< "1.8" ] && [ "$JVM_PATCH_VERSION" \< "25" ] ; then
 echo "Cassandra 2.0 and later require Java 7u25 or later."
 exit 1;
 fi



[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-01-06 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0f024a61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0f024a61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0f024a61

Branch: refs/heads/trunk
Commit: 0f024a6192f185b93439ae53a69ce113b363
Parents: dc0102a e83304d
Author: Brandon Williams 
Authored: Tue Jan 6 08:25:56 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 08:25:56 2015 -0600

--
 conf/cassandra-env.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0f024a61/conf/cassandra-env.sh
--



[2/3] cassandra git commit: Use lexical comparison for java revisions

2015-01-06 Thread brandonwilliams
Use lexical comparison for java revisions

Patch by Michael Shuler, reviewed by brandonwilliams for CASSANDRA-8315


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e83304dd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e83304dd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e83304dd

Branch: refs/heads/trunk
Commit: e83304ddfee1377417b334d393c61671895d8ed5
Parents: 1af8ea5
Author: Brandon Williams 
Authored: Tue Jan 6 08:24:48 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 08:24:48 2015 -0600

--
 conf/cassandra-env.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e83304dd/conf/cassandra-env.sh
--
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index 191fb7e..755f962 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -99,7 +99,7 @@ if [ "$JVM_VERSION" \< "1.7" ] ; then
 exit 1;
 fi
 
-if [ "$JVM_VERSION" \< "1.8" ] && [ "$JVM_PATCH_VERSION" -lt "25" ] ; then
+if [ "$JVM_VERSION" \< "1.8" ] && [ "$JVM_PATCH_VERSION" \< "25" ] ; then
 echo "Cassandra 2.0 and later require Java 7u25 or later."
 exit 1;
 fi



[jira] [Commented] (CASSANDRA-7653) Add role based access control to Cassandra

2015-01-06 Thread Mike Adamson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266182#comment-14266182
 ] 

Mike Adamson commented on CASSANDRA-7653:
-

I have a couple of initial thoughts on this. 
# Is the IAuthenticator.constructInitialSaslToken method really necessary? The 
only usage of this is from login methods that are only going to use the plain 
text sasl implementation offered by the PasswordAuthenticator so they could 
build the initial token themselves.
# Is there any way of not having the Option enum? This fixes the options that 
an Authenticator can support and doesn't allow any 3rd party to have different 
options but still work with the CQL grammar. Could we have some similar to the 
replication strategies? Or perhaps keep the Option enum but have an option of 
EXTENSION (or other name) that would allow the passing in of a json set of 
extension options.

> Add role based access control to Cassandra
> --
>
> Key: CASSANDRA-7653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7653
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Mike Adamson
>Assignee: Sam Tunnicliffe
> Fix For: 3.0
>
> Attachments: 7653.patch, CQLSmokeTest.java, cql_smoke_test.py
>
>
> The current authentication model supports granting permissions to individual 
> users. While this is OK for small or medium organizations wanting to 
> implement authorization, it does not work well in large organizations because 
> of the overhead of having to maintain the permissions for each user.
> Introducing roles into the authentication model would allow sets of 
> permissions to be controlled in one place as a role and then the role granted 
> to users. Roles should also be able to be granted to other roles to allow 
> hierarchical sets of permissions to be built up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8448) "Comparison method violates its general contract" in AbstractEndpointSnitch

2015-01-06 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266191#comment-14266191
 ] 

Brandon Williams commented on CASSANDRA-8448:
-

ISTM we already do that:

{noformat}
ArrayList<Double> sortedScores = new ArrayList<>(subsnitchOrderedScores);
Collections.sort(sortedScores);

Iterator<Double> sortedScoreIterator = sortedScores.iterator();
for (Double subsnitchScore : subsnitchOrderedScores)
{
if (subsnitchScore > (sortedScoreIterator.next() * (1.0 + 
BADNESS_THRESHOLD)))
{
sortByProximityWithScore(address, addresses);
return;
}
}
{noformat}

So I'm not sure what to do (and have no clear repro steps, so I can't even 
try). A simple workaround would be disabling badness, though.
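
For reference, the failure mode is easy to reproduce in isolation whenever the 
values being compared can change mid-sort; snapshotting them first keeps 
TimSort's contract intact (toy sketch only, not a proposed fix for the snitch):

{code}
// Toy illustration: if the comparator reads scores that another thread mutates
// while Collections.sort() is running, compare() can become inconsistent and
// TimSort throws "Comparison method violates its general contract!". Copying
// the scores into a point-in-time snapshot first avoids that.
import java.net.InetAddress;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SnapshotSortSketch
{
    static void sortByScore(List<InetAddress> addresses, Map<InetAddress, Double> liveScores)
    {
        // Every compare() call sees the same values for the whole sort.
        final Map<InetAddress, Double> snapshot = new HashMap<>(liveScores);
        Collections.sort(addresses, new Comparator<InetAddress>()
        {
            public int compare(InetAddress a, InetAddress b)
            {
                double sa = snapshot.containsKey(a) ? snapshot.get(a) : Double.MAX_VALUE;
                double sb = snapshot.containsKey(b) ? snapshot.get(b) : Double.MAX_VALUE;
                return Double.compare(sa, sb);
            }
        });
    }
}
{code}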

> "Comparison method violates its general contract" in AbstractEndpointSnitch
> ---
>
> Key: CASSANDRA-8448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8448
> Project: Cassandra
>  Issue Type: Bug
>Reporter: J.B. Langston
>Assignee: Brandon Williams
>
> Seen in both 1.2 and 2.0.  The error is occurring here: 
> https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/locator/AbstractEndpointSnitch.java#L49
> {code}
> ERROR [Thrift:9] 2014-12-04 20:12:28,732 CustomTThreadPoolServer.java (line 
> 219) Error occurred during processing of message.
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3936)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4806)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:352)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:224)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:218)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:202)
>   at 
> org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:822)
>   at 
> org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:954)
>   at com.datastax.bdp.server.DseServer.batch_mutate(DseServer.java:576)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3922)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3906)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Comparison method violates its 
> general contract!
>   at java.util.TimSort.mergeHi(TimSort.java:868)
>   at java.util.TimSort.mergeAt(TimSort.java:485)
>   at java.util.TimSort.mergeCollapse(TimSort.java:410)
>   at java.util.TimSort.sort(TimSort.java:214)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:157)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:186)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:151)
>   at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1408)
>   at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1402)
>   at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:148)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1223)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1165)
>   at 
> org.apache.cassandra.cq

[jira] [Updated] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8558:
---
Assignee: Sylvain Lebresne  (was: Philip Thompson)

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266195#comment-14266195
 ] 

Philip Thompson commented on CASSANDRA-8558:


Thanks for the better test. The new bisect was much cleaner; the only failure I 
saw was the expected one highlighted by the reporter. Here is the bisect output:

{code}
362cc05352ec67e707e0ac790732e96a15e63f6b is the first bad commit
commit 362cc05352ec67e707e0ac790732e96a15e63f6b
Author: Sylvain Lebresne 
Date:   Tue Oct 29 11:03:52 2013 +0100

Push composites support in the storage engine

patch by slebresne; reviewed by benedict for CASSANDRA-5417

:100644 100644 c0d0f0d243f0454c1b1926957634bb52165295aa 
dcc7e33b064e52d245d8c6ba5de887c74e0f0a00 M  CHANGES.txt
:04 04 da620b5b36e8ba6b97cdb8efe4f690692a85e15e 
5dd1113af54ad7b90dd8694e462483bac6e7f985 M  src
:04 04 1fe1db9fca3bf03946828cdc917a0aef5bf7a9e1 
3bf0563361febc821a357cf61b5a0760f5a8eca4 M  test
{code}

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Philip Thompson
>Priority: Blocker
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8558:
---
Tester: Philip Thompson

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8558) deleted row still can be selected out

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266220#comment-14266220
 ] 

Benedict commented on CASSANDRA-8558:
-

I had a quick debug, and it appears EOCs are the problem: the range tombstone 
has an EOC of START, but the SliceQuery has an EOC of NONE, and the 
SimpleBlockFetcher skips over every atom <= the slice start. But the name of a 
range tombstone is only its start, so this is considered prior to the slice's 
start, even though it stretches across the slice's range.
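
As a toy model of that ordering (illustration only; the real Composite/EOC 
comparison is considerably more involved), with equal clustering values a START 
bound sorts before a NONE bound, which is why the tombstone falls on the 
skipped side of the slice start:

{code}
// Toy model only, not the real code: compare by clustering value first, then by
// an end-of-component marker (START < NONE < END). The range tombstone is named
// by its start bound, so for the same clustering value it compares before the
// slice start, and a fetcher that skips every atom <= the slice start drops it.
final class EocToyModel
{
    static final int START = -1, NONE = 0, END = 1;

    static int compare(int valueA, int eocA, int valueB, int eocB)
    {
        int cmp = Integer.compare(valueA, valueB);
        return cmp != 0 ? cmp : Integer.compare(eocA, eocB);
    }

    public static void main(String[] args)
    {
        // tombstone start (b=1, START) vs slice start (b=1, NONE) => negative
        System.out.println(compare(1, START, 1, NONE));
    }
}
{code}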

> deleted row still can be selected out
> -
>
> Key: CASSANDRA-8558
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8558
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.1.2 
> java version "1.7.0_55"
>Reporter: zhaoyan
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 2.1.3
>
>
> first
> {code}CREATE  KEYSPACE space1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};
> CREATE  TABLE space1.table3(a int, b int, c text,primary key(a,b));
> CREATE  KEYSPACE space2 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 3};{code}
> second
> {code}CREATE  TABLE space2.table1(a int, b int, c int, primary key(a,b));
> CREATE  TABLE space2.table2(a int, b int, c int, primary key(a,b));
> INSERT INTO space1.table3(a,b,c) VALUES(1,1,'1');
> drop table space2.table1;
> DELETE FROM space1.table3 where a=1 and b=1;
> drop table space2.table2;
> select * from space1.table3 where a=1 and b=1;{code}
> you will find that the row (a=1 and b=1)  in space1.table3 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6246) EPaxos

2015-01-06 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266223#comment-14266223
 ] 

Blake Eggleston commented on CASSANDRA-6246:


By using the existing epaxos ordering constraints. Incrementing the epoch is 
done by an instance which takes all unacknowledged instances as dependencies 
for the token range it's incrementing the epoch for. The epoch can only be 
incremented if all previous instances have also been executed.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that requires (1) less messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-6246) EPaxos

2015-01-06 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266223#comment-14266223
 ] 

Blake Eggleston edited comment on CASSANDRA-6246 at 1/6/15 3:11 PM:


By using the existing epaxos ordering constraints. Incrementing the epoch is 
done by an instance which takes all unacknowledged instances as dependencies 
for the token range it's incrementing the epoch for. The epoch can only be 
incremented if all previous instances have also been executed. 

I pushed up some commits that add the epoch functionality yesterday if you'd 
like to take a look: 
https://github.com/bdeggleston/cassandra/tree/CASSANDRA-6246


was (Author: bdeggleston):
By using the existing epaxos ordering constraints. Incrementing the epoch is 
done by an instance which takes all unacknowledged instances as dependencies 
for the token range it's incrementing the epoch for. The epoch can only be 
incremented if all previous instances have also been executed.

> EPaxos
> --
>
> Key: CASSANDRA-6246
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6246
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Blake Eggleston
>Priority: Minor
>
> One reason we haven't optimized our Paxos implementation with Multi-paxos is 
> that Multi-paxos requires leader election and hence, a period of 
> unavailability when the leader dies.
> EPaxos is a Paxos variant that requires (1) less messages than multi-paxos, 
> (2) is particularly useful across multiple datacenters, and (3) allows any 
> node to act as coordinator: 
> http://sigops.org/sosp/sosp13/papers/p358-moraru.pdf
> However, there is substantial additional complexity involved if we choose to 
> implement it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2015-01-06 Thread Lex Lythius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266240#comment-14266240
 ] 

Lex Lythius commented on CASSANDRA-3025:


@Michael Yes, I've checked nearly all the drivers listed on that 
PlanetCassandra page.

While Cassandra JIRA != DataStax, there is considerable overlap between Apache 
and DataStax regarding Cassandra. My goal is to get some indication from either 
of the main driving forces behind Cassandra.

@Alex's reply sheds some light on this matter: it will be a C++-built wrapper 
around the official C++ driver, which is native protocol-based. Will it be 
PDO-based as well? (with several C*-specific added features, to be sure). The 
one I've been contributing to is, but it uses the deprecated Thrift interface 
and, apart from a few bugs, it has no support for user-defined types and tuples.

I will be happy to contribute, in my pretty modest capacity as a C++ developer, 
to a C* PHP driver -- I would just like to know I'm working in the right 
direction.

If this JIRA is not the right place to bring the issue to the table, where 
would that be?


> PHP/PDO driver for Cassandra CQL
> 
>
> Key: CASSANDRA-3025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3025
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API
>Reporter: Mikko Koppanen
>Assignee: Mikko Koppanen
>  Labels: php
> Attachments: pdo_cassandra-0.1.0.tgz, pdo_cassandra-0.1.1.tgz, 
> pdo_cassandra-0.1.2.tgz, pdo_cassandra-0.1.3.tgz, pdo_cassandra-0.2.0.tgz, 
> pdo_cassandra-0.2.1.tgz, php_test_results_20110818_2317.txt
>
>
> Hello,
> attached is the initial version of the PDO driver for Cassandra CQL language. 
> This is a native PHP extension written in what I would call a combination of 
> C and C++, due to PHP being C. The thrift API used is the C++.
> The API looks roughly following:
> {code}
>  $db = new PDO('cassandra:host=127.0.0.1;port=9160');
> $db->exec ("CREATE KEYSPACE mytest with strategy_class = 'SimpleStrategy' and 
> strategy_options:replication_factor=1;");
> $db->exec ("USE mytest");
> $db->exec ("CREATE COLUMNFAMILY users (
>   my_key varchar PRIMARY KEY,
>   full_name varchar );");
>   
> $stmt = $db->prepare ("INSERT INTO users (my_key, full_name) VALUES (:key, 
> :full_name);");
> $stmt->execute (array (':key' => 'mikko', ':full_name' => 'Mikko K' ));
> {code}
> Currently prepared statements are emulated on the client side but I 
> understand that there is a plan to add prepared statements to Cassandra CQL 
> API as well. I will add this feature in to the extension as soon as they are 
> implemented.
> Additional documentation can be found in github 
> https://github.com/mkoppanen/php-pdo_cassandra, in the form of rendered 
> MarkDown file. Tests are currently not included in the package file and they 
> can be found in the github for now as well.
> I have created documentation in docbook format as well, but have not yet 
> rendered it.
> Comments and feedback are welcome.
> Thanks,
> Mikko



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[6/6] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-01-06 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ffb7f649
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ffb7f649
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ffb7f649

Branch: refs/heads/trunk
Commit: ffb7f6491d905cd2edaba3a710a1c0bb1f1bf797
Parents: 0f024a6 136042e
Author: Brandon Williams 
Authored: Tue Jan 6 09:31:16 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:31:16 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ffb7f649/src/java/org/apache/cassandra/gms/FailureDetector.java
--



[5/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2015-01-06 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/136042ec
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/136042ec
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/136042ec

Branch: refs/heads/cassandra-2.1
Commit: 136042ec3985bbacfcf55333ef5668779d2744fe
Parents: e83304d eb9c5bb
Author: Brandon Williams 
Authored: Tue Jan 6 09:31:05 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:31:05 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/136042ec/src/java/org/apache/cassandra/gms/FailureDetector.java
--



[3/6] cassandra git commit: Improve FD logging when the arrival time is ignored.

2015-01-06 Thread brandonwilliams
Improve FD logging when the arrival time is ignored.

Patch by Brandon Williams for CASSANDRA-8245


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb9c5bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb9c5bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb9c5bbc

Branch: refs/heads/trunk
Commit: eb9c5bbcfdaf498967052e02af65fa9f8167887d
Parents: a223082
Author: Brandon Williams 
Authored: Tue Jan 6 09:30:07 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:30:07 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/src/java/org/apache/cassandra/gms/FailureDetector.java
--
diff --git a/src/java/org/apache/cassandra/gms/FailureDetector.java 
b/src/java/org/apache/cassandra/gms/FailureDetector.java
index 60729d3..e247e48 100644
--- a/src/java/org/apache/cassandra/gms/FailureDetector.java
+++ b/src/java/org/apache/cassandra/gms/FailureDetector.java
@@ -211,12 +211,12 @@ public class FailureDetector implements IFailureDetector, 
FailureDetectorMBean
 {
 // avoid adding an empty ArrivalWindow to the Map
 heartbeatWindow = new ArrivalWindow(SAMPLE_SIZE);
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 arrivalSamples.put(ep, heartbeatWindow);
 }
 else
 {
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 }
 }
 
@@ -326,7 +326,7 @@ class ArrivalWindow
 }
 }
 
-synchronized void add(long value)
+synchronized void add(long value, InetAddress ep)
 {
 assert tLast >= 0;
 if (tLast > 0L)
@@ -335,7 +335,7 @@ class ArrivalWindow
 if (interArrivalTime <= MAX_INTERVAL_IN_NANO)
 arrivalIntervals.add(interArrivalTime);
 else
-logger.debug("Ignoring interval time of {}", interArrivalTime);
+logger.debug("Ignoring interval time of {} for {}", 
interArrivalTime, ep);
 }
 else
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java 
b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
index e678d86..511511b 100644
--- a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
+++ b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
@@ -25,6 +25,10 @@ import static org.junit.Assert.*;
 
 import org.junit.Test;
 
+import java.lang.RuntimeException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+
 public class ArrivalWindowTest
 {
 @Test
@@ -32,12 +36,20 @@ public class ArrivalWindowTest
 {
 final ArrivalWindow windowWithNano = new ArrivalWindow(4);
 final long toNano = 100L;
-
-windowWithNano.add(111 * toNano);
-windowWithNano.add(222 * toNano);
-windowWithNano.add(333 * toNano);
-windowWithNano.add(444 * toNano);
-windowWithNano.add(555 * toNano);
+InetAddress ep;
+try
+{
+ep = InetAddress.getLocalHost();
+}
+catch (UnknownHostException e)
+{
+throw new RuntimeException(e);
+}
+windowWithNano.add(111 * toNano, ep);
+windowWithNano.add(222 * toNano, ep);
+windowWithNano.add(333 * toNano, ep);
+windowWithNano.add(444 * toNano, ep);
+windowWithNano.add(555 * toNano, ep);
 
 //all good
 assertEquals(1.0, windowWithNano.phi(666 * toNano), 0.01);



[2/6] cassandra git commit: Improve FD logging when the arrival time is ignored.

2015-01-06 Thread brandonwilliams
Improve FD logging when the arrival time is ignored.

Patch by Brandon Williams for CASSANDRA-8245


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb9c5bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb9c5bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb9c5bbc

Branch: refs/heads/cassandra-2.1
Commit: eb9c5bbcfdaf498967052e02af65fa9f8167887d
Parents: a223082
Author: Brandon Williams 
Authored: Tue Jan 6 09:30:07 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:30:07 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/src/java/org/apache/cassandra/gms/FailureDetector.java
--
diff --git a/src/java/org/apache/cassandra/gms/FailureDetector.java 
b/src/java/org/apache/cassandra/gms/FailureDetector.java
index 60729d3..e247e48 100644
--- a/src/java/org/apache/cassandra/gms/FailureDetector.java
+++ b/src/java/org/apache/cassandra/gms/FailureDetector.java
@@ -211,12 +211,12 @@ public class FailureDetector implements IFailureDetector, 
FailureDetectorMBean
 {
 // avoid adding an empty ArrivalWindow to the Map
 heartbeatWindow = new ArrivalWindow(SAMPLE_SIZE);
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 arrivalSamples.put(ep, heartbeatWindow);
 }
 else
 {
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 }
 }
 
@@ -326,7 +326,7 @@ class ArrivalWindow
 }
 }
 
-synchronized void add(long value)
+synchronized void add(long value, InetAddress ep)
 {
 assert tLast >= 0;
 if (tLast > 0L)
@@ -335,7 +335,7 @@ class ArrivalWindow
 if (interArrivalTime <= MAX_INTERVAL_IN_NANO)
 arrivalIntervals.add(interArrivalTime);
 else
-logger.debug("Ignoring interval time of {}", interArrivalTime);
+logger.debug("Ignoring interval time of {} for {}", 
interArrivalTime, ep);
 }
 else
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java 
b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
index e678d86..511511b 100644
--- a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
+++ b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
@@ -25,6 +25,10 @@ import static org.junit.Assert.*;
 
 import org.junit.Test;
 
+import java.lang.RuntimeException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+
 public class ArrivalWindowTest
 {
 @Test
@@ -32,12 +36,20 @@ public class ArrivalWindowTest
 {
 final ArrivalWindow windowWithNano = new ArrivalWindow(4);
 final long toNano = 100L;
-
-windowWithNano.add(111 * toNano);
-windowWithNano.add(222 * toNano);
-windowWithNano.add(333 * toNano);
-windowWithNano.add(444 * toNano);
-windowWithNano.add(555 * toNano);
+InetAddress ep;
+try
+{
+ep = InetAddress.getLocalHost();
+}
+catch (UnknownHostException e)
+{
+throw new RuntimeException(e);
+}
+windowWithNano.add(111 * toNano, ep);
+windowWithNano.add(222 * toNano, ep);
+windowWithNano.add(333 * toNano, ep);
+windowWithNano.add(444 * toNano, ep);
+windowWithNano.add(555 * toNano, ep);
 
 //all good
 assertEquals(1.0, windowWithNano.phi(666 * toNano), 0.01);



[4/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2015-01-06 Thread brandonwilliams
Merge branch 'cassandra-2.0' into cassandra-2.1


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/136042ec
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/136042ec
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/136042ec

Branch: refs/heads/trunk
Commit: 136042ec3985bbacfcf55333ef5668779d2744fe
Parents: e83304d eb9c5bb
Author: Brandon Williams 
Authored: Tue Jan 6 09:31:05 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:31:05 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/136042ec/src/java/org/apache/cassandra/gms/FailureDetector.java
--



[1/6] cassandra git commit: Improve FD logging when the arrival time is ignored.

2015-01-06 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 a22308294 -> eb9c5bbcf
  refs/heads/cassandra-2.1 e83304ddf -> 136042ec3
  refs/heads/trunk 0f024a619 -> ffb7f6491


Improve FD logging when the arrival time is ignored.

Patch by Brandon Williams for CASSANDRA-8245


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/eb9c5bbc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/eb9c5bbc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/eb9c5bbc

Branch: refs/heads/cassandra-2.0
Commit: eb9c5bbcfdaf498967052e02af65fa9f8167887d
Parents: a223082
Author: Brandon Williams 
Authored: Tue Jan 6 09:30:07 2015 -0600
Committer: Brandon Williams 
Committed: Tue Jan 6 09:30:07 2015 -0600

--
 .../apache/cassandra/gms/FailureDetector.java   |  8 +++
 .../apache/cassandra/gms/ArrivalWindowTest.java | 24 +++-
 2 files changed, 22 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/src/java/org/apache/cassandra/gms/FailureDetector.java
--
diff --git a/src/java/org/apache/cassandra/gms/FailureDetector.java 
b/src/java/org/apache/cassandra/gms/FailureDetector.java
index 60729d3..e247e48 100644
--- a/src/java/org/apache/cassandra/gms/FailureDetector.java
+++ b/src/java/org/apache/cassandra/gms/FailureDetector.java
@@ -211,12 +211,12 @@ public class FailureDetector implements IFailureDetector, 
FailureDetectorMBean
 {
 // avoid adding an empty ArrivalWindow to the Map
 heartbeatWindow = new ArrivalWindow(SAMPLE_SIZE);
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 arrivalSamples.put(ep, heartbeatWindow);
 }
 else
 {
-heartbeatWindow.add(now);
+heartbeatWindow.add(now, ep);
 }
 }
 
@@ -326,7 +326,7 @@ class ArrivalWindow
 }
 }
 
-synchronized void add(long value)
+synchronized void add(long value, InetAddress ep)
 {
 assert tLast >= 0;
 if (tLast > 0L)
@@ -335,7 +335,7 @@ class ArrivalWindow
 if (interArrivalTime <= MAX_INTERVAL_IN_NANO)
 arrivalIntervals.add(interArrivalTime);
 else
-logger.debug("Ignoring interval time of {}", interArrivalTime);
+logger.debug("Ignoring interval time of {} for {}", 
interArrivalTime, ep);
 }
 else
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/eb9c5bbc/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
--
diff --git a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java 
b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
index e678d86..511511b 100644
--- a/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
+++ b/test/unit/org/apache/cassandra/gms/ArrivalWindowTest.java
@@ -25,6 +25,10 @@ import static org.junit.Assert.*;
 
 import org.junit.Test;
 
+import java.lang.RuntimeException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+
 public class ArrivalWindowTest
 {
 @Test
@@ -32,12 +36,20 @@ public class ArrivalWindowTest
 {
 final ArrivalWindow windowWithNano = new ArrivalWindow(4);
 final long toNano = 100L;
-
-windowWithNano.add(111 * toNano);
-windowWithNano.add(222 * toNano);
-windowWithNano.add(333 * toNano);
-windowWithNano.add(444 * toNano);
-windowWithNano.add(555 * toNano);
+InetAddress ep;
+try
+{
+ep = InetAddress.getLocalHost();
+}
+catch (UnknownHostException e)
+{
+throw new RuntimeException(e);
+}
+windowWithNano.add(111 * toNano, ep);
+windowWithNano.add(222 * toNano, ep);
+windowWithNano.add(333 * toNano, ep);
+windowWithNano.add(444 * toNano, ep);
+windowWithNano.add(555 * toNano, ep);
 
 //all good
 assertEquals(1.0, windowWithNano.phi(666 * toNano), 0.01);



[jira] [Resolved] (CASSANDRA-8245) Cassandra nodes periodically die in 2-DC configuration

2015-01-06 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-8245.
-
   Resolution: Fixed
Fix Version/s: 2.1.3
   2.0.12

I added the host to the logging.  Since there's nothing else actionable here, 
I'm resolving this ticket.

> Cassandra nodes periodically die in 2-DC configuration
> --
>
> Key: CASSANDRA-8245
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8245
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Scientific Linux release 6.5
> java version "1.7.0_51"
> Cassandra 2.0.9
>Reporter: Oleg Poleshuk
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 2.0.12, 2.1.3
>
> Attachments: stack1.txt, stack2.txt, stack3.txt, stack4.txt, 
> stack5.txt
>
>
> We have 2 DCs with 3 nodes in each.
> Second DC periodically has 1-2 nodes down.
> Looks like it looses connectivity with another nodes and then Gossiper starts 
> to accumulate tasks until Cassandra dies with OOM.
> WARN [MemoryMeter:1] 2014-08-12 14:34:59,803 Memtable.java (line 470) setting 
> live ratio to maximum of 64.0 instead of Infinity
>  WARN [GossipTasks:1] 2014-08-12 14:44:34,866 Gossiper.java (line 637) Gossip 
> stage has 1 pending tasks; skipping status check (no nodes will be marked 
> down)
>  WARN [GossipTasks:1] 2014-08-12 14:44:35,968 Gossiper.java (line 637) Gossip 
> stage has 4 pending tasks; skipping status check (no nodes will be marked 
> down)
>  WARN [GossipTasks:1] 2014-08-12 14:44:37,070 Gossiper.java (line 637) Gossip 
> stage has 8 pending tasks; skipping status check (no nodes will be marked 
> down)
>  WARN [GossipTasks:1] 2014-08-12 14:44:38,171 Gossiper.java (line 637) Gossip 
> stage has 11 pending tasks; skipping status check (no nodes will be marked 
> down)
> ...
> WARN [GossipTasks:1] 2014-10-06 21:42:51,575 Gossiper.java (line 637) Gossip 
> stage has 1014764 pending tasks; skipping status check (no nodes will be 
> marked down)
>  WARN [New I/O worker #13] 2014-10-06 21:54:27,010 Slf4JLogger.java (line 76) 
> Unexpected exception in the selector loop.
> java.lang.OutOfMemoryError: Java heap space
> Also those lines but not sure it is relevant:
> DEBUG [GossipStage:1] 2014-08-12 11:33:18,801 FailureDetector.java (line 338) 
> Ignoring interval time of 2085963047



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-06 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266284#comment-14266284
 ] 

Marcus Eriksson commented on CASSANDRA-8499:


2.1 v2 seems to double-close the files: first when we switch the writer, then 
when we call abort(). Running SSTableRewriterTest.testNumberOfFiles_abort() 
outputs this: WARN  15:43:17 close(81) failed, errno (9).
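
(Generic illustration of the kind of guard that avoids this, not the actual 
SSTableWriter code: making close() idempotent means the writer switch and a 
later abort() cannot both try to release the same file descriptor, which is 
what turns into the EBADF/errno 9 warning.)

{code}
// Generic sketch, not Cassandra code: an idempotent close wrapper so only the
// first close() actually releases the underlying resource.
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

final class IdempotentCloseable implements Closeable
{
    private final Closeable delegate;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    IdempotentCloseable(Closeable delegate)
    {
        this.delegate = delegate;
    }

    public void close() throws IOException
    {
        // Subsequent calls are no-ops instead of double-closing the descriptor.
        if (closed.compareAndSet(false, true))
            delegate.close();
    }
}
{code}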

> Ensure SSTableWriter cleans up properly after failure
> -
>
> Key: CASSANDRA-8499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.0.12
>
> Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2
>
>
> In 2.0 we do not free a bloom filter, in 2.1 we do not free a small piece of 
> offheap memory for writing compression metadata. In both we attempt to flush 
> the BF despite having encountered an exception, making the exception slow to 
> propagate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8541) References to non-existent/deprecated CqlPagingInputFormat in code

2015-01-06 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266355#comment-14266355
 ] 

Jeremiah Jordan commented on CASSANDRA-8541:


Did you test this? I don't think you can just drop-in replace 
CqlPagingInputFormat with CqlInputFormat.

> References to non-existent/deprecated CqlPagingInputFormat in code
> --
>
> Key: CASSANDRA-8541
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8541
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Rekha Joshi
>Assignee: Rekha Joshi
>  Labels: hadoop
> Fix For: 2.0.12
>
> Attachments: CASSANDRA-8541.txt
>
>
> On Mac 10.9.5, Java 1.7, latest cassandra trunk -
> References to non-existent/deprecated CqlPagingInputFormat in code.
> As per Changes.txt/7570 both CqlPagingInputFormat and CqlPagingRecordReader 
> are removed, but lingering references in WordCount,CqlStorage..



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2015-01-06 Thread Michael Shuler (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-8325:
--
Reproduced In: 2.1.2, 2.1.1  (was: 2.1.1, 2.1.2)
   Tester: Michael Shuler

> Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
> -
>
> Key: CASSANDRA-8325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
> Project: Cassandra
>  Issue Type: Bug
> Environment: FreeBSD 10.0 with openjdk version "1.7.0_71", 64-Bit 
> Server VM
>Reporter: Leonid Shalupov
> Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt, 
> untested_8325.patch
>
>
> See attached error file after JVM crash
> {quote}
> FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
> Jan 16 22:34:59 UTC 2014 
> r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
> {quote}
> {quote}
>  % java -version
> openjdk version "1.7.0_71"
> OpenJDK Runtime Environment (build 1.7.0_71-b14)
> OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2015-01-06 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266402#comment-14266402
 ] 

Ariel Weisberg commented on CASSANDRA-8457:
---

I think I stumbled onto what is going on based on Benedict's suggestion to 
disable TCP no delay. It looks like there is a small-message performance issue.

This is something I have seen before in EC2, where you can only send a 
surprisingly small number of messages in/out of a VM. I don't have the numbers 
from when I micro-benchmarked it, but it is something like 450k messages with 
TCP no delay and a million or so without. Adding more sockets helps, but it 
doesn't even double the number of messages in/out. Throwing more cores at the 
problem doesn't help; you just end up with under-utilized cores, which matches 
the mysterious levels of starvation I was seeing in C* even though I was 
exposing sufficient concurrency.

14 server nodes. 6 client nodes. 500 threads per client. Server started with
"row_cache_size_in_mb" : "2000",
"key_cache_size_in_mb":"500",
"rpc_max_threads" : "1024",
"rpc_min_threads" : "16",
"native_transport_max_threads" : "1024"
8-gig old gen, 2 gig new gen.

Client running CL=ALL and the same schema I have been using throughout this 
ticket.

With no delay off
First set of runs
390264
387958
392322
After replacing 10 instances
366579
365818
378221

No delay on 
162987

Modified trunk to fix a bug in message batching and add a configurable window 
for coalescing multiple messages into a socket; see 
https://github.com/aweisberg/cassandra/compare/f733996...49c6609
||Coalesce window (microseconds)||Throughput||
|250| 502614|
|200| 496206|
|150| 487195|
|100| 423415|
|50| 326648|
|25| 308175|
|12| 292894|
|6| 268456|
|0| 153688|
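
(For concreteness, the coalescing idea in rough form; a sketch, not the code in 
the linked branch:)

{code}
// Rough sketch of application-level coalescing: block for the first outbound
// message, then give further messages up to a small configurable window to join
// the batch, and let the caller write the whole batch to the socket in one go.
// Trades a little latency for far fewer, larger socket writes.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class CoalescingSendLoop
{
    static <M> List<M> nextBatch(BlockingQueue<M> outbound, long windowMicros) throws InterruptedException
    {
        List<M> batch = new ArrayList<>();
        batch.add(outbound.take()); // wait for the first message
        long deadline = System.nanoTime() + TimeUnit.MICROSECONDS.toNanos(windowMicros);
        long remaining;
        while ((remaining = deadline - System.nanoTime()) > 0)
        {
            M next = outbound.poll(remaining, TimeUnit.NANOSECONDS);
            if (next == null)
                break; // window expired with nothing more to send
            batch.add(next);
        }
        return batch; // caller serializes and flushes the batch as one write
    }
}
{code}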

I did not expect to get mileage out of coalescing at the application level, but 
it works extremely well. CPU utilization is still low at 1800%. There seems to 
be less correlation between CPU utilization and throughput as I vary the 
coalescing window and throughput changes dramatically. I do see that core 0 is 
looking pretty saturated and is only 10% idle. That might be the next, or the 
actual, bottleneck.

What role this optimization plays at different cluster sizes is an important 
question. There has to be a tipping point where coalescing stops working 
because not enough packets go to each endpoint at the same time. With vnodes it 
wouldn't be unusual to be communicating with a large number of other hosts, 
right?

It also takes a significant amount of additional latency to get the mileage at 
high levels of throughput, but at lower concurrency there is no benefit and it 
will probably show up as decreased throughput. It makes it tough to crank it up 
as a default. Either it is adaptive or most people don't get the benefit.

At high levels of throughput it is a clear latency win. Latency is much lower 
for individual requests on average. Making this a config option is viable as a 
starting point. Possibly a separate option for local/remote DC coalescing. 
Ideally we could make it adapt to the workload.

I am going to chase down what impact coalescing has at lower levels of 
concurrency so we can quantify the cost of turning it on. I'm also going to try 
to get to the bottom of all the interrupts going to core 0. Maybe it is the 
real problem and coalescing is just a band-aid to get more throughput.

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Ariel Weisberg
>  Labels: performance
> Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8457) nio MessagingService

2015-01-06 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266402#comment-14266402
 ] 

Ariel Weisberg edited comment on CASSANDRA-8457 at 1/6/15 5:05 PM:
---

I think I stumbled onto what is going on based on Benedict's suggestion to 
disable TCP no delay. It looks like there is a small-message performance issue. 

This is something I have seen before in EC2, where you can only send a 
surprisingly small number of messages in/out of a VM. I don't have the numbers 
from when I micro-benchmarked it, but it is something like 450k messages with 
TCP no delay and a million or so without. Adding more sockets helps, but it 
doesn't even double the number of messages in/out. Throwing more cores at the 
problem doesn't help either; you just end up with underutilized cores, which 
matches the mysterious levels of starvation I was seeing in C* even though I 
was exposing sufficient concurrency.

14 server nodes. 6 client nodes. 500 threads per client. Server started with
"row_cache_size_in_mb" : "2000",
"key_cache_size_in_mb":"500",
"rpc_max_threads" : "1024",
"rpc_min_threads" : "16",
"native_transport_max_threads" : "1024"
8-gig old gen, 2 gig new gen.

Client running CL=ALL and the same schema I have been using throughout this 
ticket.

With no delay off
First set of runs
390264
387958
392322
After replacing 10 instances
366579
365818
378221

No delay on 
162987

Modified trunk to fix a bug in message batching and to add a configurable window 
for coalescing multiple messages into a socket; see 
https://github.com/aweisberg/cassandra/compare/f733996...49c6609
||Coalesce window (microseconds)||Throughput||
|250| 502614|
|200| 496206|
|150| 487195|
|100| 423415|
|50| 326648|
|25| 308175|
|12| 292894|
|6| 268456|
|0| 153688|

I did not expect to get mileage out of coalescing at the application level, but 
it works extremely well. CPU utilization is still low at 1800%. There seems to 
be little correlation between CPU utilization and throughput: as I vary the 
coalescing window, throughput changes dramatically. I do see that core 0 is 
looking pretty saturated and is only 10% idle. That might be the next, or the 
actual, bottleneck.

What role this optimization plays at different cluster sizes is an important 
question. There has to be a tipping point where coalescing stops working 
because not enough packets go to each endpoint at the same time. With vnodes 
it wouldn't be unusual to be communicating with a large number of other hosts, 
right?

It also takes a significant amount of additional latency to get the mileage at 
high levels of throughput, while at lower concurrency there is no benefit and it 
will probably show up as decreased throughput. That makes it tough to crank up 
as a default: either it is adaptive or most people don't get the benefit.

At high levels of throughput it is a clear latency win. Latency is much lower 
for individual requests on average. Making this a config option is viable as a 
starting point. Possibly a separate option for local/remote DC coalescing. 
Ideally we could make it adapt to the workload.

I am going to chase down what impact coalescing has at lower levels of 
concurrency so we can quantify the cost of turning it on. I'm also going to try 
to get to the bottom of all interrupts going to core 0. Maybe that is the real 
problem and coalescing is just a band-aid to get more throughput.


was (Author: aweisberg):
I think I stumbled onto what is going on based on Benedict's suggestion to 
disable TCP no delay. It looks like there is a small message performance issue. 

This is something I have seen before in EC2 where you can only send a 
surprisingly small number of messages in/out of a VM. I don't have the numbers 
from when I micro benchmarked it, but it is something like 450k messages with 
TCP no delay and a million or so without. Adding more sockets helps but it 
doesn't  even double the number of messages in/out. Throwing more cores at the 
problem doesn't help you just end up with under utilized cores which matches 
the mysterious levels of starvation I was seeing in C* even though I was 
exposing sufficient concurrency.

14 servers nodes. 6 client nodes. 500 threads per client. Server started with
"row_cache_size_in_mb" : "2000",
"key_cache_size_in_mb":"500",
"rpc_max_threads" : "1024",
"rpc_min_threads" : "16",
"native_transport_max_threads" : "1024"
8-gig old gen, 2 gig new gen.

Client running CL=ALL and the same schema I have been using throughout this 
ticket.

With no delay off
First set of runs
390264
387958
392322
After replacing 10 instances
366579
365818
378221

No delay on 
162987

Modified trunk to fix a bug batching messages and add a configurable window for 
coalescing multiple messages into a socket see 
https://github.com/aweisberg/cassandra/compa

[jira] [Commented] (CASSANDRA-8457) nio MessagingService

2015-01-06 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266422#comment-14266422
 ] 

Ariel Weisberg commented on CASSANDRA-8457:
---

https://forums.aws.amazon.com/thread.jspa?messageID=459260
This looks like a Xen/EC2 issue. I'll bet bare metal never has this issue 
because it has multiple interrupt vectors for NICs. The only workaround is 
using multiple elastic network interfaces and doing something to make that 
practical for intra-cluster communication.

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Ariel Weisberg
>  Labels: performance
> Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7653) Add role based access control to Cassandra

2015-01-06 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266424#comment-14266424
 ] 

Sam Tunnicliffe commented on CASSANDRA-7653:


I've been thinking about constructInitialSaslToken too and it is truly 
unpleasant. It's there to support Thrift clients, which could be sending 
arbitrary k/v pairs to a custom IAuthenticator via the login() call, so we have 
to support that somehow without changing the Thrift interface. 
constructInitialSaslToken is one way to do it, but it sucks, so my current plan 
is to replace it with a legacyAuthenticate() method which implementations can 
decide whether or not to support (depending on whether they support Thrift 
and/or native protocol v1 authentication).

On the second point, I think it's doable to support custom options using JSON 
syntax. Something like:

{code}
CREATE ROLE foo WITH PASSWORD 'bar' AND OPTIONS {'a' : 'aaa', 'b' : 1} 
NOSUPERUSER LOGIN;
{code}
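
Going back to the first point, a rough sketch of the shape legacyAuthenticate() 
could take (hypothetical names and signature, not the final IAuthenticator API):

{code}
import java.util.Map;

// Hypothetical sketch only: an optional hook for the arbitrary k/v credential maps
// that Thrift clients pass via login(). Implementations that don't support Thrift /
// native protocol v1 authentication can simply throw instead of authenticating.
interface LegacyAuthenticationSupport
{
    // Returns the authenticated user name, or throws if legacy authentication
    // is unsupported or the credentials are invalid.
    String legacyAuthenticate(Map<String, String> credentials) throws SecurityException;
}
{code}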


> Add role based access control to Cassandra
> --
>
> Key: CASSANDRA-7653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7653
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Mike Adamson
>Assignee: Sam Tunnicliffe
> Fix For: 3.0
>
> Attachments: 7653.patch, CQLSmokeTest.java, cql_smoke_test.py
>
>
> The current authentication model supports granting permissions to individual 
> users. While this is OK for small or medium organizations wanting to 
> implement authorization, it does not work well in large organizations because 
> of the overhead of having to maintain the permissions for each user.
> Introducing roles into the authentication model would allow sets of 
> permissions to be controlled in one place as a role and then the role granted 
> to users. Roles should also be able to be granted to other roles to allow 
> hierarchical sets of permissions to be built up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8403) limit disregarded when paging with IN clause under certain conditions

2015-01-06 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266425#comment-14266425
 ] 

Benjamin Lerer commented on CASSANDRA-8403:
---

I tried to reproduce the problem on a single node and on a 3-node cluster 
using cqlsh, but without success. 
Could you tell me how many nodes you were using and how you interacted with 
Cassandra?
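
For reference, the repro steps quoted in the description below translate to roughly 
this Java-driver snippet (contact point and keyspace name are assumptions); the 
reported bug is that iterating the result set yields all 30 rows instead of the 20 
allowed by the LIMIT:

{code}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagingLimitRepro
{
    public static void main(String[] args)
    {
        // Assumes a local node and a keyspace "ks" containing the paging_test table
        // from the description, with 10 rows in one partition and 20 in another.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build())
        {
            Session session = cluster.connect("ks");
            Statement stmt = new SimpleStatement("SELECT * FROM paging_test WHERE id IN (1, 2) LIMIT 20");
            stmt.setFetchSize(10); // page_size < limit < data size

            ResultSet rs = session.execute(stmt);
            int rows = 0;
            for (Row ignored : rs)
                rows++;

            System.out.println("Rows returned: " + rows); // expected 20; the report says all 30 come back
        }
    }
}
{code}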

> limit disregarded when paging with IN clause under certain conditions
> -
>
> Key: CASSANDRA-8403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8403
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Benjamin Lerer
>Priority: Minor
>
> This issue was originally reported on the python-driver userlist and 
> confirmed by [~aholmber]
> When:
> page_size < limit < data size,
> the limit value is disregarded and all rows are paged back.
> to repro:
> create a table and populate it with two partitions
> CREATE TABLE paging_test ( id int, value text, PRIMARY KEY (id, value) )
> Add data: in one partition create 10 rows, and in a second partition create 20 
> rows
> perform a query with page_size of 10 and a LIMIT of 20, like so:
> SELECT * FROM paging_test where id in (1,2) LIMIT 20;
> The limit is disregarded and three pages of 10 records each will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7875) Prepared statements using dropped indexes are not handled correctly

2015-01-06 Thread Arindam (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266433#comment-14266433
 ] 

Arindam commented on CASSANDRA-7875:


Hello, 
Just getting the same error in 2.0.9. Are there any steps to resolution, or is 
upgrading the only option?
Thanks

> Prepared statements using dropped indexes are not handled correctly
> ---
>
> Key: CASSANDRA-7875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.1.3
>
> Attachments: repro.py
>
>
> When select statements are prepared, we verify that the column restrictions 
> use indexes (where necessary).  However, we don't perform a similar check 
> when the statement is executed, so it fails somewhere further down the line.  
> In this case, it hits an assertion:
> {noformat}
> java.lang.AssertionError: Sequential scan with filters is not supported (if 
> you just created an index, you need to wait for the creation to be propagated 
> to all nodes before querying it)
>   at 
> org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:259)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1759)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1709)
>   at 
> org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:119)
>   at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1394)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> During execution, we should check that the indexes still exist and provide a 
> better error if they do not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7875) Prepared statements using dropped indexes are not handled correctly

2015-01-06 Thread Arindam (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266433#comment-14266433
 ] 

Arindam edited comment on CASSANDRA-7875 at 1/6/15 5:20 PM:


Hello, 
Just getting the same error in 2.0.9. Are there any steps to resolution, or is 
upgrading the only option?

Note: the index is a custom secondary index and it does exist, as it was created 
quite a while back. The environment is a single-node cluster. 

Thanks
Arindam


was (Author: abose78):
Hello, 
Just getting the same error in 2.0.9. Is there any steps to resolution or 
upgrade is the only option?
Thanks

> Prepared statements using dropped indexes are not handled correctly
> ---
>
> Key: CASSANDRA-7875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.1.3
>
> Attachments: repro.py
>
>
> When select statements are prepared, we verify that the column restrictions 
> use indexes (where necessary).  However, we don't perform a similar check 
> when the statement is executed, so it fails somewhere further down the line.  
> In this case, it hits an assertion:
> {noformat}
> java.lang.AssertionError: Sequential scan with filters is not supported (if 
> you just created an index, you need to wait for the creation to be propagated 
> to all nodes before querying it)
>   at 
> org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:259)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1759)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1709)
>   at 
> org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:119)
>   at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1394)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> During execution, we should check that the indexes still exist and provide a 
> better error if they do not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8457) nio MessagingService

2015-01-06 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266422#comment-14266422
 ] 

Ariel Weisberg edited comment on CASSANDRA-8457 at 1/6/15 5:31 PM:
---

https://forums.aws.amazon.com/thread.jspa?messageID=459260
This looks like a Xen/EC2 issue. I'll bet bare metal never has this issue 
because it has multiple interrupt vectors for NICs. The only workaround is 
using multiple elastic network interfaces and doing something to make that 
practical for intra-cluster communication.

Apparently the RightScale instances I am using don't have enhanced networking. 
To get enhanced networking I need to use VPC. I am not sure if starting 
instances with enhanced networking is possible via RightScale, but I am going 
to find out. I don't know if enhanced networking addresses the interrupt vector 
issue. Will do more digging.


was (Author: aweisberg):
https://forums.aws.amazon.com/thread.jspa?messageID=459260
This looks like a Xen/EC2 issue. I'll bet bare metal never has this issue 
because it has multiple interrupt vectors for NICs. The only work around is 
using multiple elastic network interfaces and doing something to make that 
practical for intra-cluster communication.

> nio MessagingService
> 
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Ariel Weisberg
>  Labels: performance
> Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big 
> contributor to context switching, especially for larger clusters.  Let's look 
> at switching to nio, possibly via Netty.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7875) Prepared statements using dropped indexes are not handled correctly

2015-01-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266463#comment-14266463
 ] 

Aleksey Yeschenko commented on CASSANDRA-7875:
--

Since 2.0 doesn't have CASSANDRA-7923, what you can do is re-prepare the 
statement.
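
A minimal sketch of that workaround with the Java driver (table and column names 
are hypothetical): throw away the old PreparedStatement and prepare the query 
again once the index exists, so the server re-validates it against the current 
schema.

{code}
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Sketch of the suggested 2.0 workaround: re-prepare a statement whose index was
// dropped/re-created instead of reusing the stale PreparedStatement.
class ReprepareExample
{
    private static final String QUERY = "SELECT * FROM ks.t WHERE indexed_col = ?";

    private final Session session;
    private volatile PreparedStatement byIndexedColumn;

    ReprepareExample(Session session)
    {
        this.session = session;
        this.byIndexedColumn = session.prepare(QUERY);
    }

    BoundStatement query(String value)
    {
        return byIndexedColumn.bind(value);
    }

    // Call after the index has been (re)created, or when the old statement starts failing.
    void reprepare()
    {
        byIndexedColumn = session.prepare(QUERY);
    }
}
{code}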

> Prepared statements using dropped indexes are not handled correctly
> ---
>
> Key: CASSANDRA-7875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.1.3
>
> Attachments: repro.py
>
>
> When select statements are prepared, we verify that the column restrictions 
> use indexes (where necessary).  However, we don't perform a similar check 
> when the statement is executed, so it fails somewhere further down the line.  
> In this case, it hits an assertion:
> {noformat}
> java.lang.AssertionError: Sequential scan with filters is not supported (if 
> you just created an index, you need to wait for the creation to be propagated 
> to all nodes before querying it)
>   at 
> org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:259)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1759)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1709)
>   at 
> org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:119)
>   at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1394)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> During execution, we should check that the indexes still exist and provide a 
> better error if they do not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7875) Prepared statements using dropped indexes are not handled correctly

2015-01-06 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266466#comment-14266466
 ] 

Aleksey Yeschenko commented on CASSANDRA-7875:
--

As for this ticket, I'm not sure that just adding an extra runtime check is the 
way to go, or that it would be sufficient.
Since this event (dropping an index) should be very rare in production, we 
should probably just extend the CASSANDRA-7910 logic here and invalidate the 
potentially affected statements.

> Prepared statements using dropped indexes are not handled correctly
> ---
>
> Key: CASSANDRA-7875
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7875
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Tyler Hobbs
>Assignee: Tyler Hobbs
>Priority: Minor
> Fix For: 2.1.3
>
> Attachments: repro.py
>
>
> When select statements are prepared, we verify that the column restrictions 
> use indexes (where necessary).  However, we don't perform a similar check 
> when the statement is executed, so it fails somewhere further down the line.  
> In this case, it hits an assertion:
> {noformat}
> java.lang.AssertionError: Sequential scan with filters is not supported (if 
> you just created an index, you need to wait for the creation to be propagated 
> to all nodes before querying it)
>   at 
> org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:259)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1759)
>   at 
> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1709)
>   at 
> org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:119)
>   at 
> org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1394)
>   at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1936)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:724)
> {noformat}
> During execution, we should check that the indexes still exist and provide a 
> better error if they do not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2015-01-06 Thread Catalin Alexandru Zamfir (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266473#comment-14266473
 ] 

Catalin Alexandru Zamfir commented on CASSANDRA-7139:
-

Our set-up was RAID5, and min(numberOfDisk, numberOfCores) would just be 2, 
even though we have 40+ cores. The commented "concurrent_compactors" would be "2", 
meaning that a lot of SSTables are accumulating in high-cardinality tables 
(where the partition key is a UUID type) because compaction is limited to 
"2". Looking at "dstat", even though we've set compaction_throughput_in_mb_per_sec 
to 192 (spinning disk), the dstat -lrv1 disk write maxes out at 10MB/s.

IMHO, concurrent_compactors should be 
number_of_cores / compaction_throughput_in_mb_per_sec * 100, which in our case (40 
cores) gives around 20/21 compactors, and on 8 cores it gives about 4 concurrent 
compactors (8/192 * 100).
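
Spelling the proposed heuristic out (this just restates the arithmetic above; it is 
not what Cassandra currently does):

{code}
// Proposed heuristic from the comment above:
// concurrent_compactors = number_of_cores / compaction_throughput_mb_per_sec * 100
class ProposedCompactorDefault
{
    static int concurrentCompactors(int cores, int throughputMbPerSec)
    {
        return Math.max(1, Math.round(cores * 100f / throughputMbPerSec));
    }

    public static void main(String[] args)
    {
        System.out.println(concurrentCompactors(40, 192)); // 21, matching the ~20/21 above
        System.out.println(concurrentCompactors(8, 192));  // 4
    }
}
{code}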

> Default concurrent_compactors is probably too high
> --
>
> Key: CASSANDRA-7139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 2.1 rc1
>
> Attachments: 7139.txt
>
>
> The default number of concurrent compactors is probably too high for modern 
> hardware with spinning disks for storage: A modern blade can easily have 24+ 
> Cores, which would result in a default of 24 concurrent compactions. This not 
> only increases random IO, it also keeps around a lot of obsoleted files for 
> an unnecessarily long time, as each compaction keeps references to any 
> possibly overlapping files that it isn't itself compacting - but these can 
> have been obsoleted part way through by compactions that finished earlier. If 
> you factor in the default compaction throughput rate of 16Mb/s, anything but 
> a single default concurrent_compactor makes very little sense, as a single 
> thread should always be able to handle 16Mb/s, will cause less interference 
> with other processes, and permits obsoleted files to be immediately removed.
> See [http://imgur.com/HDqhxFp] for a graph demonstrating the result of making 
> this change on a box with 24-cores and 8Tb of storage (first spike is default 
> settings)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7622) Implement virtual tables

2015-01-06 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7622:
-
Fix Version/s: (was: 3.0)
   3.1

> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
> Fix For: 3.1
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7139) Default concurrent_compactors is probably too high

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266493#comment-14266493
 ] 

Benedict commented on CASSANDRA-7139:
-

This is only the default; we recommend tuning it based on your own system's 
behaviour. With modern SSDs and many cores, many concurrent compactors is a 
great idea. For spinning-disk setups it can be terrible, and we want to avoid 
terrible default decisions.

Either way, I suspect the problem you are encountering is entirely different, 
i.e. that the default _compaction_throughput_mb_per_sec_ is 10, which would be 
why you are maxing out at exactly 10MB/s.

> Default concurrent_compactors is probably too high
> --
>
> Key: CASSANDRA-7139
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7139
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Benedict
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 2.1 rc1
>
> Attachments: 7139.txt
>
>
> The default number of concurrent compactors is probably too high for modern 
> hardware with spinning disks for storage: A modern blade can easily have 24+ 
> Cores, which would result in a default of 24 concurrent compactions. This not 
> only increases random IO, it also keeps around a lot of obsoleted files for 
> an unnecessarily long time, as each compaction keeps references to any 
> possibly overlapping files that it isn't itself compacting - but these can 
> have been obsoleted part way through by compactions that finished earlier. If 
> you factor in the default compaction throughput rate of 16Mb/s, anything but 
> a single default concurrent_compactor makes very little sense, as a single 
> thread should always be able to handle 16Mb/s, will cause less interference 
> with other processes, and permits obsoleted files to be immediately removed.
> See [http://imgur.com/HDqhxFp] for a graph demonstrating the result of making 
> this change on a box with 24-cores and 8Tb of storage (first spike is default 
> settings)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266510#comment-14266510
 ] 

Philip Thompson commented on CASSANDRA-8552:


Yes, I will spin up a small cluster of Ubuntu 14.4 m1.xlarge nodes on EC2.

> Large compactions run out of off-heap RAM
> -
>
> Key: CASSANDRA-8552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8552
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu 14.4 
> AWS EC2
> 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)]
> Java build 1.7.0_55-b13 and build 1.8.0_25-b17
>Reporter: Brent Haines
>Assignee: Benedict
>Priority: Blocker
> Fix For: 2.1.3
>
> Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, fhandles.log, 
> freelog.log, lsof.txt, meminfo.txt, sysctl.txt, system.log
>
>
> We have a large table of storing, effectively event logs and a pair of 
> denormalized tables for indexing.
> When updating from 2.0 to 2.1 we saw performance improvements, but some 
> random and silent crashes during nightly repairs. We lost a node (totally 
> corrupted) and replaced it. That node has never stabilized -- it simply can't 
> finish the compactions. 
> Smaller compactions finish. Larger compactions, like these two never finish - 
> {code}
> pending tasks: 48
>compaction type   keyspace table completed total   
>  unit   progress
> Compaction   data   stories   16532973358   75977993784   
> bytes 21.76%
> Compaction   data   stories_by_text   10593780658   38555048812   
> bytes 27.48%
> Active compaction remaining time :   0h10m51s
> {code}
> We are not getting exceptions and are not running out of heap space. The 
> Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
> We watch memory in the opscenter console and it will grow. If we turn off the 
> OOM killer for the process, it will run until everything else is killed 
> instead and then the kernel panics.
> We have the following settings configured: 
> 2G Heap
> 512M New
> {code}
> memtable_heap_space_in_mb: 1024
> memtable_offheap_space_in_mb: 1024
> memtable_allocation_type: heap_buffers
> commitlog_total_space_in_mb: 2048
> concurrent_compactors: 1
> compaction_throughput_mb_per_sec: 128
> {code}
> The compaction strategy is leveled (these are read-intensive tables that are 
> rarely updated)
> I have tried every setting, every option and I have the system where the MTBF 
> is about an hour now, but we never finish compacting because there are some 
> large compactions pending. None of the GC tools or settings help because it 
> is not a GC problem. It is an off-heap memory problem.
> We are getting these messages in our syslog 
> {code}
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in 
> process java  pte:0320 pmd:2d6fa5067
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 
> vm_flags:0870 anon_vma:  (null) mapping:  (null) 
> index:7fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 
> Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559]  880028510e40 
> 88020d43da98 81715ac4 7fb820be3000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565]  88020d43dae0 
> 81174183 0320 0007fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568]  8802d6fa5f18 
> 0320 7fb820be3000 7fb820be4000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584]  [] 
> dump_stack+0x45/0x56
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591]  [] 
> print_bad_pte+0x1a3/0x250
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594]  [] 
> vm_normal_page+0x69/0x80
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598]  [] 
> unmap_page_range+0x3bb/0x7f0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602]  [] 
> unmap_single_vma+0x81/0xf0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605]  [] 
> unmap_vmas+0x49/0x90
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610]  [] 
> exit_mmap+0x9c/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617]  [] 
> ? __delayacct_add_tsk+0x153/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621]  [] 
> mmput+0x5c/0x120
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625]  [] 
> do_exit+0x26c/0xa50
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219631]  [] 
> ? __unqueue_futex+0x31/0x60
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219634]  [] 
> ? futex_wait+0x126/0x290
> Jan  2 07:06:0

[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266533#comment-14266533
 ] 

Benedict commented on CASSANDRA-8552:
-

Thanks. I would first try with a single node and simply see whether writing, say, 
100G of data to the node is enough to trigger the scenario. The simplest 
explanations shouldn't require a whole cluster; if we fail to reproduce, we can 
try complicating the setup.

> Large compactions run out of off-heap RAM
> -
>
> Key: CASSANDRA-8552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8552
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu 14.4 
> AWS EC2
> 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)]
> Java build 1.7.0_55-b13 and build 1.8.0_25-b17
>Reporter: Brent Haines
>Assignee: Benedict
>Priority: Blocker
> Fix For: 2.1.3
>
> Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, fhandles.log, 
> freelog.log, lsof.txt, meminfo.txt, sysctl.txt, system.log
>
>
> We have a large table of storing, effectively event logs and a pair of 
> denormalized tables for indexing.
> When updating from 2.0 to 2.1 we saw performance improvements, but some 
> random and silent crashes during nightly repairs. We lost a node (totally 
> corrupted) and replaced it. That node has never stabilized -- it simply can't 
> finish the compactions. 
> Smaller compactions finish. Larger compactions, like these two never finish - 
> {code}
> pending tasks: 48
>compaction type   keyspace table completed total   
>  unit   progress
> Compaction   data   stories   16532973358   75977993784   
> bytes 21.76%
> Compaction   data   stories_by_text   10593780658   38555048812   
> bytes 27.48%
> Active compaction remaining time :   0h10m51s
> {code}
> We are not getting exceptions and are not running out of heap space. The 
> Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
> We watch memory in the opscenter console and it will grow. If we turn off the 
> OOM killer for the process, it will run until everything else is killed 
> instead and then the kernel panics.
> We have the following settings configured: 
> 2G Heap
> 512M New
> {code}
> memtable_heap_space_in_mb: 1024
> memtable_offheap_space_in_mb: 1024
> memtable_allocation_type: heap_buffers
> commitlog_total_space_in_mb: 2048
> concurrent_compactors: 1
> compaction_throughput_mb_per_sec: 128
> {code}
> The compaction strategy is leveled (these are read-intensive tables that are 
> rarely updated)
> I have tried every setting, every option and I have the system where the MTBF 
> is about an hour now, but we never finish compacting because there are some 
> large compactions pending. None of the GC tools or settings help because it 
> is not a GC problem. It is an off-heap memory problem.
> We are getting these messages in our syslog 
> {code}
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in 
> process java  pte:0320 pmd:2d6fa5067
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 
> vm_flags:0870 anon_vma:  (null) mapping:  (null) 
> index:7fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 
> Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559]  880028510e40 
> 88020d43da98 81715ac4 7fb820be3000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565]  88020d43dae0 
> 81174183 0320 0007fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568]  8802d6fa5f18 
> 0320 7fb820be3000 7fb820be4000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584]  [] 
> dump_stack+0x45/0x56
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591]  [] 
> print_bad_pte+0x1a3/0x250
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594]  [] 
> vm_normal_page+0x69/0x80
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598]  [] 
> unmap_page_range+0x3bb/0x7f0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602]  [] 
> unmap_single_vma+0x81/0xf0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605]  [] 
> unmap_vmas+0x49/0x90
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610]  [] 
> exit_mmap+0x9c/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617]  [] 
> ? __delayacct_add_tsk+0x153/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621]  [] 
> mmput+0x5c/0x120
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625]  [] 
> do_exit+0x26c/0xa50
> Jan  2 07:06:00 

[jira] [Commented] (CASSANDRA-8303) Provide "strict mode" for CQL Queries

2015-01-06 Thread Jonathan Shook (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266550#comment-14266550
 ] 

Jonathan Shook commented on CASSANDRA-8303:
---

It might be nice if the auth system were always in play (when that auth provider 
is set), with the system defaults applied to a virtual role with a name like 
"defaults". This cleans up any layering questions by casting the yaml defaults 
into the authz conceptual model. If a user isn't assigned to another defined 
role, they should be automatically assigned to the defaults role.

Otherwise, explaining the result of layering them, even with precedence, might 
become overly cumbersome. With it, you can use both.
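
A rough sketch of what that resolution could look like (all names hypothetical; 
this only illustrates the layering idea, not an actual Cassandra API):

{code}
import java.util.Map;

// Hypothetical sketch: the yaml defaults become just another role ("defaults") in the
// authz model, and users without an explicitly assigned role fall back to it.
class StrictModeRoles
{
    static final String DEFAULTS_ROLE = "defaults";

    private final Map<String, QueryRestrictions> restrictionsByRole; // role name -> limits
    private final Map<String, String> roleByUser;                    // user name -> assigned role

    StrictModeRoles(Map<String, QueryRestrictions> restrictionsByRole, Map<String, String> roleByUser)
    {
        this.restrictionsByRole = restrictionsByRole;
        this.roleByUser = roleByUser;
    }

    QueryRestrictions restrictionsFor(String user)
    {
        String role = roleByUser.getOrDefault(user, DEFAULTS_ROLE);
        return restrictionsByRole.getOrDefault(role, restrictionsByRole.get(DEFAULTS_ROLE));
    }

    // Placeholder for whatever "expensive query" knobs strict mode ends up covering.
    static class QueryRestrictions
    {
        final boolean allowFiltering;
        final boolean allowMultiPartitionQueries;

        QueryRestrictions(boolean allowFiltering, boolean allowMultiPartitionQueries)
        {
            this.allowFiltering = allowFiltering;
            this.allowMultiPartitionQueries = allowMultiPartitionQueries;
        }
    }
}
{code}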



> Provide "strict mode" for CQL Queries
> -
>
> Key: CASSANDRA-8303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8303
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Anupam Arora
> Fix For: 3.0
>
>
> Please provide a "strict mode" option in cassandra that will kick out any CQL 
> queries that are expensive, e.g. any query with ALLOWS FILTERING, 
> multi-partition queries, secondary index queries, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8552) Large compactions run out of off-heap RAM

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8552:
---
Reproduced In: 2.1.2, 2.1.1  (was: 2.1.1, 2.1.2)
   Tester: Alan Boudreault

> Large compactions run out of off-heap RAM
> -
>
> Key: CASSANDRA-8552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8552
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu 14.4 
> AWS EC2
> 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)]
> Java build 1.7.0_55-b13 and build 1.8.0_25-b17
>Reporter: Brent Haines
>Assignee: Benedict
>Priority: Blocker
> Fix For: 2.1.3
>
> Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, fhandles.log, 
> freelog.log, lsof.txt, meminfo.txt, sysctl.txt, system.log
>
>
> We have a large table of storing, effectively event logs and a pair of 
> denormalized tables for indexing.
> When updating from 2.0 to 2.1 we saw performance improvements, but some 
> random and silent crashes during nightly repairs. We lost a node (totally 
> corrupted) and replaced it. That node has never stabilized -- it simply can't 
> finish the compactions. 
> Smaller compactions finish. Larger compactions, like these two never finish - 
> {code}
> pending tasks: 48
>compaction type   keyspace table completed total   
>  unit   progress
> Compaction   data   stories   16532973358   75977993784   
> bytes 21.76%
> Compaction   data   stories_by_text   10593780658   38555048812   
> bytes 27.48%
> Active compaction remaining time :   0h10m51s
> {code}
> We are not getting exceptions and are not running out of heap space. The 
> Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
> We watch memory in the opscenter console and it will grow. If we turn off the 
> OOM killer for the process, it will run until everything else is killed 
> instead and then the kernel panics.
> We have the following settings configured: 
> 2G Heap
> 512M New
> {code}
> memtable_heap_space_in_mb: 1024
> memtable_offheap_space_in_mb: 1024
> memtable_allocation_type: heap_buffers
> commitlog_total_space_in_mb: 2048
> concurrent_compactors: 1
> compaction_throughput_mb_per_sec: 128
> {code}
> The compaction strategy is leveled (these are read-intensive tables that are 
> rarely updated)
> I have tried every setting, every option and I have the system where the MTBF 
> is about an hour now, but we never finish compacting because there are some 
> large compactions pending. None of the GC tools or settings help because it 
> is not a GC problem. It is an off-heap memory problem.
> We are getting these messages in our syslog 
> {code}
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in 
> process java  pte:0320 pmd:2d6fa5067
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 
> vm_flags:0870 anon_vma:  (null) mapping:  (null) 
> index:7fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 
> Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559]  880028510e40 
> 88020d43da98 81715ac4 7fb820be3000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565]  88020d43dae0 
> 81174183 0320 0007fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568]  8802d6fa5f18 
> 0320 7fb820be3000 7fb820be4000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584]  [] 
> dump_stack+0x45/0x56
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591]  [] 
> print_bad_pte+0x1a3/0x250
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594]  [] 
> vm_normal_page+0x69/0x80
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598]  [] 
> unmap_page_range+0x3bb/0x7f0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602]  [] 
> unmap_single_vma+0x81/0xf0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605]  [] 
> unmap_vmas+0x49/0x90
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610]  [] 
> exit_mmap+0x9c/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617]  [] 
> ? __delayacct_add_tsk+0x153/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621]  [] 
> mmput+0x5c/0x120
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625]  [] 
> do_exit+0x26c/0xa50
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219631]  [] 
> ? __unqueue_futex+0x31/0x60
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219634]  [] 
> ? futex_wait+0x126/0x290
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219640]  []

[jira] [Updated] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-06 Thread Benedict (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-8499:

Attachment: 8499-21v3

Heh. Looks like we were previously at risk of leaking the directory 
file descriptor instead, if we aborted prior to closing. While the new behaviour was 
benign, a simple modification avoids it.

> Ensure SSTableWriter cleans up properly after failure
> -
>
> Key: CASSANDRA-8499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.0.12
>
> Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3
>
>
> In 2.0 we do not free a bloom filter, in 2.1 we do not free a small piece of 
> offheap memory for writing compression metadata. In both we attempt to flush 
> the BF despite having encountered an exception, making the exception slow to 
> propagate.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8559) OOM caused by large tombstone warning.

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8559:
---
Fix Version/s: 2.0.12

> OOM caused by large tombstone warning.
> --
>
> Key: CASSANDRA-8559
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8559
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: 2.0.11 / 2.1
>Reporter: Dominic Letz
> Fix For: 2.0.12
>
> Attachments: Selection_048.png, stacktrace.log
>
>
> When running with a high amount of tombstones, the error message generation from 
> CASSANDRA-6117 can lead to an out-of-memory situation with the default setting.
> Attached is a heapdump viewed in VisualVM showing how this construct created two 
> 777MB strings to print the error message for a read query and then crashed 
> OOM.
> {code}
> if (respectTombstoneThresholds() && columnCounter.ignored() > 
> DatabaseDescriptor.getTombstoneWarnThreshold())
> {
> StringBuilder sb = new StringBuilder();
> CellNameType type = container.metadata().comparator;
> for (ColumnSlice sl : slices)
> {
> assert sl != null;
> sb.append('[');
> sb.append(type.getString(sl.start));
> sb.append('-');
> sb.append(type.getString(sl.finish));
> sb.append(']');
> }
> logger.warn("Read {} live and {} tombstoned cells in {}.{} (see 
> tombstone_warn_threshold). {} columns was requested, slices={}, delInfo={}",
> columnCounter.live(), columnCounter.ignored(), 
> container.metadata().ksName, container.metadata().cfName, count, sb, 
> container.deletionInfo());
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8566) node crash (while auto-compaction?)

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8566:
---
Reproduced In: 2.1.2
Fix Version/s: 2.1.3

> node crash (while auto-compaction?)
> ---
>
> Key: CASSANDRA-8566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8566
> Project: Cassandra
>  Issue Type: Bug
> Environment: Linux CentOS 6.6 64bit, Cassandra 2.1.2 (release)
>Reporter: Dmitri Dmitrienko
> Fix For: 2.1.3
>
> Attachments: 1.log
>
>
> As data size became 20-24GB/node this issue started happening quite 
> frequently. With 7GB/node I didn't notice any crashes.
> HEAP size was 10GB, now increased to 16GB and it didn't help.
> Log is attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8569) org.apache.cassandra.db.KeyspaceTest failing

2015-01-06 Thread Philip Thompson (JIRA)
Philip Thompson created CASSANDRA-8569:
--

 Summary: org.apache.cassandra.db.KeyspaceTest failing
 Key: CASSANDRA-8569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8569
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
Reporter: Philip Thompson
Assignee: Brandon Williams
 Fix For: 2.1.3


org.apache.cassandra.db.KeyspaceTest began failing after the patch for 
CASSANDRA-8245.

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.db.KeyspaceTest.testGetSliceFromLarge(KeyspaceTest.java:425)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8570) org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge failing

2015-01-06 Thread Philip Thompson (JIRA)
Philip Thompson created CASSANDRA-8570:
--

 Summary: 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge
 failing
 Key: CASSANDRA-8570
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8570
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
Reporter: Philip Thompson
Assignee: Marcus Eriksson
 Fix For: 2.1.3


The patch for 8429 broke the test 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge(CompactionsPurgeTest.java:138)

 Standard Output

ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8570) org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge failing

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8570:
---
Description: 
The patch for CASSANDRA-8429 broke the test 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge(CompactionsPurgeTest.java:138)

 Standard Output

ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
{code}

  was:
The patch for 8429 broke the test 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge(CompactionsPurgeTest.java:138)

 Standard Output

ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
{code}


> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge
>  failing
> 
>
> Key: CASSANDRA-8570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8570
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Philip Thompson
>Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
>
> The patch for CASSANDRA-8429 broke the test 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge(CompactionsPurgeTest.java:138)
>  Standard Output
> ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user 
> defined compaction
> ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user 
> defined compaction
> ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user 
> defined compaction
> ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user 
> defined compaction
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8316) "Did not get positive replies from all endpoints" error on incremental repair

2015-01-06 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266614#comment-14266614
 ] 

Yuki Morishita commented on CASSANDRA-8316:
---

bq. 4. B finishes preparing and marks a bunch of sstables as being repaired

B does not mark sstables as repaired just for receiving the prepare message, 
does it?

I understand that the current issue is that a prepared repair session is left 
on replica nodes when preparing times out on the coordinator.
(In that case, the user can work around it by running "forceTerminateRepairSession" 
manually.)

I prefer sending a cancel message, though adding a new message may be difficult 
in a minor release. Also, we have to make sure the message won't get dropped, 
since AntiEntropyStage may still be busy preparing when the cancel message arrives.

Alternatively, I think the right solution for automatically removing leftover 
sessions is to track repair status as we do in CASSANDRA-5839 and use that to 
determine which prepared sessions can be removed.

Either way, I think we can move this to be resolved in 3.0, unless I have missed 
the severity of the issue.

>  "Did not get positive replies from all endpoints" error on incremental repair
> --
>
> Key: CASSANDRA-8316
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8316
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: cassandra 2.1.2
>Reporter: Loic Lambiel
>Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
> Attachments: 0001-patch.patch, 8316-v2.patch, 
> CassandraDaemon-2014-11-25-2.snapshot.tar.gz, 
> CassandraDaemon-2014-12-14.snapshot.tar.gz, test.sh
>
>
> Hi,
> I've got an issue with incremental repairs on our production 15 nodes 2.1.2 
> (new cluster, not yet loaded, RF=3)
> After having successfully performed an incremental repair (-par -inc) on 3 
> nodes, I started receiving "Repair failed with error Did not get positive 
> replies from all endpoints." from nodetool on all remaining nodes :
> [2014-11-14 09:12:36,488] Starting repair command #3, repairing 108 ranges 
> for keyspace  (seq=false, full=false)
> [2014-11-14 09:12:47,919] Repair failed with error Did not get positive 
> replies from all endpoints.
> All the nodes are up and running and the local system log shows that the 
> repair commands got started and that's it.
> I've also noticed that soon after the repair, several nodes started having 
> more cpu load indefinitely without any particular reason (no tasks / queries, 
> nothing in the logs). I then restarted C* on these nodes and retried the 
> repair on several nodes, which were successful until facing the issue again.
> I tried to repro on our 3 nodes preproduction cluster without success
> It looks like I'm not the only one having this issue: 
> http://www.mail-archive.com/user%40cassandra.apache.org/msg39145.html
> Any idea?
> Thanks
> Loic



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread Bartłomiej Romański (JIRA)
Bartłomiej Romański created CASSANDRA-8571:
--

 Summary: Free space management does not work very well
 Key: CASSANDRA-8571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8571
 Project: Cassandra
  Issue Type: Bug
Reporter: Bartłomiej Romański


Hi all,

We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
(JBODs). We mostly use LCS.

Recently, our nodes started failing with 'no space left on device'. It all 
started with our mistake - we let our LCS accumulate too much in L0.

As a result, STCS woke up and we ended up with some big sstables on each node 
(say 5-10 sstables, 20-50GB each).

During normal operation we keep our disks about 50% full. This gives about 200 
GB of free space on each of them. That was too little for compacting all the 
accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
failing...

Eventually, we managed to stabilize the situation (with some crazy code 
hacking, manually moving sstables, etc.). However, there are a few things that 
would be more than helpful in recovering from such situations more 
automatically... 

First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
a (local) variable, writeSize. I believe we should check somewhere here whether 
we have enough space on the chosen disk. The problem is that writeSize is never 
read... Am I missing something here?

Btw, while in STCS we first look for the least overloaded disk, and then (if 
there are more than one such disks) for the one with the most free space 
(please note the sort order in Directories.getWriteableLocation()). That's 
often suboptimal (it's usually better to wait for the bigger disk than to 
compact fewer sstables now), but probably not crucial.

Second, the strategy (used by LCS) of first choosing a target disk and then 
using it for the whole compaction is not the best one. For big compactions 
(e.g. after some massive operation like bootstrap or repair, or after some 
issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
will never succeed. A much better strategy would be to choose the target drive 
for each output sstable separately, or at least to round-robin them.
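
Roughly what I have in mind, as a sketch only (made-up class, not a patch): 
re-choose the directory every time a new output sstable is opened, skipping 
directories that cannot hold the estimated sstable size:

{code}
import java.io.File;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinDirectorySelector
{
    private final List<File> dataDirectories;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinDirectorySelector(List<File> dataDirectories)
    {
        this.dataDirectories = dataDirectories;
    }

    // called once per output sstable, not once per compaction
    File nextDirectory(long estimatedSstableSize)
    {
        for (int i = 0; i < dataDirectories.size(); i++)
        {
            File dir = dataDirectories.get(Math.floorMod(next.getAndIncrement(), dataDirectories.size()));
            if (dir.getUsableSpace() >= estimatedSstableSize)
                return dir; // enough room for this one sstable
        }
        throw new RuntimeException("no data directory has room for " + estimatedSstableSize + " bytes");
    }
}
{code}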

Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
total space. After the STCS fallback in L0 you end up with very big sstables, 
and 32 of them is just too much for one compaction on small drives.
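
A sketch of that second limit (again purely illustrative, not the actual 
manifest code; sstables are represented here just by their on-disk sizes):

{code}
import java.util.ArrayList;
import java.util.List;

class L0CandidateLimiter
{
    static final int MAX_COMPACTING_L0 = 32;

    // cap L0 candidates by count *and* by total size, so a handful of huge
    // STCS-fallback sstables cannot exceed the free space of a small drive
    static List<Long> limitCandidates(List<Long> l0SstableSizes, long maxTotalBytes)
    {
        List<Long> picked = new ArrayList<>();
        long total = 0;
        for (long size : l0SstableSizes)
        {
            if (picked.size() >= MAX_COMPACTING_L0 || total + size > maxTotalBytes)
                break;
            picked.add(size);
            total += size;
        }
        return picked;
    }
}
{code}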

We finally used a hack similar to the last option (as it was the easiest one to 
implement in a hurry), but any of the improvements described above would save 
us from all this.

Thanks,
BR




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bartłomiej Romański updated CASSANDRA-8571:
---
Description: 
Hi all,

We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
(JBODs). We mostly use LCS.

Recently, our nodes started failing with 'no space left on device'. It all 
started with our mistake - we let our LCS accumulate too much in L0.

As a result, STCS woke up and we ended up with some big sstables on each node 
(let's say 5-10 sstables, 20-50 GB each).

During normal operation we keep our disks about 50% full. This gives about 200 
GB of free space on each of them. This was too little for compacting all 
accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
failing...

Eventually, we managed to stabilize the situation (with some crazy code 
hacking, manually moving sstables, etc.). However, there are a few things that 
would be more than helpful in recovering from such situations more 
automatically... 

First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
a (local) variable, writeSize. I believe we should check somewhere here whether 
we have enough space on the chosen disk. The problem is that writeSize is never 
read... Am I missing something here?

By the way, in STCS we first look for the least overloaded disk, and then (if 
there is more than one such disk) for the one with the most free space 
(please note the sort order in Directories.getWriteableLocation()). That's 
often suboptimal (it's usually better to wait for the bigger disk than to 
compact fewer sstables now), but probably not crucial.

Second, the strategy (used by LCS) of first choosing a target disk and then 
using it for the whole compaction is not the best one. For big compactions 
(e.g. after some massive operation like bootstrap or repair, or after some 
issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
will never succeed. A much better strategy would be to choose the target drive 
for each output sstable separately, or at least to round-robin them.

Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
total space. After the STCS fallback in L0 you end up with very big sstables, 
and 32 of them is just too much for one compaction on small drives.

We finally used a hack similar to the last option (as it was the easiest one to 
implement in a hurry), but any of the improvements described above would save 
us from all this.

Thanks,
BR


  was:
Hi all,

We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
(JBODs). We mostly use LCS.

Recently, our nodes starts failing with 'no space left on device'. It all 
started with our mistake - we let our LCS accumulate too much in L0.

As a result, STCS woke up and we end with some big sstables on each node (let's 
say 5-10 sstables, 20-50gb each).

During normal operation we keep our disks about 50% full. This gives about 200 
GB free space on each of them. This was too little for compacting all 
accumulated L0 sstables at once. Cassandra kept trying to do that and keep 
failing...

Evantually, we managed to stabilized the situation (with some crazy code 
hacking, manually moving sstables etc...). However, there are a few things that 
would be more than helpful in recovering from such situations more 
automatically... 

First, please look at DiskAwareRunnable.runMayThrow(). This methods initiates 
(local) variable: writeSize. I believe we should check somewhere here if we 
have enough space on a chosen disk. The problem is that writeSize is never 
read... Am I missing something here?

Btw, while in STCS we first look for the least overloaded disk, and then (if 
there are more than one such disks) for the one with the most free space 
(please note the sort order in Directories.getWriteableLocation()). That's 
often suboptimal (it's usually better to wait for the bigger disk than to 
compact fewer sstables now), but probably not crucial.

Second, the strategy (used by LCS) that we first choose target disk and then 
use it for whole compaction is not the best one. For big compactions (eg. after 
some massive operations like bootstrap or repair; or after some issues with LCS 
like in our case) on small drives (eg. JBOD of SSDs) these will never succeed. 
Much better strategy would be to choose target drive for each output sstable 
separately, or at least round robin them.

Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
LeveledManifest.getCandidatesFor() would be expanded to support also limit for 
total space. After fallback STCS in L0 you end up with very big sstables an 32 
of them is just too much for one compaction on a small drives.

We finally used some hack similar the last option (as it was the easiest one to 
implement in a hurry), but any improvents

[jira] [Updated] (CASSANDRA-8570) org.apache.cassandra.db.compaction.CompactionsPurgeTest failing

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8570:
---
Description: 
The patch for CASSANDRA-8429 broke the tests 
{{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow}}
 and 
{{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging}}

{code}
junit.framework.AssertionFailedError: 
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow(CompactionsPurgeTest.java:308)
{code}

{code}expected:<0> but was:<1>

 Stack Trace

junit.framework.AssertionFailedError: expected:<0> but was:<1>
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging(CompactionsPurgeTest.java:372)

{code}

  was:
The patch for CASSANDRA-8429 broke the test 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge

{code}
java.lang.NullPointerException
at 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge(CompactionsPurgeTest.java:138)

 Standard Output

ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
ERROR 16:19:44 You can't mix repaired and unrepaired sstables in a user defined 
compaction
{code}

Summary: org.apache.cassandra.db.compaction.CompactionsPurgeTest 
failing  (was: 
org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinorCompactionPurge
 failing)

> org.apache.cassandra.db.compaction.CompactionsPurgeTest failing
> ---
>
> Key: CASSANDRA-8570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8570
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Philip Thompson
>Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
>
> The patch for CASSANDRA-8429 broke the tests 
> {{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow}}
>  and 
> {{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging}}
> {code}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow(CompactionsPurgeTest.java:308)
> {code}
> {code}expected:<0> but was:<1>
>  Stack Trace
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>   at 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging(CompactionsPurgeTest.java:372)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266624#comment-14266624
 ] 

Jeremy Hanna commented on CASSANDRA-8571:
-

See CASSANDRA-8329.  That may be what you're looking for in 2.0.12.  Also 
related is CASSANDRA-7386.

> Free space management does not work very well
> -
>
> Key: CASSANDRA-8571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8571
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bartłomiej Romański
>
> Hi all,
> We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
> (JBODs). We mostly use LCS.
> Recently, our nodes started failing with 'no space left on device'. It all 
> started with our mistake - we let our LCS accumulate too much in L0.
> As a result, STCS woke up and we ended up with some big sstables on each node 
> (let's say 5-10 sstables, 20-50 GB each).
> During normal operation we keep our disks about 50% full. This gives about 
> 200 GB of free space on each of them. This was too little for compacting all 
> accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
> failing...
> Eventually, we managed to stabilize the situation (with some crazy code 
> hacking, manually moving sstables, etc.). However, there are a few things 
> that would be more than helpful in recovering from such situations more 
> automatically... 
> First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
> a (local) variable, writeSize. I believe we should check somewhere here 
> whether we have enough space on the chosen disk. The problem is that 
> writeSize is never read... Am I missing something here?
> By the way, in STCS we first look for the least overloaded disk, and then (if 
> there is more than one such disk) for the one with the most free space 
> (please note the sort order in Directories.getWriteableLocation()). That's 
> often suboptimal (it's usually better to wait for the bigger disk than to 
> compact fewer sstables now), but probably not crucial.
> Second, the strategy (used by LCS) of first choosing a target disk and then 
> using it for the whole compaction is not the best one. For big compactions 
> (e.g. after some massive operation like bootstrap or repair, or after some 
> issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
> will never succeed. A much better strategy would be to choose the target 
> drive for each output sstable separately, or at least to round-robin them.
> Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
> LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
> total space. After the STCS fallback in L0 you end up with very big sstables, 
> and 32 of them is just too much for one compaction on small drives.
> We finally used a hack similar to the last option (as it was the easiest one 
> to implement in a hurry), but any of the improvements described above would 
> save us from all this.
> Thanks,
> BR



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266624#comment-14266624
 ] 

Jeremy Hanna edited comment on CASSANDRA-8571 at 1/6/15 7:43 PM:
-

See CASSANDRA-8329.  That may be what you're looking for in 2.1.3 I believe.  
Also related is CASSANDRA-7386.


was (Author: jeromatron):
See CASSANDRA-8329.  That may be what you're looking for in 2.0.12.  Also 
related is CASSANDRA-7386.

> Free space management does not work very well
> -
>
> Key: CASSANDRA-8571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8571
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bartłomiej Romański
>
> Hi all,
> We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
> (JBODs). We mostly use LCS.
> Recently, our nodes started failing with 'no space left on device'. It all 
> started with our mistake - we let our LCS accumulate too much in L0.
> As a result, STCS woke up and we ended up with some big sstables on each node 
> (let's say 5-10 sstables, 20-50 GB each).
> During normal operation we keep our disks about 50% full. This gives about 
> 200 GB of free space on each of them. This was too little for compacting all 
> accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
> failing...
> Eventually, we managed to stabilize the situation (with some crazy code 
> hacking, manually moving sstables, etc.). However, there are a few things 
> that would be more than helpful in recovering from such situations more 
> automatically... 
> First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
> a (local) variable, writeSize. I believe we should check somewhere here 
> whether we have enough space on the chosen disk. The problem is that 
> writeSize is never read... Am I missing something here?
> By the way, in STCS we first look for the least overloaded disk, and then (if 
> there is more than one such disk) for the one with the most free space 
> (please note the sort order in Directories.getWriteableLocation()). That's 
> often suboptimal (it's usually better to wait for the bigger disk than to 
> compact fewer sstables now), but probably not crucial.
> Second, the strategy (used by LCS) of first choosing a target disk and then 
> using it for the whole compaction is not the best one. For big compactions 
> (e.g. after some massive operation like bootstrap or repair, or after some 
> issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
> will never succeed. A much better strategy would be to choose the target 
> drive for each output sstable separately, or at least to round-robin them.
> Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
> LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
> total space. After the STCS fallback in L0 you end up with very big sstables, 
> and 32 of them is just too much for one compaction on small drives.
> We finally used a hack similar to the last option (as it was the easiest one 
> to implement in a hurry), but any of the improvements described above would 
> save us from all this.
> Thanks,
> BR



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8448) "Comparison method violates its general contract" in AbstractEndpointSnitch

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266626#comment-14266626
 ] 

Benedict commented on CASSANDRA-8448:
-

compareEndpoints() doesn't snapshot; it uses the values from the shared 
"scores" object property, and sortByProximityWithScore ultimately delegates to 
it.
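
For illustration only (made-up class, not the project's actual fix): if each 
sort works against a one-off snapshot of the scores, every comparison within 
that sort sees the same values and the ordering stays transitive, whereas 
comparing against the live, concurrently updated map is what lets TimSort 
observe a contract violation. Java 8 syntax for brevity:

{code}
import java.net.InetAddress;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SnapshotScoreSorter
{
    // shared map, assumed to be updated concurrently by the latency tracker
    final Map<InetAddress, Double> scores = new ConcurrentHashMap<>();

    void sortByProximity(List<InetAddress> endpoints)
    {
        // freeze the scores once per sort; later updates to 'scores' cannot
        // change the ordering seen by this particular sort
        final Map<InetAddress, Double> snapshot = new HashMap<>(scores);
        endpoints.sort(Comparator.comparingDouble((InetAddress ep) -> snapshot.getOrDefault(ep, 0.0)));
    }
}
{code}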


> "Comparison method violates its general contract" in AbstractEndpointSnitch
> ---
>
> Key: CASSANDRA-8448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8448
> Project: Cassandra
>  Issue Type: Bug
>Reporter: J.B. Langston
>Assignee: Brandon Williams
>
> Seen in both 1.2 and 2.0.  The error is occurring here: 
> https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/locator/AbstractEndpointSnitch.java#L49
> {code}
> ERROR [Thrift:9] 2014-12-04 20:12:28,732 CustomTThreadPoolServer.java (line 
> 219) Error occurred during processing of message.
> com.google.common.util.concurrent.UncheckedExecutionException: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
>   at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
>   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3936)
>   at 
> com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4806)
>   at 
> org.apache.cassandra.service.ClientState.authorize(ClientState.java:352)
>   at 
> org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:224)
>   at 
> org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:218)
>   at 
> org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:202)
>   at 
> org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:822)
>   at 
> org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:954)
>   at com.datastax.bdp.server.DseServer.batch_mutate(DseServer.java:576)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3922)
>   at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3906)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Comparison method violates its 
> general contract!
>   at java.util.TimSort.mergeHi(TimSort.java:868)
>   at java.util.TimSort.mergeAt(TimSort.java:485)
>   at java.util.TimSort.mergeCollapse(TimSort.java:410)
>   at java.util.TimSort.sort(TimSort.java:214)
>   at java.util.TimSort.sort(TimSort.java:173)
>   at java.util.Arrays.sort(Arrays.java:659)
>   at java.util.Collections.sort(Collections.java:217)
>   at 
> org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:157)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:186)
>   at 
> org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:151)
>   at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1408)
>   at 
> org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1402)
>   at 
> org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:148)
>   at 
> org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1223)
>   at 
> org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1165)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:255)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:225)
>   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:243)
>   at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
>   at 
> org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
>   at 
> org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:69)
>   at org.a

[jira] [Commented] (CASSANDRA-8403) limit disregarded when paging with IN clause under certain conditions

2015-01-06 Thread Russ Hatch (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266644#comment-14266644
 ] 

Russ Hatch commented on CASSANDRA-8403:
---

[~blerer] I did a bisect of sorts and it looks like that was handled by the 
patch for CASSANDRA-8408.

> limit disregarded when paging with IN clause under certain conditions
> -
>
> Key: CASSANDRA-8403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8403
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Benjamin Lerer
>Priority: Minor
>
> This issue was originally reported on the python-driver userlist and 
> confirmed by [~aholmber]
> When:
> page_size < limit < data size,
> the limit value is disregarded and all rows are paged back.
> to repro:
> create a table and populate it with two partitions
> CREATE TABLE paging_test ( id int, value text, PRIMARY KEY (id, value) )
> Add data: in one partition create 10 rows, and in a second partition create 20 
> rows
> perform a query with page_size of 10 and a LIMIT of 20, like so:
> SELECT * FROM paging_test where id in (1,2) LIMIT 20;
> The limit is disregarded and three pages of 10 records each will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-7622) Implement virtual tables

2015-01-06 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer reassigned CASSANDRA-7622:
-

Assignee: Benjamin Lerer

> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Tupshin Harper
>Assignee: Benjamin Lerer
> Fix For: 3.1
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM

2015-01-06 Thread Brent Haines (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266643#comment-14266643
 ] 

Brent Haines commented on CASSANDRA-8552:
-

Quick update - the node was cleared and has re-joined the cluster. It managed 
to work through the sync and the resulting compactions (more than 500 jobs) 
without failing. This is a good sign, but it also worked the first time around, 
only to fail during the first repair thereafter. 

I have started the repair this morning and we are part of the way through it. I 
will let you know if / when the issue reappears.

> Large compactions run out of off-heap RAM
> -
>
> Key: CASSANDRA-8552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8552
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Ubuntu 14.4 
> AWS EC2
> 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)]
> Java build 1.7.0_55-b13 and build 1.8.0_25-b17
>Reporter: Brent Haines
>Assignee: Benedict
>Priority: Blocker
> Fix For: 2.1.3
>
> Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, fhandles.log, 
> freelog.log, lsof.txt, meminfo.txt, sysctl.txt, system.log
>
>
> We have a large table storing, effectively, event logs, and a pair of 
> denormalized tables for indexing.
> When updating from 2.0 to 2.1 we saw performance improvements, but some 
> random and silent crashes during nightly repairs. We lost a node (totally 
> corrupted) and replaced it. That node has never stabilized -- it simply can't 
> finish the compactions. 
> Smaller compactions finish. Larger compactions, like these two never finish - 
> {code}
> pending tasks: 48
>compaction type   keyspace table completed total   
>  unit   progress
> Compaction   data   stories   16532973358   75977993784   
> bytes 21.76%
> Compaction   data   stories_by_text   10593780658   38555048812   
> bytes 27.48%
> Active compaction remaining time :   0h10m51s
> {code}
> We are not getting exceptions and are not running out of heap space. The 
> Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
> We watch memory in the opscenter console and it will grow. If we turn off the 
> OOM killer for the process, it will run until everything else is killed 
> instead and then the kernel panics.
> We have the following settings configured: 
> 2G Heap
> 512M New
> {code}
> memtable_heap_space_in_mb: 1024
> memtable_offheap_space_in_mb: 1024
> memtable_allocation_type: heap_buffers
> commitlog_total_space_in_mb: 2048
> concurrent_compactors: 1
> compaction_throughput_mb_per_sec: 128
> {code}
> The compaction strategy is leveled (these are read-intensive tables that are 
> rarely updated)
> I have tried every setting and every option, and I have the system to where the 
> MTBF is about an hour now, but we never finish compacting because there are some 
> large compactions pending. None of the GC tools or settings help because it 
> is not a GC problem. It is an off-heap memory problem.
> We are getting these messages in our syslog 
> {code}
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in 
> process java  pte:0320 pmd:2d6fa5067
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 
> vm_flags:0870 anon_vma:  (null) mapping:  (null) 
> index:7fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 
> Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559]  880028510e40 
> 88020d43da98 81715ac4 7fb820be3000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565]  88020d43dae0 
> 81174183 0320 0007fb820be3
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568]  8802d6fa5f18 
> 0320 7fb820be3000 7fb820be4000
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584]  [] 
> dump_stack+0x45/0x56
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591]  [] 
> print_bad_pte+0x1a3/0x250
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594]  [] 
> vm_normal_page+0x69/0x80
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598]  [] 
> unmap_page_range+0x3bb/0x7f0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602]  [] 
> unmap_single_vma+0x81/0xf0
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605]  [] 
> unmap_vmas+0x49/0x90
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610]  [] 
> exit_mmap+0x9c/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617]  [] 
> ? __delayacct_add_tsk+0x153/0x170
> Jan  2 07:06:00 ip-10-0-2-226 kernel: [49801151

[jira] [Resolved] (CASSANDRA-8403) limit disregarded when paging with IN clause under certain conditions

2015-01-06 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch resolved CASSANDRA-8403.
---
Resolution: Not a Problem

> limit disregarded when paging with IN clause under certain conditions
> -
>
> Key: CASSANDRA-8403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8403
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Benjamin Lerer
>Priority: Minor
>
> This issue was originally reported on the python-driver userlist and 
> confirmed by [~aholmber]
> When:
> page_size < limit < data size,
> the limit value is disregarded and all rows are paged back.
> to repro:
> create a table and populate it with two partitions
> CREATE TABLE paging_test ( id int, value text, PRIMARY KEY (id, value) )
> Add data: in one partition create 10 rows, and in a second partition create 20 
> rows
> perform a query with page_size of 10 and a LIMIT of 20, like so:
> SELECT * FROM paging_test where id in (1,2) LIMIT 20;
> The limit is disregarded and three pages of 10 records each will be returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8329) LeveledCompactionStrategy should split large files across data directories when compacting

2015-01-06 Thread Jeremy Hanna (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-8329:

Fix Version/s: 2.1.3

> LeveledCompactionStrategy should split large files across data directories 
> when compacting
> --
>
> Key: CASSANDRA-8329
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8329
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: J.B. Langston
>Assignee: Marcus Eriksson
> Fix For: 2.0.12, 2.1.3
>
> Attachments: 
> 0001-get-new-sstable-directory-for-every-new-file-during-.patch, 
> test_no_patch_2.0.jpg, test_with_patch_2.0.jpg
>
>
> Because we fall back to STCS for L0 when LCS gets behind, the sstables in L0 
> can get quite large during sustained periods of heavy writes.  This can 
> result in large imbalances between data volumes when using JBOD support.  
> Eventually these large files get broken up as L0 sstables are moved up into 
> higher levels; however, because LCS only chooses a single volume on which to 
> write all of the sstables created during a single compaction, the imbalance 
> is persisted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266624#comment-14266624
 ] 

Jeremy Hanna edited comment on CASSANDRA-8571 at 1/6/15 7:54 PM:
-

See CASSANDRA-8329.  That may be what you're looking for in the upcoming 2.1.3. 
 Also related is CASSANDRA-7386.


was (Author: jeromatron):
See CASSANDRA-8329.  That may be what you're looking for in 2.1.3 I believe.  
Also related is CASSANDRA-7386.

> Free space management does not work very well
> -
>
> Key: CASSANDRA-8571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8571
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bartłomiej Romański
>
> Hi all,
> We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
> (JBODs). We mostly use LCS.
> Recently, our nodes started failing with 'no space left on device'. It all 
> started with our mistake - we let our LCS accumulate too much in L0.
> As a result, STCS woke up and we ended up with some big sstables on each node 
> (let's say 5-10 sstables, 20-50 GB each).
> During normal operation we keep our disks about 50% full. This gives about 
> 200 GB of free space on each of them. This was too little for compacting all 
> accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
> failing...
> Eventually, we managed to stabilize the situation (with some crazy code 
> hacking, manually moving sstables, etc.). However, there are a few things 
> that would be more than helpful in recovering from such situations more 
> automatically... 
> First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
> a (local) variable, writeSize. I believe we should check somewhere here 
> whether we have enough space on the chosen disk. The problem is that 
> writeSize is never read... Am I missing something here?
> By the way, in STCS we first look for the least overloaded disk, and then (if 
> there is more than one such disk) for the one with the most free space 
> (please note the sort order in Directories.getWriteableLocation()). That's 
> often suboptimal (it's usually better to wait for the bigger disk than to 
> compact fewer sstables now), but probably not crucial.
> Second, the strategy (used by LCS) of first choosing a target disk and then 
> using it for the whole compaction is not the best one. For big compactions 
> (e.g. after some massive operation like bootstrap or repair, or after some 
> issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
> will never succeed. A much better strategy would be to choose the target 
> drive for each output sstable separately, or at least to round-robin them.
> Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
> LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
> total space. After the STCS fallback in L0 you end up with very big sstables, 
> and 32 of them is just too much for one compaction on small drives.
> We finally used a hack similar to the last option (as it was the easiest one 
> to implement in a hurry), but any of the improvements described above would 
> save us from all this.
> Thanks,
> BR



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8570) org.apache.cassandra.db.compaction.CompactionsPurgeTest failing

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266651#comment-14266651
 ] 

Philip Thompson commented on CASSANDRA-8570:


It appears to have also broken 
{{org.apache.cassandra.db.compaction.LeveledCompactionStrategyTest.testNewRepairedSSTable}}

> org.apache.cassandra.db.compaction.CompactionsPurgeTest failing
> ---
>
> Key: CASSANDRA-8570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8570
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tests
>Reporter: Philip Thompson
>Assignee: Marcus Eriksson
> Fix For: 2.1.3
>
>
> The patch for CASSANDRA-8429 broke the tests 
> {{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow}}
>  and 
> {{org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging}}
> {code}
> junit.framework.AssertionFailedError: 
>   at 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow(CompactionsPurgeTest.java:308)
> {code}
> {code}expected:<0> but was:<1>
>  Stack Trace
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>   at 
> org.apache.cassandra.db.compaction.CompactionsPurgeTest.testRowTombstoneObservedBeforePurging(CompactionsPurgeTest.java:372)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8571) Free space management does not work very well

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8571:
---
Reproduced In: 2.1.2
Fix Version/s: 2.1.3

> Free space management does not work very well
> -
>
> Key: CASSANDRA-8571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8571
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bartłomiej Romański
> Fix For: 2.1.3
>
>
> Hi all,
> We've got a cluster of 2.1.2 with 18 nodes equipped with 3x 480GB SSD each 
> (JBODs). We mostly use LCS.
> Recently, our nodes started failing with 'no space left on device'. It all 
> started with our mistake - we let our LCS accumulate too much in L0.
> As a result, STCS woke up and we ended up with some big sstables on each node 
> (let's say 5-10 sstables, 20-50 GB each).
> During normal operation we keep our disks about 50% full. This gives about 
> 200 GB of free space on each of them. This was too little for compacting all 
> accumulated L0 sstables at once. Cassandra kept trying to do that and kept 
> failing...
> Eventually, we managed to stabilize the situation (with some crazy code 
> hacking, manually moving sstables, etc.). However, there are a few things 
> that would be more than helpful in recovering from such situations more 
> automatically... 
> First, please look at DiskAwareRunnable.runMayThrow(). This method initializes 
> a (local) variable, writeSize. I believe we should check somewhere here 
> whether we have enough space on the chosen disk. The problem is that 
> writeSize is never read... Am I missing something here?
> By the way, in STCS we first look for the least overloaded disk, and then (if 
> there is more than one such disk) for the one with the most free space 
> (please note the sort order in Directories.getWriteableLocation()). That's 
> often suboptimal (it's usually better to wait for the bigger disk than to 
> compact fewer sstables now), but probably not crucial.
> Second, the strategy (used by LCS) of first choosing a target disk and then 
> using it for the whole compaction is not the best one. For big compactions 
> (e.g. after some massive operation like bootstrap or repair, or after some 
> issue with LCS like in our case) on small drives (e.g. a JBOD of SSDs) these 
> will never succeed. A much better strategy would be to choose the target 
> drive for each output sstable separately, or at least to round-robin them.
> Third, it would be helpful if the default check for MAX_COMPACTING_L0 in 
> LeveledManifest.getCandidatesFor() were expanded to also support a limit on 
> total space. After the STCS fallback in L0 you end up with very big sstables, 
> and 32 of them is just too much for one compaction on small drives.
> We finally used a hack similar to the last option (as it was the easiest one 
> to implement in a hurry), but any of the improvements described above would 
> save us from all this.
> Thanks,
> BR



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure

2015-01-06 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1427#comment-1427
 ] 

Marcus Eriksson commented on CASSANDRA-8499:


+1  (remove the unused "boolean closed" in SequentialWriter on commit)

> Ensure SSTableWriter cleans up properly after failure
> -
>
> Key: CASSANDRA-8499
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8499
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Benedict
>Assignee: Benedict
> Fix For: 2.0.12
>
> Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3
>
>
> In 2.0 we do not free a bloom filter, in 2.1 we do not free a small piece of 
> offheap memory for writing compression metadata. In both we attempt to flush 
> the BF despite having encountered an exception, making the exception slow to 
> propagate.
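
As an aside, a minimal sketch of the clean-up-on-failure pattern the ticket is 
about (illustrative only; SafeWriter and its methods are made up and are not 
the project's SSTableWriter):

{code}
class SafeWriter implements AutoCloseable
{
    private boolean failed;

    void append(byte[] row)
    {
        try
        {
            // ... write the row ...
        }
        catch (RuntimeException e)
        {
            failed = true; // remember the failure so close() skips the flush
            throw e;
        }
    }

    @Override
    public void close()
    {
        try
        {
            if (!failed)
                flushIndexAndBloomFilter(); // only worth doing on the success path
        }
        finally
        {
            releaseOffHeapMemory(); // always free native resources, success or failure
        }
    }

    private void flushIndexAndBloomFilter() {}
    private void releaseOffHeapMemory() {}
}
{code}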



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7159) sstablemetadata command should print some more stuff

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266671#comment-14266671
 ] 

Philip Thompson commented on CASSANDRA-7159:


[~vsinjavin], any update on this? 2.1.3 is rolling out soon, and it would be 
nice to get this patch in.

> sstablemetadata command should print some more stuff
> 
>
> Key: CASSANDRA-7159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7159
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jeremiah Jordan
>Assignee: Vladislav Sinjavin
>Priority: Trivial
>  Labels: lhf
> Fix For: 2.1.3
>
> Attachments: 
> CASSANDRA-7159_-_sstablemetadata_command_should_print_some_more_stuff.patch
>
>
> It would be nice if the sstablemetadata command printed out some more of the 
> stuff we track.  Like the Min/Max column names and the min/max token in the 
> file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7269) Make CqlInputFormat and CqlRecordReader consistent with comments

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-7269:
---
 Reviewer: Brandon Williams
Fix Version/s: 2.1.3

> Make CqlInputFormat and CqlRecordReader consistent with comments
> 
>
> Key: CASSANDRA-7269
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7269
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Jeremy Hanna
>Assignee: Rekha Joshi
>Priority: Minor
>  Labels: hadoop
> Fix For: 2.1.3
>
> Attachments: CASSANDRA-7269.txt
>
>
> Both the CqlInputFormat and CqlPagingInputFormat have the following comment:
> {code}
> /**
> ...
>  *   the number of CQL rows per page
>  *   CQLConfigHelper.setInputCQLPageRowSize. The default page row size is 
> 1000. You 
>  *   should set it to "as big as possible, but no bigger." It set the LIMIT 
> for the CQL 
>  *   query, so you need set it big enough to minimize the network overhead, 
> and also
>  *   not too big to avoid out of memory issue.
> ...
> **/
> {code}
> The property is used in both classes, but the default is only set to 1000 in 
> CqlPagingRecordReader explicitly.
> We should either make the default part of the CqlConfigHelper so it's set in 
> both places or update the comments in the CqlInputFormat to say that if it's 
> not set, it will default to the java driver fetch size which is 5000.
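
A minimal sketch of the first option, centralizing the default (illustrative 
only; the property key, default constant, and helper below are assumptions, not 
the project's actual names - the real ones would live in CqlConfigHelper):

{code}
import org.apache.hadoop.conf.Configuration;

class PageRowSizeDefault
{
    // illustrative property name and default value
    static final String INPUT_CQL_PAGE_ROW_SIZE = "cassandra.input.page.row.size";
    static final int DEFAULT_PAGE_ROW_SIZE = 1000;

    // resolve the page row size in one place so every record reader sees the same default
    static int getInputPageRowSize(Configuration conf)
    {
        return conf.getInt(INPUT_CQL_PAGE_ROW_SIZE, DEFAULT_PAGE_ROW_SIZE);
    }
}
{code}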



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8566) node crash (while auto-compaction?)

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266696#comment-14266696
 ] 

Philip Thompson commented on CASSANDRA-8566:


This looks possibly related to CASSANDRA-8552

> node crash (while auto-compaction?)
> ---
>
> Key: CASSANDRA-8566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8566
> Project: Cassandra
>  Issue Type: Bug
> Environment: Linux CentOS 6.6 64bit, Cassandra 2.1.2 (release)
>Reporter: Dmitri Dmitrienko
> Fix For: 2.1.3
>
> Attachments: 1.log
>
>
> As the data size grew to 20-24 GB/node this issue started happening quite 
> frequently. With 7 GB/node I didn't notice any crashes.
> Heap size was 10 GB; it has now been increased to 16 GB and it didn't help.
> The log is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8563) cqlsh broken for some thrift created tables.

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8563:
---
Assignee: Tyler Hobbs

> cqlsh broken for some thrift created tables.
> 
>
> Key: CASSANDRA-8563
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8563
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Jeremiah Jordan
>Assignee: Tyler Hobbs
>  Labels: cqlsh
> Fix For: 2.1.3
>
>
> The new python driver based cqlsh is broken for some tables.  This was fixed 
> recently in: https://datastax-oss.atlassian.net/browse/PYTHON-192
> So we should pull in a new version of the python driver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8357) ArrayOutOfBounds in cassandra-stress with inverted exponential distribution

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8357:
---
Reproduced In: 2.1.1
Fix Version/s: 2.1.3

> ArrayOutOfBounds in cassandra-stress with inverted exponential distribution
> ---
>
> Key: CASSANDRA-8357
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8357
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: 6-node cassandra cluster (2.1.1) on debian.
>Reporter: Jens Preußner
> Fix For: 2.1.3
>
>
> When using the CQLstress example from GitHub 
> (https://github.com/apache/cassandra/blob/trunk/tools/cqlstress-example.yaml) 
> with an inverted exponential distribution in the insert-partitions field, 
> generated threads fail with
> Exception in thread "Thread-20" java.lang.ArrayIndexOutOfBoundsException: 20 
> at 
> org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:307)
> See the gist https://gist.github.com/jenzopr/9edde53122554729c852 for the 
> typetest.yaml I used.
> The call was:
> cassandra-stress user profile=typetest.yaml ops\(insert=1\) -node $NODES



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8310) Assertion error in 2.1.1: SSTableReader.cloneWithNewSummarySamplingLevel(SSTableReader.java:988)

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8310:
---
Reproduced In: 2.1.1
Fix Version/s: 2.1.3

> Assertion error in 2.1.1: 
> SSTableReader.cloneWithNewSummarySamplingLevel(SSTableReader.java:988)
> 
>
> Key: CASSANDRA-8310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8310
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Donald Smith
> Fix For: 2.1.3
>
>
> Using C* 2.1.1 on Linux CentOS 6.4, we're getting this AssertionError on 5 
> nodes in a 12-node cluster. Also, compactions are lagging on all nodes.
> {noformat}
> ERROR [IndexSummaryManager:1] 2014-11-13 09:15:16,221 CassandraDaemon.java 
> (line 153) Exception in thread Thread[IndexSummaryManager:1,1,main]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.io.sstable.SSTableReader.cloneWithNewSummarySamplingLevel(SSTableReader.java:988)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.io.sstable.IndexSummaryManager.adjustSamplingLevels(IndexSummaryManager.java:420)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:298)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:238)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:139)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:77)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_60]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_60]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_60]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_60]
> {noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8311) C* 2.1.1: AssertionError in AbstractionCompactionTask "not correctly marked compacting"

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8311:
---
Reproduced In: 2.1.1
Fix Version/s: 2.1.3

> C* 2.1.1:  AssertionError in AbstractionCompactionTask "not correctly marked 
> compacting"
> 
>
> Key: CASSANDRA-8311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8311
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Donald Smith
> Fix For: 2.1.3
>
>
> Using 2.1.1 on CentOS6.4, we see this AssertionError on 3 out of 12 nodes in 
> one DC.
> {noformat}
> ERROR [CompactionExecutor:7] 2014-11-12 10:15:13,980 CassandraDaemon.java 
> (line 153) Exception in thread Thread[CompactionExecutor:7,1,RMI Runtime]
> java.lang.AssertionError: 
> /data/data/KEYSPACE_NAME/TABLE_NAME/KEYSPACE_NAME-TABLE_NAME-jb-308572-Data.db
>  is not correctly marked compacting
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.(AbstractCompactionTask.java:49)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.db.compaction.CompactionTask.(CompactionTask.java:62)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionTask.(LeveledCompactionTask.java:33)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getCompactionTask(LeveledCompactionStrategy.java:170)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8314) C* 2.1.1: AssertionError: "stream can only read forward"

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8314:
---
Reproduced In: 2.1.1
Fix Version/s: 2.1.3

> C* 2.1.1: AssertionError:  "stream can only read forward"
> -
>
> Key: CASSANDRA-8314
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8314
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Donald Smith
> Fix For: 2.1.3
>
>
> I see this on multiple nodes of a 2.1.1 cluster running on CentOS 6.4:
> {noformat}
> ERROR [STREAM-IN-/10.6.1.104] 2014-11-13 14:13:16,565 StreamSession.java 
> (line 470) [Stream #45bdfe30-6b81-11e4-a7ca-b150b4554347] Streaming error 
> occurred
> java.io.IOException: Too many retries for Header (cfId: 
> aaefa7d7-9d72-3d18-b5f0-02b30cee5bd7, #29, version: jb, estimated keys: 
> 12672, transfer size: 130005779, compressed?: true, repairedAt: 0)
> at 
> org.apache.cassandra.streaming.StreamSession.doRetry(StreamSession.java:594) 
> [apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:53)
>  [apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:38)
>  [apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
>  [apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
>  [apache-cassandra-2.1.1.jar:2.1.1]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_60]
> Caused by: java.lang.AssertionError: stream can only read forward.
> at 
> org.apache.cassandra.streaming.compress.CompressedInputStream.position(CompressedInputStream.java:107)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:85)
>  ~[apache-cassandra-2.1.1.jar:2.1.1]
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:48)
>  [apache-cassandra-2.1.1.jar:2.1.1]
> ... 4 common frames omitted
> {noformat}
> We couldn't upgrade SStables due to exceptions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8347) 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException after accidental computer crash

2015-01-06 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266717#comment-14266717
 ] 

Philip Thompson commented on CASSANDRA-8347:


If your sstable is corrupt, run nodetool scrub against it, then repair. If that 
doesn't fix the problem, please re-open this.

> 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: 
> java.io.EOFException after accidental computer crash
> 
>
> Key: CASSANDRA-8347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8347
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Evgeny Pasynkov
>
> {code}9:08:56.972 [SSTableBatchOpen:1] ERROR o.a.c.service.CassandraDaemon - 
> Exception in thread Thread[SSTableBatchOpen:1,5,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.(CompressionMetadata.java:129)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_65]
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_65]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
>  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: java.io.EOFException: null
>  at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) 
> ~[na:1.7.0_65]
>  at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_65]
>  at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_65]
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.(CompressionMetadata.java:104)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  ... 13 common frames omitted{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-8347) 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException after accidental computer crash

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson resolved CASSANDRA-8347.

Resolution: Not a Problem

> 2.1.1: org.apache.cassandra.io.sstable.CorruptSSTableException: 
> java.io.EOFException after accidental computer crash
> 
>
> Key: CASSANDRA-8347
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8347
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Evgeny Pasynkov
>
> {code}9:08:56.972 [SSTableBatchOpen:1] ERROR o.a.c.service.CassandraDaemon - 
> Exception in thread Thread[SSTableBatchOpen:1,5,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.(CompressionMetadata.java:129)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at 
> org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) 
> ~[cassandra-all-2.1.1.jar:2.1.1]
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> ~[na:1.7.0_65]
>  at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  ~[na:1.7.0_65]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
>  at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> Caused by: java.io.EOFException: null
>  at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) 
> ~[na:1.7.0_65]
>  at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_65]
>  at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_65]
>  at 
> org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:104)
>  ~[cassandra-all-2.1.1.jar:2.1.1]
>  ... 13 common frames omitted{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8041) Utility sstablesplit should prevent users from running when C* is running

2015-01-06 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8041:
---
Fix Version/s: 2.0.12

> Utility sstablesplit should prevent users from running when C* is running
> -
>
> Key: CASSANDRA-8041
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8041
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation & website, Tools
>Reporter: Erick Ramirez
>Priority: Minor
> Fix For: 2.0.12
>
>
> The sstablesplit utility is designed for use when C* is offline, but there is 
> nothing stopping the user from running it on a live system. There are also no 
> warning messages alerting the user to this effect.
> The help information should also be updated to explicitly state that the 
> utility should only be used when C* is offline.
> Finally, this utility is not included in any of the documentation. Please 
> update accordingly. Thanks.
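
A hedged sketch of the kind of guard being requested, not the actual sstablesplit code: 
before doing any offline work, bail out if anything answers on the node's JMX port (7199 
by default). A real implementation might instead probe the storage port or a pid/lock 
file; the class and method names below are made up for the example.

{code}
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import java.io.IOException;

public final class OfflineToolGuard {
    // 7199 is Cassandra's default JMX port; a real tool would take this from configuration.
    private static final String JMX_URL = "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi";

    /** Returns true if something answers on the local JMX port, i.e. the node is probably up. */
    static boolean nodeAppearsToBeRunning() {
        try {
            JMXConnectorFactory.connect(new JMXServiceURL(JMX_URL)).close();
            return true;
        } catch (IOException e) {
            return false; // nothing listening, assume the node is down
        }
    }

    public static void main(String[] args) {
        if (nodeAppearsToBeRunning()) {
            System.err.println("ERROR: Cassandra appears to be running on this host; "
                    + "this tool must only be run while the node is stopped.");
            System.exit(1);
        }
        // ... proceed with the offline work (splitting sstables) ...
    }
}
{code}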



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2015-01-06 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266809#comment-14266809
 ] 

Robert Stupp commented on CASSANDRA-7438:
-

OHC works in Cassandra:
* unit tests pass ({{ant test}}, no difference against trunk)
* get and put verified in the debugger and against a (simple) table
* row cache saving and loading work, too

> Serializing Row cache alternative (Fully off heap)
> --
>
> Key: CASSANDRA-7438
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux
>Reporter: Vijay
>Assignee: Robert Stupp
>  Labels: performance
> Fix For: 3.0
>
> Attachments: 0001-CASSANDRA-7438.patch, tests.zip
>
>
> Currently SerializingCache is only partially off heap; keys are still stored 
> in the JVM heap as ByteBuffers, so:
> * GC costs are higher for a reasonably big cache.
> * Some users have used the row cache efficiently in production for better 
> results, but this requires careful tuning.
> * Memory overhead for the cache entries is relatively high.
> So the proposal for this ticket is to move the LRU cache logic completely off 
> heap and use JNI to interact with the cache. We might want to ensure that the 
> new implementation matches the existing API (ICache), and the implementation 
> needs to have safe memory access, low memory overhead, and as few memcpys as 
> possible.
> We might also want to make this cache configurable.
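
To make the direction concrete, below is a minimal, hedged sketch (an illustration only; 
it is neither OHC nor Cassandra's real ICache interface) that keeps the serialized key 
and value bytes in directly allocated buffers, so only a small hash-to-buffer index 
remains subject to GC. A fully off-heap implementation like the one proposed here goes 
further and moves the hash table itself off heap.

{code}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy cache: serialized keys and values live in direct buffers outside the GC-managed heap. */
public class OffHeapCacheSketch {
    private final int maxEntries;
    // Only this small hash -> buffer index stays on heap; accessOrder=true gives LRU order.
    private final LinkedHashMap<Integer, ByteBuffer> index;

    public OffHeapCacheSketch(int maxEntries) {
        this.maxEntries = maxEntries;
        this.index = new LinkedHashMap<Integer, ByteBuffer>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<Integer, ByteBuffer> eldest) {
                return size() > OffHeapCacheSketch.this.maxEntries; // evict the LRU entry
            }
        };
    }

    public void put(String key, byte[] value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(4 + k.length + value.length); // off-heap bytes
        buf.putInt(k.length).put(k).put(value);
        buf.flip();
        index.put(key.hashCode(), buf);
    }

    public byte[] get(String key) {
        ByteBuffer buf = index.get(key.hashCode());
        if (buf == null) return null;
        ByteBuffer view = buf.duplicate();            // independent position/limit
        byte[] k = new byte[view.getInt()];
        view.get(k);
        if (!key.equals(new String(k, StandardCharsets.UTF_8))) return null; // hash collision
        byte[] value = new byte[view.remaining()];
        view.get(value);
        return value;
    }
}
{code}

The point the sketch illustrates is why GC pressure drops: the bulk of the cached bytes 
never enters the collected heap, only small index objects do.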



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8566) node crash (while auto-compaction?)

2015-01-06 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266829#comment-14266829
 ] 

Benedict commented on CASSANDRA-8566:
-

Could you provide the information requested of Brent in CASSANDRA-8552 
[here|https://issues.apache.org/jira/browse/CASSANDRA-8552?focusedCommentId=14263495&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14263495]?
 Could you also attach the contents of /var/log/kernel/? Optionally, a thread 
dump and a heap dump would also be helpful (for the latter, please reduce your 
heap size first, so that we can explore it easily). 

Heap size will not affect this (except to make it more likely), as it is native 
memory related.

> node crash (while auto-compaction?)
> ---
>
> Key: CASSANDRA-8566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8566
> Project: Cassandra
>  Issue Type: Bug
> Environment: Linux CentOS 6.6 64bit, Cassandra 2.1.2 (release)
>Reporter: Dmitri Dmitrienko
> Fix For: 2.1.3
>
> Attachments: 1.log
>
>
> As the data size grew to 20-24GB/node, this issue started happening quite 
> frequently; with 7GB/node I didn't notice any crashes.
> The heap size was 10GB and has now been increased to 16GB, which didn't help.
> Log is attached



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

