[jira] [Updated] (CASSANDRA-8550) Internal pagination in CQL3 index queries creating substantial overhead
[ https://issues.apache.org/jira/browse/CASSANDRA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8550: --- Reproduced In: 2.1.2 Fix Version/s: 2.1.3 Internal pagination in CQL3 index queries creating substantial overhead --- Key: CASSANDRA-8550 URL: https://issues.apache.org/jira/browse/CASSANDRA-8550 Project: Cassandra Issue Type: Bug Components: Core Reporter: Samuel Klock Fix For: 2.1.3 While benchmarking CQL3 secondary indexes in 2.1.2, we've noticed substantial performance degradation as the volume of indexed data increases. In trying to figure out what's going on, we found that a major factor contributing to this degradation appears to be logic in {{o.a.c.db.index.composites.CompositesSearcher}} used to paginate scans of index tables. In particular, in the use cases we've explored, this short algorithm used to select a page size appears to be the culprit: {code:java} private int meanColumns = Math.max(index.getIndexCfs().getMeanColumns(), 1); // We shouldn't fetch only 1 row as this provides buggy paging in case the first row doesn't satisfy all clauses private int rowsPerQuery = Math.max(Math.min(filter.maxRows(), filter.maxColumns() / meanColumns), 2); {code} In indexes where the cardinality doesn't scale linearly with the volume of data indexed, it seems likely that the value of {{meanColumns}} will steadily rise in write-heavy workloads. In the cases we've explored, {{filter.maxColumns()}} returns a small enough number (related to the lesser of the native-protocol page size or the user-specified limit for the query) that, after {{meanColumns}} reaches a few thousand, {{rowsPerQuery}} (the page size) is consistently set to 2. The resulting overhead is severe. In our environment, if we fix {{rowsPerQuery}} to some reasonably large constant (e.g., 5,000), queries that with the existing logic would require over two minutes to complete can run in under ten seconds. 
Using a constant clearly seems like the wrong answer, but the overhead the existing algorithm introduces suggests that it isn't the right answer either. An intuitive solution might be to use the minimum of {{filter.maxRows()}} and {{filter.maxColumns()}} (or 2 if both of those are 1), but it's not immediately clear that the existing algorithm isn't accounting for safety considerations that this strategy would not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
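The degradation described above can be reproduced arithmetically. Below is a self-contained sketch (hypothetical class and method names, not the project's code) comparing the current heuristic, which collapses to a page size of 2 once {{meanColumns}} grows large, with the min-based alternative suggested in the report:

```java
public class IndexPageSizeSketch {
    // Current 2.1.2 logic from CompositesSearcher: once meanColumns exceeds
    // maxColumns / 2, the integer division yields 0 or 1 and the page size
    // bottoms out at the floor of 2.
    static int pageSizeCurrent(int maxRows, int maxColumns, int meanColumns) {
        int mean = Math.max(meanColumns, 1);
        return Math.max(Math.min(maxRows, maxColumns / mean), 2);
    }

    // Intuitive alternative from the report: min of maxRows and maxColumns,
    // still floored at 2 to avoid the buggy single-row paging case.
    static int pageSizeProposed(int maxRows, int maxColumns) {
        return Math.max(Math.min(maxRows, maxColumns), 2);
    }

    public static void main(String[] args) {
        // With a 5000-column limit and meanColumns = 10000 (illustrative
        // numbers), the current heuristic scans the index 2 rows at a time.
        System.out.println(pageSizeCurrent(5000, 5000, 10000)); // 2
        System.out.println(pageSizeProposed(5000, 5000));       // 5000
    }
}
```

With a page size of 2, a query matching N index entries issues on the order of N/2 internal range reads, which is consistent with the two-minutes-versus-ten-seconds gap reported above.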
[jira] [Resolved] (CASSANDRA-8536) Wrong cluster information and replication
[ https://issues.apache.org/jira/browse/CASSANDRA-8536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vova resolved CASSANDRA-8536. - Resolution: Invalid There was a problem with my build. Sorry. Wrong cluster information and replication - Key: CASSANDRA-8536 URL: https://issues.apache.org/jira/browse/CASSANDRA-8536 Project: Cassandra Issue Type: Bug Components: Core Environment: CentOS 7 x64 Reporter: Vova Assignee: Brandon Williams Two-machine cluster - Cassandra 2.1.2, GossipingPropertyFileSnitch, one data center with one rack. Seed - 10.0.0.2 Node - 10.0.0.3 Start the seed, start the node, then run nodetool status on either machine:
{noformat}
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns  Host ID                               Rack
UN  10.0.0.3  107.15 KB  256     ?     ad29cd96-d21e-4d02-94e7-0fd68ef5fbad  RAC1
UN  10.0.0.2  87.73 KB   256     ?     c26fdffc-6df5-4d1a-8eda-6d585d2178c1  RAC1
{noformat}
Stop both instances, start the seed, and run nodetool status on the seed:
{noformat}
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address   Load       Tokens  Owns  Host ID                               Rack
UN  10.0.0.2  113.31 KB  256     ?     c26fdffc-6df5-4d1a-8eda-6d585d2178c1  RAC1
{noformat}
So there is no information about node 10.0.0.3 at all. Actually, the main problem is not the wrong info but a replication/synchronization problem: on the seed (after restart, while the 2nd node is down), create a keyspace with replication factor 2 (strategy doesn't matter), create a table, and insert something into the table:
{noformat}
CREATE KEYSPACE Excelsior WITH REPLICATION={'class':'SimpleStrategy','replication_factor':2};
CREATE TABLE Excelsior.users (name text PRIMARY KEY, id int);
INSERT INTO excelsior.users (name, id ) VALUES ( 'node',123);
SELECT * FROM excelsior.users;
 name | id
------+-----
 node | 123
(1 rows)
{noformat}
Start the node; now nodetool status shows both nodes UN on both machines again.
Now the created keyspace and table are seen on the node (the create was propagated from the seed), but the table is empty from the node's point of view:
{noformat}
SELECT * FROM excelsior.users;
 name | id
------+----
(0 rows)
{noformat}
I guess the synchronization problem is probably not a different bug, but stems from the wrong cluster information. Version 2.0.11 works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8553) Add a key-value payload for third party usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Bossa updated CASSANDRA-8553: Description: A useful improvement would be to include a generic key-value payload, so that developers implementing a custom {{QueryHandler}} could leverage that to move custom data back and forth. (was: A useful improvement would be to include a generic key-value payload, so that developers implementing custom {{QueryHandler}}s could leverage that to move custom data back and forth.) Add a key-value payload for third party usage - Key: CASSANDRA-8553 URL: https://issues.apache.org/jira/browse/CASSANDRA-8553 Project: Cassandra Issue Type: Sub-task Reporter: Sergio Bossa Labels: client-impacting, protocolv4 Fix For: 3.0 A useful improvement would be to include a generic key-value payload, so that developers implementing a custom {{QueryHandler}} could leverage that to move custom data back and forth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8553) Add a key-value payload for third party usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Bossa updated CASSANDRA-8553: Labels: client-impacting protocolv4 (was: ) Add a key-value payload for third party usage - Key: CASSANDRA-8553 URL: https://issues.apache.org/jira/browse/CASSANDRA-8553 Project: Cassandra Issue Type: Sub-task Reporter: Sergio Bossa Labels: client-impacting, protocolv4 Fix For: 3.0 A useful improvement would be to include a generic key-value payload, so that developers implementing custom {{QueryHandler}}s could leverage that to move custom data back and forth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8553) Add a key-value payload for third party usage
Sergio Bossa created CASSANDRA-8553: --- Summary: Add a key-value payload for third party usage Key: CASSANDRA-8553 URL: https://issues.apache.org/jira/browse/CASSANDRA-8553 Project: Cassandra Issue Type: Sub-task Reporter: Sergio Bossa A useful improvement would be to include a generic key-value payload, so that developers implementing custom {{QueryHandler}}s could leverage that to move custom data back and forth. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
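To make the idea concrete, the generic key-value payload could take the shape of a string-to-bytes map that a custom {{QueryHandler}} reads from a request and writes back into a response. The following is a self-contained sketch under that assumption; the length-prefixed encoding shown is naive and purely illustrative, not the eventual native-protocol format:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: models the proposed generic key-value payload
// as a Map<String, ByteBuffer> with a naive wire encoding of
// [count][keyLen key bytes valueLen value bytes]...
public class PayloadSketch {
    static ByteBuffer encode(Map<String, ByteBuffer> payload) {
        int size = 4;
        for (Map.Entry<String, ByteBuffer> e : payload.entrySet())
            size += 4 + e.getKey().getBytes(StandardCharsets.UTF_8).length
                  + 4 + e.getValue().remaining();
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(payload.size());
        for (Map.Entry<String, ByteBuffer> e : payload.entrySet()) {
            byte[] k = e.getKey().getBytes(StandardCharsets.UTF_8);
            out.putInt(k.length).put(k);
            ByteBuffer v = e.getValue().duplicate(); // don't disturb caller's position
            out.putInt(v.remaining()).put(v);
        }
        out.flip();
        return out;
    }

    static Map<String, ByteBuffer> decode(ByteBuffer in) {
        Map<String, ByteBuffer> payload = new LinkedHashMap<>();
        int count = in.getInt();
        for (int i = 0; i < count; i++) {
            byte[] k = new byte[in.getInt()];
            in.get(k);
            byte[] v = new byte[in.getInt()];
            in.get(v);
            payload.put(new String(k, StandardCharsets.UTF_8), ByteBuffer.wrap(v));
        }
        return payload;
    }
}
```

A handler implementation could then stash, say, tracing or tenant metadata under keys of its choosing without Cassandra itself interpreting the values.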
[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263426#comment-14263426 ] Brent Haines commented on CASSANDRA-8552: - I am not sure how to compact each table individually or how to do it offline. I don't think this would help because the heavy compactions are still pending and they are currently insurmountable -- I don't have enough memory to finish them without crashing. I looked more carefully at the kernel message in the syslog and upgraded the kernel to 3.13.0-43-generic, which resolved that issue. I was hopeful that this was the problem because you can clearly see munmap in the trace. The fix did seem to make a positive impact on the system and I was excited about that, but it finally did crash. I have attached the interesting part of the syslog to the bug. I am not certain what else I can give you to help diagnose. It doesn't take long to run the available system RAM down to just about 100MB free. Finishing smaller compactions usually frees up a gig or so, and the system does hang out at that memory point for up to an hour before dying. I've attached a screenshot of the opscenter dashboard for the node. I'm happy to run whatever you need to help you dig into this. We're going to try a debian distro when my ops guy gets back on Monday to see if that helps, but somehow, I doubt it. This definitely seems like memory is used to some high-water mark that we aren't hitting with the other nodes because their compactions are smaller. I wish there was a way to limit the size of the compactions.
Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Java build 1.7.0_55-b13 and build 1.8.0_25-b17 Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, system.log We have a large table storing, effectively, event logs, and a pair of denormalized tables for indexing. When updating from 2.0 to 2.1 we saw performance improvements, but some random and silent crashes during nightly repairs. We lost a node (totally corrupted) and replaced it. That node has never stabilized -- it simply can't finish the compactions. Smaller compactions finish. Larger compactions, like these two, never finish:
{code}
pending tasks: 48
   compaction type   keyspace   table             completed     total         unit    progress
        Compaction       data   stories           16532973358   75977993784   bytes     21.76%
        Compaction       data   stories_by_text   10593780658   38555048812   bytes     27.48%
Active compaction remaining time : 0h10m51s
{code}
We are not getting exceptions and are not running out of heap space. The Ubuntu OOM killer is reaping the process after all of the memory is consumed. We watch memory in the opscenter console and it will grow. If we turn off the OOM killer for the process, it will run until everything else is killed instead and then the kernel panics.
We have the following settings configured: 2G Heap, 512M New
{code}
memtable_heap_space_in_mb: 1024
memtable_offheap_space_in_mb: 1024
memtable_allocation_type: heap_buffers
commitlog_total_space_in_mb: 2048
concurrent_compactors: 1
compaction_throughput_mb_per_sec: 128
{code}
The compaction strategy is leveled (these are read-intensive tables that are rarely updated). I have tried every setting, every option, and I have the system where the MTBF is about an hour now, but we never finish compacting because there are some large compactions pending. None of the GC tools or settings help because it is not a GC problem. It is an off-heap memory problem. We are getting these messages in our syslog:
{code}
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in process java pte:0320 pmd:2d6fa5067
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 vm_flags:0870 anon_vma: (null) mapping: (null) index:7fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559] 880028510e40 88020d43da98 81715ac4 7fb820be3000
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565] 88020d43dae0 81174183 0320 0007fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568] 8802d6fa5f18
[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263288#comment-14263288 ] Benedict commented on CASSANDRA-8552: - This is going to be a challenging one to diagnose, since the kernel log suggests this most likely isn't a bug with C*, although there may be some problematic behaviour with so many source files, or large target files, that triggers it. Could you establish for sure this isn't the bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1308796 ? It seems to describe the situation closely enough to be a candidate issue that may be solved by a different Ubuntu image. If not we can try and think of a method of attack for pinning down which C* behaviours are making it a problem. You also might be able to compact each table individually, offline, to get the system started up. Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Java build 1.7.0_55-b13 and build 1.8.0_25-b17 Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 Attachments: system.log
We are getting these messages in our syslog:
{code}
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in process java pte:0320 pmd:2d6fa5067
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 vm_flags:0870 anon_vma: (null) mapping: (null) index:7fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559] 880028510e40 88020d43da98 81715ac4 7fb820be3000
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565] 88020d43dae0 81174183 0320 0007fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568] 8802d6fa5f18 0320 7fb820be3000 7fb820be4000
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584] [81715ac4] dump_stack+0x45/0x56
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591] [81174183] print_bad_pte+0x1a3/0x250
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594] [81175439] vm_normal_page+0x69/0x80
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598] [8117580b] unmap_page_range+0x3bb/0x7f0
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602] [81175cc1] unmap_single_vma+0x81/0xf0
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605] [81176d39] unmap_vmas+0x49/0x90
Jan 2 07:06:00 ip-10-0-2-226 kernel:
[jira] [Created] (CASSANDRA-8556) ColumnMetadata sometimes identifies type as int for float and double in java API
Jason Kania created CASSANDRA-8556: -- Summary: ColumnMetadata sometimes identifies type as int for float and double in java API Key: CASSANDRA-8556 URL: https://issues.apache.org/jira/browse/CASSANDRA-8556 Project: Cassandra Issue Type: Bug Components: API Environment: 2.1.4 API, Windows 8 and Debian latest, patch current Reporter: Jason Kania Sometimes, when querying the metadata for columns through the Java API, the data types for columns that are float or double are incorrectly indicated as int by {{columnMetadata.getType()}}. This prevents the binding of prepared statements, as it complains about needing an integer when it should take a float or double, respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Invalidate prepared stmts when table is altered
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 6124a733c - 9f613ab42 Invalidate prepared stmts when table is altered Patch by Tyler Hobbs; reviewed by Aleksey Yeschenko for CASSANDRA-7910 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f613ab4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f613ab4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f613ab4 Branch: refs/heads/cassandra-2.1 Commit: 9f613ab42c76783191af7d20f50d716309e4aa5c Parents: 6124a73 Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 11:19:57 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 11:19:57 2015 -0600 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/auth/Auth.java| 34 ++-- .../org/apache/cassandra/config/CFMetaData.java | 15 +++-- .../apache/cassandra/cql3/QueryProcessor.java | 22 +++-- .../org/apache/cassandra/db/DefsTables.java | 4 +-- .../cassandra/service/IMigrationListener.java | 33 --- .../cassandra/service/MigrationListener.java| 33 +++ .../cassandra/service/MigrationManager.java | 28 .../org/apache/cassandra/transport/Server.java | 6 ++-- 9 files changed, 80 insertions(+), 97 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ec64aa9..f69a3fc 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,6 @@ 2.1.3 + * Invalidate affected prepared statements when a table's columns + are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) * Fix regression in SSTableRewriter causing some rows to become unreadable during compaction (CASSANDRA-8429) http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/src/java/org/apache/cassandra/auth/Auth.java -- diff --git a/src/java/org/apache/cassandra/auth/Auth.java b/src/java/org/apache/cassandra/auth/Auth.java index ed7aa87..0c3b0fe 100644 --- 
a/src/java/org/apache/cassandra/auth/Auth.java +++ b/src/java/org/apache/cassandra/auth/Auth.java @@ -185,7 +185,7 @@ public class Auth implements AuthMBean DatabaseDescriptor.getAuthorizer().setup(); // register a custom MigrationListener for permissions cleanup after dropped keyspaces/cfs. -MigrationManager.instance.register(new MigrationListener()); +MigrationManager.instance.register(new AuthMigrationListener()); // the delay is here to give the node some time to see its peers - to reduce // Skipped default superuser setup: some nodes were not ready log spam. @@ -318,9 +318,9 @@ public class Auth implements AuthMBean } /** - * IMigrationListener implementation that cleans up permissions on dropped resources. + * MigrationListener implementation that cleans up permissions on dropped resources. */ -public static class MigrationListener implements IMigrationListener +public static class AuthMigrationListener extends MigrationListener { public void onDropKeyspace(String ksName) { @@ -331,33 +331,5 @@ public class Auth implements AuthMBean { DatabaseDescriptor.getAuthorizer().revokeAll(DataResource.columnFamily(ksName, cfName)); } - -public void onDropUserType(String ksName, String userType) -{ -} - -public void onCreateKeyspace(String ksName) -{ -} - -public void onCreateColumnFamily(String ksName, String cfName) -{ -} - -public void onCreateUserType(String ksName, String userType) -{ -} - -public void onUpdateKeyspace(String ksName) -{ -} - -public void onUpdateColumnFamily(String ksName, String cfName) -{ -} - -public void onUpdateUserType(String ksName, String userType) -{ -} } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 74bd5f8..e75abb7 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java 
@@ -,7 +,11 @@ public final class CFMetaData return m; } -public void reload() +/** + * Updates this object in place to match the
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Conflicts: src/java/org/apache/cassandra/db/compaction/CompactionTask.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ba9c477 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ba9c477 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ba9c477 Branch: refs/heads/trunk Commit: 0ba9c4775f2a37327bfd1756012287d8ea1f45ff Parents: c11e1a9 4e1e92b Author: Marcus Eriksson marc...@apache.org Authored: Fri Jan 2 18:41:02 2015 +0100 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jan 2 18:41:02 2015 +0100 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/compaction/CompactionTask.java | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ba9c477/CHANGES.txt -- diff --cc CHANGES.txt index 82f1d20,c24..dbcdcdf --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,48 -1,5 +1,49 @@@ +3.0 + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419) + * Refactor SelectStatement, return IN results in natural order instead + of IN value list order (CASSANDRA-7981) + * Support UDTs, tuples, and collections in user-defined + functions (CASSANDRA-7563) + * Fix aggregate fn results on empty selection, result column name, + and cqlsh parsing (CASSANDRA-8229) + * Mark sstables as repaired after full repair (CASSANDRA-7586) + * Extend Descriptor to include a format value and refactor reader/writer + APIs (CASSANDRA-7443) + * Integrate JMH for microbenchmarks (CASSANDRA-8151) + * Keep sstable levels when bootstrapping (CASSANDRA-7460) + * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838) + * Support for aggregation functions (CASSANDRA-4914) + * Remove 
cassandra-cli (CASSANDRA-7920) + * Accept dollar quoted strings in CQL (CASSANDRA-7769) + * Make assassinate a first class command (CASSANDRA-7935) + * Support IN clause on any clustering column (CASSANDRA-4762) + * Improve compaction logging (CASSANDRA-7818) + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917) + * Do anticompaction in groups (CASSANDRA-6851) + * Support user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 7929, + 7924, 7812, 8063, 7813, 7708) + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416) + * Move sstable RandomAccessReader to nio2, which allows using the + FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050) + * Remove CQL2 (CASSANDRA-5918) + * Add Thrift get_multi_slice call (CASSANDRA-6757) + * Optimize fetching multiple cells by name (CASSANDRA-6933) + * Allow compilation in java 8 (CASSANDRA-7028) + * Make incremental repair default (CASSANDRA-7250) + * Enable code coverage thru JaCoCo (CASSANDRA-7226) + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369) + * Shorten SSTable path (CASSANDRA-6962) + * Use unsafe mutations for most unit tests (CASSANDRA-6969) + * Fix race condition during calculation of pending ranges (CASSANDRA-7390) + * Fail on very large batch sizes (CASSANDRA-8011) + * Improve concurrency of repair (CASSANDRA-6455, 8208) + + 2.1.3 + * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ba9c477/src/java/org/apache/cassandra/db/compaction/CompactionTask.java -- diff --cc src/java/org/apache/cassandra/db/compaction/CompactionTask.java index 1abb4ee,4885bc8..2543f47 --- a/src/java/org/apache/cassandra/db/compaction/CompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionTask.java @@@ -152,8 -146,7 
+152,9 @@@ public class CompactionTask extends Abs long estimatedTotalKeys = Math.max(cfs.metadata.getMinIndexInterval(), SSTableReader.getApproximateKeyCount(actuallyCompact)); long estimatedSSTables = Math.max(1, SSTableReader.getTotalBytes(actuallyCompact) / strategy.getMaxSSTableBytes()); long keysPerSSTable = (long) Math.ceil((double) estimatedTotalKeys / estimatedSSTables); +SSTableFormat.Type sstableFormat = getFormatType(sstables); + + long expectedSSTableSize = Math.min(getExpectedWriteSize(),
[1/2] cassandra git commit: Fix calculation of expected sstable size during compaction
Repository: cassandra Updated Branches: refs/heads/trunk c11e1a9d8 - 0ba9c4775 Fix calculation of expected sstable size during compaction Patch by marcuse; reviewed by rstupp for CASSANDRA-8532 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e1e92b3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e1e92b3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e1e92b3 Branch: refs/heads/trunk Commit: 4e1e92b3104a17917328f5352a59bec8287fe82d Parents: 9f613ab Author: Marcus Eriksson marc...@apache.org Authored: Tue Dec 23 09:57:05 2014 +0100 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jan 2 18:35:22 2015 +0100 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/compaction/CompactionTask.java | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e1e92b3/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index f69a3fc..c24 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.3 + * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e1e92b3/src/java/org/apache/cassandra/db/compaction/CompactionTask.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionTask.java b/src/java/org/apache/cassandra/db/compaction/CompactionTask.java index 0e8900d..4885bc8 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionTask.java @@ -146,6 +146,7 @@ public class CompactionTask extends AbstractCompactionTask long estimatedTotalKeys = Math.max(cfs.metadata.getMinIndexInterval(), SSTableReader.getApproximateKeyCount(actuallyCompact)); long estimatedSSTables = 
Math.max(1, SSTableReader.getTotalBytes(actuallyCompact) / strategy.getMaxSSTableBytes()); long keysPerSSTable = (long) Math.ceil((double) estimatedTotalKeys / estimatedSSTables); +long expectedSSTableSize = Math.min(getExpectedWriteSize(), strategy.getMaxSSTableBytes()); logger.debug(Expected bloom filter size : {}, keysPerSSTable); try (AbstractCompactionStrategy.ScannerList scanners = strategy.getScanners(actuallyCompact)) @@ -173,7 +174,7 @@ public class CompactionTask extends AbstractCompactionTask return; } - writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(estimatedTotalKeys/estimatedSSTables)), keysPerSSTable, minRepairedAt)); + writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(expectedSSTableSize)), keysPerSSTable, minRepairedAt)); while (iter.hasNext()) { if (ci.isStopRequested()) @@ -185,7 +186,7 @@ public class CompactionTask extends AbstractCompactionTask totalKeysWritten++; if (newSSTableSegmentThresholdReached(writer.currentWriter())) { - writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(estimatedTotalKeys/estimatedSSTables)), keysPerSSTable, minRepairedAt)); + writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(expectedSSTableSize)), keysPerSSTable, minRepairedAt)); } }
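To make the diff above concrete: before the patch, the value passed to getWriteDirectory() was estimatedTotalKeys/estimatedSSTables, a key count rather than a byte size; after it, expectedSSTableSize is a byte estimate capped at the strategy's maximum sstable size. A small self-contained sketch of that arithmetic, with illustrative numbers (the 160 MB cap is LCS's default sstable_size_in_mb, assumed here; names are not the project's code):

```java
public class ExpectedWriteSizeSketch {
    // Mirrors the patched expression: cap the expected write size at the
    // compaction strategy's maximum sstable size, so the disk-space check
    // reasons about bytes, not key counts.
    static long expectedSSTableSize(long expectedWriteBytes, long maxSSTableBytes) {
        return Math.min(expectedWriteBytes, maxSSTableBytes);
    }

    public static void main(String[] args) {
        long expectedWriteBytes = 75977993784L;    // total bytes, from the ticket's compaction stats
        long maxSSTableBytes = 160L * 1024 * 1024; // assumed LCS default of 160 MB
        // For a large leveled compaction, each new sstable is expected to be
        // at most the strategy cap, not the whole compaction's size.
        System.out.println(expectedSSTableSize(expectedWriteBytes, maxSSTableBytes));
    }
}
```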
[jira] [Commented] (CASSANDRA-7408) System hints corruption - dataSize ... would be larger than file
[ https://issues.apache.org/jira/browse/CASSANDRA-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262824#comment-14262824 ] Maxim Ivanov commented on CASSANDRA-7408: - I see the same problem on 1.2.19; it seems to be happening for system hints only. While scrubbing, it reports the following errors: {code} WARN [CompactionExecutor:17] 2015-01-02 10:09:41,454 OutputHandler.java (line 57) Non-fatal error reading row (stacktrace follows) java.io.IOError: java.io.IOException: Impossible row size 92971968326729728 at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171) at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:526) at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:515) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:70) at org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:280) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Impossible row size 92971968326729728 ... 
10 more WARN [CompactionExecutor:17] 2015-01-02 10:09:41,456 OutputHandler.java (line 52) Row at 299608 is unreadable; skipping to next WARN [CompactionExecutor:17] 2015-01-02 10:09:41,457 OutputHandler.java (line 57) Non-fatal error reading row (stacktrace follows) java.io.IOError: java.io.IOException: Impossible row size 7881707469400978738 at org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:171) at org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:526) at org.apache.cassandra.db.compaction.CompactionManager.doScrub(CompactionManager.java:515) at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:70) at org.apache.cassandra.db.compaction.CompactionManager$3.perform(CompactionManager.java:280) at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:250) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Impossible row size 7881707469400978738 ... 10 more WARN [CompactionExecutor:17] 2015-01-02 10:09:41,459 OutputHandler.java (line 52) Row at 304082 is unreadable; skipping to next {code} System hints corruption - dataSize ... would be larger than file Key: CASSANDRA-7408 URL: https://issues.apache.org/jira/browse/CASSANDRA-7408 Project: Cassandra Issue Type: Bug Components: Core Environment: RHEL 6.5 Cassandra 1.2.16 RF=3 Thrift Reporter: Jeff Griffith I've found several unresolved JIRA tickets related to SSTable corruption but not sure if they apply to the case we are seeing in system/hints. 
We see periodic exceptions such as: {noformat} dataSize of 144115248479299639 starting at 17209 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-219-Data.db length 35542 {noformat} Is there something we could possibly be doing from the application to cause this sort of corruption? We also see it on some of our own column families, as well as some *negative* lengths, which are presumably a similar corruption. {noformat} ERROR [HintedHandoff:57] 2014-06-17 17:08:04,690 CassandraDaemon.java (line 191) Exception in thread Thread[HintedHandoff:57,1,main] java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 144115248479299639 starting at 17209 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-219-Data.db length 35542 at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:441) at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282) at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90) at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:508) at
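The "dataSize ... would be larger than file" and "Impossible row size" messages above come from a bounds check on the row length decoded from disk: when the length field itself is corrupt, it decodes to a value that cannot possibly fit in the file. A minimal Python sketch of that kind of sanity check (hypothetical names; not Cassandra's actual code):

```python
# Hypothetical sketch of the sanity check behind the "would be larger than
# file" / "Impossible row size" errors: a row length decoded from a corrupt
# file is rejected if it is negative or runs past end-of-file.
def check_row_size(data_size: int, row_start: int, file_length: int) -> None:
    if data_size < 0 or row_start + data_size > file_length:
        raise IOError(
            f"dataSize of {data_size} starting at {row_start} "
            f"would be larger than file length {file_length}"
        )

# A sane row passes silently; the values from the report above fail
# immediately, since a ~144-petabyte "row" cannot fit in a 35 KB hints file.
check_row_size(100, 0, 35542)
```

Scrub's "skipping to next" behavior then corresponds to catching this error per row and continuing, rather than aborting the whole table.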
[jira] [Commented] (CASSANDRA-8399) Reference Counter exception when dropping user type
[ https://issues.apache.org/jira/browse/CASSANDRA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262940#comment-14262940 ] Joshua McKenzie commented on CASSANDRA-8399: [~benedict]: If we're going to go the route of surgical tinkering inside DataTracker, would you be willing to take over this ticket? Given the complexity and lack of formality surrounding these relationships, I'd prefer not to make these changes personally, as I don't currently have full context / expertise in this area of the code base. If these are indeed indicative of bugs in our current implementation and we can get a fix in for them quickly (i.e. pre 2.1.5+), that would probably be best. Reference Counter exception when dropping user type --- Key: CASSANDRA-8399 URL: https://issues.apache.org/jira/browse/CASSANDRA-8399 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: 8399_fix_empty_results.txt, 8399_v2.txt, node2.log, ubuntu-8399.log When running the dtest {{user_types_test.py:TestUserTypes.test_type_keyspace_permission_isolation}} with the current 2.1-HEAD code, very frequently, but not always, when dropping a type, the following exception is seen: {code} ERROR [MigrationStage:1] 2014-12-01 13:54:54,824 CassandraDaemon.java:170 - Exception in thread Thread[MigrationStage:1,5,main] java.lang.AssertionError: Reference counter -1 for /var/folders/v3/z4wf_34n1q506_xjdy49gb78gn/T/dtest-eW2RXj/test/node2/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-14-Data.db at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1662) ~[main/:na] at org.apache.cassandra.io.sstable.SSTableScanner.close(SSTableScanner.java:164) ~[main/:na] at org.apache.cassandra.utils.MergeIterator.close(MergeIterator.java:62) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore$8.close(ColumnFamilyStore.java:1943) ~[main/:na] at 
org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:2116) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2029) ~[main/:na] at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1963) ~[main/:na] at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:744) ~[main/:na] at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:731) ~[main/:na] at org.apache.cassandra.config.Schema.updateVersion(Schema.java:374) ~[main/:na] at org.apache.cassandra.config.Schema.updateVersionAndAnnounce(Schema.java:399) ~[main/:na] at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:167) ~[main/:na] at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:49) ~[main/:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_67] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_67] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_67] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_67] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]{code} Log of the node with the error is attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262943#comment-14262943 ] Christian Spriegel edited comment on CASSANDRA-7886 at 1/2/15 3:22 PM: --- Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. This is what TOEs look like in system.log: {code} ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-] {code} kind regards, Christian was (Author: christianmovi): Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. 
This is what TOEs look like in system.log: {quote} ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-] {quote} kind regards, Christian Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When TombstoneOverwhelmingExceptions occur in queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262943#comment-14262943 ] Christian Spriegel commented on CASSANDRA-7886: --- Hi [~thobbs], uploaded new patch: V6 Here is what I did: - Fixed logging of TOEs... -- ... in StorageProxy for local reads -- ... in MessageDeliveryTask for remote reads - Added partitionKey (as DecoratedKey) and lastCellName logging to TOE. - Changed SliceQueryFilter not to throw TOEs for the system keyspace. Cassandra does not seem to like TOEs in system queries. These TOEs will always be logged as warnings instead. This is what TOEs look like in system.log: {quote} ERROR [SharedPool-Worker-1] 2015-01-02 15:07:24,878 MessageDeliveryTask.java:81 - Scanned over 201 tombstones in test.test; 100 columns were requested; query aborted (see tombstone_failure_threshold); partitionKey=DecoratedKey(78703492656118554854272571946195123045, 31); lastCell=188; delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}; slices=[-] {quote} kind regards, Christian Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When TombstoneOverwhelmingExceptions occur in queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. 
On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
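The fail-fast protocol proposed for CASSANDRA-7886 can be sketched in miniature (a hypothetical Python model with invented names, not Cassandra's implementation): the replica returns an explicit error response instead of silently dropping the read, so the coordinator surfaces the failure immediately rather than blocking for read_request_timeout_in_ms.

```python
# Hypothetical model of the proposed behavior: a replica that trips the
# tombstone failure threshold reports an error instead of dropping the read.
class TombstoneOverwhelmingError(Exception):
    pass

def replica_read(partition, failure_threshold=100):
    # Model tombstones as None cells; live cells are any other value.
    tombstones = sum(1 for cell in partition if cell is None)
    if tombstones > failure_threshold:
        # Old behavior: drop the request, forcing a coordinator timeout.
        # Proposed behavior: return an explicit error message.
        return ("ERROR", "tombstone_failure_threshold exceeded")
    return ("ROWS", [cell for cell in partition if cell is not None])

def coordinator_read(partition):
    kind, payload = replica_read(partition)
    if kind == "ERROR":
        # Propagate immediately -- no read_request_timeout_in_ms wait.
        raise TombstoneOverwhelmingError(payload)
    return payload
```

In the real patch series this corresponds to a new error response in the native protocol (hence the protocolv4 label), not an in-process exception.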
cassandra git commit: Fix calculation of expected sstable size during compaction
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 9f613ab42 - 4e1e92b31 Fix calculation of expected sstable size during compaction Patch by marcuse; reviewed by rstupp for CASSANDRA-8532 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4e1e92b3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4e1e92b3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4e1e92b3 Branch: refs/heads/cassandra-2.1 Commit: 4e1e92b3104a17917328f5352a59bec8287fe82d Parents: 9f613ab Author: Marcus Eriksson marc...@apache.org Authored: Tue Dec 23 09:57:05 2014 +0100 Committer: Marcus Eriksson marc...@apache.org Committed: Fri Jan 2 18:35:22 2015 +0100 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/compaction/CompactionTask.java | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e1e92b3/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index f69a3fc..c24 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.3 + * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) http://git-wip-us.apache.org/repos/asf/cassandra/blob/4e1e92b3/src/java/org/apache/cassandra/db/compaction/CompactionTask.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionTask.java b/src/java/org/apache/cassandra/db/compaction/CompactionTask.java index 0e8900d..4885bc8 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionTask.java @@ -146,6 +146,7 @@ public class CompactionTask extends AbstractCompactionTask long estimatedTotalKeys = Math.max(cfs.metadata.getMinIndexInterval(), SSTableReader.getApproximateKeyCount(actuallyCompact)); long 
estimatedSSTables = Math.max(1, SSTableReader.getTotalBytes(actuallyCompact) / strategy.getMaxSSTableBytes()); long keysPerSSTable = (long) Math.ceil((double) estimatedTotalKeys / estimatedSSTables); +long expectedSSTableSize = Math.min(getExpectedWriteSize(), strategy.getMaxSSTableBytes()); logger.debug("Expected bloom filter size : {}", keysPerSSTable); try (AbstractCompactionStrategy.ScannerList scanners = strategy.getScanners(actuallyCompact)) @@ -173,7 +174,7 @@ public class CompactionTask extends AbstractCompactionTask return; } - writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(estimatedTotalKeys/estimatedSSTables)), keysPerSSTable, minRepairedAt)); + writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(expectedSSTableSize)), keysPerSSTable, minRepairedAt)); while (iter.hasNext()) { if (ci.isStopRequested()) @@ -185,7 +186,7 @@ public class CompactionTask extends AbstractCompactionTask totalKeysWritten++; if (newSSTableSegmentThresholdReached(writer.currentWriter())) { - writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(estimatedTotalKeys/estimatedSSTables)), keysPerSSTable, minRepairedAt)); + writer.switchWriter(createCompactionWriter(cfs.directories.getLocationForDisk(getWriteDirectory(expectedSSTableSize)), keysPerSSTable, minRepairedAt)); } }
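For context on this patch: the pre-patch code passed {{estimatedTotalKeys/estimatedSSTables}} (a key count) to {{getWriteDirectory(...)}}, which expects an expected write size in bytes, so the disk-space estimate was wildly off. A Python sketch with invented numbers illustrating how far apart the two values can be:

```python
# Python sketch of the values involved in the CASSANDRA-8532 fix.
# All numbers are invented for illustration, not taken from the ticket.
import math

total_bytes = 100 * 1024**3          # 100 GiB of input to compact
estimated_total_keys = 50_000_000    # approximate key count of the inputs
max_sstable_bytes = 160 * 1024**2    # per-sstable cap (e.g. LCS sstable size)

estimated_sstables = max(1, total_bytes // max_sstable_bytes)
keys_per_sstable = math.ceil(estimated_total_keys / estimated_sstables)

# Before the patch: a *key count* was passed where bytes were expected,
# drastically underestimating the space needed for the next output sstable.
buggy_expected_size = estimated_total_keys // estimated_sstables

# After the patch: the expected on-disk size of one output sstable, capped
# by the strategy's maximum sstable size.
fixed_expected_size = min(total_bytes, max_sstable_bytes)
```

With these numbers the buggy value is ~78 KB worth of "size" versus the real ~160 MiB per output sstable, which is why disk selection could pick a disk without enough room.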
[jira] [Created] (CASSANDRA-8554) Nodetool drain shows all nodes as UP on the drained node
Mark Curtis created CASSANDRA-8554: -- Summary: Nodetool drain shows all nodes as UP on the drained node Key: CASSANDRA-8554 URL: https://issues.apache.org/jira/browse/CASSANDRA-8554 Project: Cassandra Issue Type: Bug Environment: Centos 6.5, DSE 4.5.1 tarball install Reporter: Mark Curtis Priority: Minor When running nodetool drain, the drained node will still show its own status as UP in nodetool status, even after the drain has finished. For example, using a 3-node cluster, on one of the nodes that is still operating and not drained we see this: {code} $ ./dse-4.5.1/bin/nodetool status Note: Ownership information does not include topology; for complete information, specify a keyspace Datacenter: Central === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 192.168.56.21 210.78 KB 256 32.1% 82eb2fca-4f57-467b-a972-93096ec5d69f RAC1 DN 192.168.56.23 2.22 GB 256 33.5% a11bfac1-fad0-440b-bd68-7562a89ce3c7 RAC1 UN 192.168.56.22 2.22 GB 256 34.4% 4250cb05-97be-4bac-887a-acc307d1bc0c RAC1 {code} While on the drained node we see this: {code} [datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool drain [datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool status Note: Ownership information does not include topology; for complete information, specify a keyspace Datacenter: Central === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns Host ID Rack UN 192.168.56.21 210.78 KB 256 32.1% 82eb2fca-4f57-467b-a972-93096ec5d69f RAC1 UN 192.168.56.23 2.22 GB 256 33.5% a11bfac1-fad0-440b-bd68-7562a89ce3c7 RAC1 UN 192.168.56.22 2.22 GB 256 34.4% 4250cb05-97be-4bac-887a-acc307d1bc0c RAC1 {code} Netstat shows outgoing connections from the drained node to other nodes as still established on port 7000, but the node is no longer listening on port 7000, which I believe is expected. However, the output of nodetool status on the drained node could be interpreted as misleading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Spriegel updated CASSANDRA-7886: -- Attachment: 7886_v6_trunk.txt Coordinator should not wait for read timeouts when replicas hit Exceptions -- Key: CASSANDRA-7886 URL: https://issues.apache.org/jira/browse/CASSANDRA-7886 Project: Cassandra Issue Type: Improvement Components: Core Environment: Tested with Cassandra 2.0.8 Reporter: Christian Spriegel Assignee: Christian Spriegel Priority: Minor Labels: protocolv4 Fix For: 3.0 Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt, 7886_v4_trunk.txt, 7886_v5_trunk.txt, 7886_v6_trunk.txt *Issue* When TombstoneOverwhelmingExceptions occur in queries, the query is simply dropped on every data node, but no response is sent back to the coordinator. Instead the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-( *Proposed solution* I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-8552: --- Reproduced In: 2.1.2, 2.1.1 (was: 2.1.1, 2.1.2) Fix Version/s: 2.1.3 Assignee: Marcus Eriksson Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.04 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 We have a large table storing, effectively, event logs, and a pair of denormalized tables for indexing. When updating from 2.0 to 2.1 we saw performance improvements, but also some random and silent crashes during nightly repairs. We lost a node (totally corrupted) and replaced it. That node has never stabilized -- it simply can't finish the compactions. Smaller compactions finish. Larger compactions, like these two, never finish: {code} pending tasks: 48 compaction type keyspace table completed total unit progress Compaction data stories 16532973358 75977993784 bytes 21.76% Compaction data stories_by_text 10593780658 38555048812 bytes 27.48% Active compaction remaining time : 0h10m51s {code} We are not getting exceptions and are not running out of heap space. The Ubuntu OOM killer is reaping the process after all of the memory is consumed. We watch memory in the opscenter console and it will grow. If we turn off the OOM killer for the process, it will run until everything else is killed instead and then the kernel panics. 
We have the following settings configured: 2G Heap 512M New {code} memtable_heap_space_in_mb: 1024 memtable_offheap_space_in_mb: 1024 memtable_allocation_type: heap_buffers commitlog_total_space_in_mb: 2048 concurrent_compactors: 1 compaction_throughput_mb_per_sec: 128 {code} The compaction strategy is leveled (these are read-intensive tables that are rarely updated) I have tried every setting, every option and I have the system where the MTBF is about an hour now, but we never finish compacting because there are some large compactions pending. None of the GC tools or settings help because it is not a GC problem. It is an off-heap memory problem. We are getting these messages in our syslog {code} Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in process java pte:0320 pmd:2d6fa5067 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 vm_flags:0870 anon_vma: (null) mapping: (null) index:7fb820be3 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559] 880028510e40 88020d43da98 81715ac4 7fb820be3000 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565] 88020d43dae0 81174183 0320 0007fb820be3 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568] 8802d6fa5f18 0320 7fb820be3000 7fb820be4000 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace: Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584] [81715ac4] dump_stack+0x45/0x56 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591] [81174183] print_bad_pte+0x1a3/0x250 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594] [81175439] vm_normal_page+0x69/0x80 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598] [8117580b] unmap_page_range+0x3bb/0x7f0 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602] [81175cc1] unmap_single_vma+0x81/0xf0 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605] [81176d39] unmap_vmas+0x49/0x90 Jan 2 
07:06:00 ip-10-0-2-226 kernel: [49801151.219610] [8117feec] exit_mmap+0x9c/0x170 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617] [8110fcf3] ? __delayacct_add_tsk+0x153/0x170 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621] [8106482c] mmput+0x5c/0x120 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625] [81069bbc] do_exit+0x26c/0xa50 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219631] [810d7591] ? __unqueue_futex+0x31/0x60 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219634] [810d83b6] ? futex_wait+0x126/0x290 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219640] [8171d8e0] ? _raw_spin_unlock_irqrestore+0x20/0x40 Jan
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1dcf66b0 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1dcf66b0 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1dcf66b0 Branch: refs/heads/trunk Commit: 1dcf66b0938a9a73dcb475ea4539ce5f427a95a5 Parents: 0ba9c47 aeb7d3f Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 12:15:11 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 12:15:11 2015 -0600 -- CHANGES.txt | 1 + bin/cqlsh | 25 + 2 files changed, 22 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1dcf66b0/CHANGES.txt -- diff --cc CHANGES.txt index dbcdcdf,1c1bfe2..6fd55e6 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,48 -1,5 +1,49 @@@ +3.0 + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419) + * Refactor SelectStatement, return IN results in natural order instead + of IN value list order (CASSANDRA-7981) + * Support UDTs, tuples, and collections in user-defined + functions (CASSANDRA-7563) + * Fix aggregate fn results on empty selection, result column name, + and cqlsh parsing (CASSANDRA-8229) + * Mark sstables as repaired after full repair (CASSANDRA-7586) + * Extend Descriptor to include a format value and refactor reader/writer + APIs (CASSANDRA-7443) + * Integrate JMH for microbenchmarks (CASSANDRA-8151) + * Keep sstable levels when bootstrapping (CASSANDRA-7460) + * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838) + * Support for aggregation functions (CASSANDRA-4914) + * Remove cassandra-cli (CASSANDRA-7920) + * Accept dollar quoted strings in CQL (CASSANDRA-7769) + * Make assassinate a first class command 
(CASSANDRA-7935) + * Support IN clause on any clustering column (CASSANDRA-4762) + * Improve compaction logging (CASSANDRA-7818) + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917) + * Do anticompaction in groups (CASSANDRA-6851) + * Support user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 7929, + 7924, 7812, 8063, 7813, 7708) + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416) + * Move sstable RandomAccessReader to nio2, which allows using the + FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050) + * Remove CQL2 (CASSANDRA-5918) + * Add Thrift get_multi_slice call (CASSANDRA-6757) + * Optimize fetching multiple cells by name (CASSANDRA-6933) + * Allow compilation in java 8 (CASSANDRA-7028) + * Make incremental repair default (CASSANDRA-7250) + * Enable code coverage thru JaCoCo (CASSANDRA-7226) + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369) + * Shorten SSTable path (CASSANDRA-6962) + * Use unsafe mutations for most unit tests (CASSANDRA-6969) + * Fix race condition during calculation of pending ranges (CASSANDRA-7390) + * Fail on very large batch sizes (CASSANDRA-8011) + * Improve concurrency of repair (CASSANDRA-6455, 8208) + + 2.1.3 + * (cqlsh) Handle a schema mismatch being detected on startup (CASSANDRA-8512) * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910)
[1/2] cassandra git commit: cqlsh: handle schema mismatch on startup
Repository: cassandra Updated Branches: refs/heads/trunk 0ba9c4775 - 1dcf66b09 cqlsh: handle schema mismatch on startup Patch by Tyler Hobbs; reviewed by Aleksey Yeschenko for CASSANDRA-8512 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/aeb7d3f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/aeb7d3f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/aeb7d3f2 Branch: refs/heads/trunk Commit: aeb7d3f2eefcd5aa452012c048341deb814cf0b0 Parents: 4e1e92b Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 12:14:24 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 12:14:24 2015 -0600 -- CHANGES.txt | 1 + bin/cqlsh | 25 + 2 files changed, 22 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/aeb7d3f2/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index c24..1c1bfe2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.3 + * (cqlsh) Handle a schema mismatch being detected on startup (CASSANDRA-8512) * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910) http://git-wip-us.apache.org/repos/asf/cassandra/blob/aeb7d3f2/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index a714f48..363a4f6 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -562,15 +562,32 @@ class Shell(cmd.Cmd): self.session = self.conn.connect(keyspace) else: self.session = self.conn.connect() + +self.color = color +self.display_time_format = display_time_format +self.display_float_precision = display_float_precision + +# Workaround for CASSANDRA-8521 until PYTHON-205 is resolved. +# If there is no schema metadata present (due to a schema mismatch), +# get rid of the code that checks for a schema mismatch and force +# the schema metadata to be built. 
+if not self.conn.metadata.keyspaces: +self.printerr("Warning: schema version mismatch detected; check the schema versions of your " + "nodes in system.local and system.peers.") +original_method = self.conn.control_connection._get_schema_mismatches +try: +self.conn.control_connection._get_schema_mismatches = lambda *args, **kwargs: None +future = self.conn.submit_schema_refresh() +future.result(timeout=10) +finally: +self.conn.control_connection._get_schema_mismatches = original_method + self.session.default_timeout = client_timeout self.session.row_factory = ordered_dict_factory self.get_connection_versions() self.current_keyspace = keyspace -self.color = color -self.display_time_format = display_time_format -self.display_float_precision = display_float_precision self.max_trace_wait = max_trace_wait self.session.max_trace_wait = max_trace_wait if encoding is None: @@ -980,7 +997,7 @@ class Shell(cmd.Cmd): rows = self.session.execute(statement, trace=self.tracing_enabled) break except CQL_ERRORS, err: -self.printerr(str(err)) +self.printerr(str(err.__class__.__name__) + ": " + str(err)) return False except Exception, err: import traceback
[jira] [Commented] (CASSANDRA-8192) Better error logging on corrupt compressed SSTables: currently AssertionError in Memory.java
[ https://issues.apache.org/jira/browse/CASSANDRA-8192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263084#comment-14263084 ] Marcus Eriksson commented on CASSANDRA-8192: +1 Better error logging on corrupt compressed SSTables: currently AssertionError in Memory.java Key: CASSANDRA-8192 URL: https://issues.apache.org/jira/browse/CASSANDRA-8192 Project: Cassandra Issue Type: Improvement Components: Core Environment: Windows-7-32 bit, 3GB RAM, Java 1.7.0_67 Reporter: Andreas Schnitzerling Assignee: Joshua McKenzie Priority: Minor Fix For: 2.1.3 Attachments: 8192_v1.txt, 8192_v2.txt, cassandra.bat, cassandra.yaml, logdata-onlinedata-ka-196504-CompressionInfo.zip, printChunkOffsetErrors.txt, system-compactions_in_progress-ka-47594-CompressionInfo.zip, system-sstable_activity-jb-25-Filter.zip, system.log, system_AssertionTest.log Since updating 1 of 12 nodes from 2.1.0-rel to 2.1.1-rel, an exception occurs during startup. {panel:title=system.log} ERROR [SSTableBatchOpen:1] 2014-10-27 09:44:00,079 CassandraDaemon.java:153 - Exception in thread Thread[SSTableBatchOpen:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.io.util.Memory.size(Memory.java:307) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.compress.CompressionMetadata.init(CompressionMetadata.java:135) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:83) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:50) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:48) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:766) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:725) 
~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:402) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:302) ~[apache-cassandra-2.1.1.jar:2.1.1] at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:438) ~[apache-cassandra-2.1.1.jar:2.1.1] at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[na:1.7.0_55] at java.util.concurrent.FutureTask.run(Unknown Source) ~[na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.7.0_55] at java.lang.Thread.run(Unknown Source) [na:1.7.0_55] {panel} In the attached log you can still see as well CASSANDRA-8069 and CASSANDRA-6283. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
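The ticket asks for a descriptive error instead of a bare {{AssertionError}} when a CompressionInfo component is corrupt. A hypothetical Python sketch of that kind of up-front validation (invented names; Cassandra's real check lives in CompressionMetadata and its Memory allocation):

```python
# Hypothetical sketch: validate the chunk-offset count read from a
# CompressionInfo component before using it, and raise a descriptive
# error instead of letting an assertion fire deep in memory allocation.
class CorruptSSTableError(Exception):
    """Descriptive replacement for a bare AssertionError on corrupt metadata."""

def load_chunk_offsets(chunk_count: int, data_length: int, chunk_size: int):
    # A corrupt component can yield a negative or implausible chunk count;
    # report it with context instead of asserting later.
    expected = (data_length + chunk_size - 1) // chunk_size
    if chunk_count <= 0 or chunk_count != expected:
        raise CorruptSSTableError(
            f"chunk count {chunk_count} does not match expected {expected} "
            f"for uncompressed length {data_length}")
    return [i * chunk_size for i in range(chunk_count)]
```

The benefit is operational: the operator sees which sstable and which metadata field is bad, rather than an AssertionError with no context at startup.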
[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)
[ https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263083#comment-14263083 ] Ariel Weisberg commented on CASSANDRA-7438: --- I went to run the benchmark myself and I noticed you used a uniform distribution for the keys. I don't think that makes sense for testing a cache, where the primary benefit is going to come from cacheable access patterns. I would use extreme with .6 or .5 for the shape. I am also confused by the benchmark implementation. There are threads generating the tasks and then handing them off to other threads for execution. This means the benchmark is measuring unrelated things like the performance of the queue used for receiving tasks and returning results, as well as the general design of the harness. It makes me wonder if that is the source of the under-utilization issue. I think this might work well as a JMH benchmark, and the parameterization would make it easy to put together a full test matrix that anyone can run with one command. I tried to run it and it seems to go for longer than expected. I specified -d 300 and it is still going; the benchmark is doing work according to top. I ran on a c3.8xlarge using the RightScale 14.1 base server template running Ubuntu 14.04 and Oracle JDK 8u25; I got jemalloc from the libjemalloc1 package. Cloned OHC today and, after running mvn package, ran the benchmark using {{java -jar ohc-benchmark/target/ohc-benchmark-0.2-SNAPSHOT.jar -rkd 'gaussian(1..2000,2)' -wkd 'gaussian(1..2000,2)' -vs 'gaussian(1024..4096,2)' -r .9 -cap 160 -d 300 -t 30 -dr 8}}. 
Serializing Row cache alternative (Fully off heap) -- Key: CASSANDRA-7438 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438 Project: Cassandra Issue Type: Improvement Components: Core Environment: Linux Reporter: Vijay Assignee: Vijay Labels: performance Fix For: 3.0 Attachments: 0001-CASSANDRA-7438.patch, tests.zip Currently SerializingCache is partially off heap, keys are still stored in JVM heap as BB, * There is a higher GC costs for a reasonably big cache. * Some users have used the row cache efficiently in production for better results, but this requires careful tunning. * Overhead in Memory for the cache entries are relatively high. So the proposal for this ticket is to move the LRU cache logic completely off heap and use JNI to interact with cache. We might want to ensure that the new implementation match the existing API's (ICache), and the implementation needs to have safe memory access, low overhead in memory and less memcpy's (As much as possible). We might also want to make this cache configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
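The point about key distributions can be illustrated with a toy sketch (hypothetical code, not part of ohc-benchmark; the sampler below is a crude power law rather than the generator's "extreme" distribution): a cache only shows its benefit when a small, hot subset of keys receives most of the traffic, which a uniform distribution never produces.

```java
import java.util.Random;

// Under a uniform distribution, a cache holding 10% of the key space sees a
// ~10% hit rate no matter how it is implemented; under a skewed distribution
// the same capacity captures far more of the accesses.
public class KeySkew {
    // Power-law-ish sampler: u^shape concentrates mass near key 0 for shape > 1;
    // shape == 1 degenerates to uniform.
    static int skewedKey(Random rnd, int nKeys, double shape) {
        return (int) (Math.pow(rnd.nextDouble(), shape) * nKeys);
    }

    // Fraction of accesses that land in the hottest cacheFraction of the keys.
    static double hitFraction(long seed, int nKeys, double shape, double cacheFraction, int samples) {
        Random rnd = new Random(seed);
        int hot = (int) (nKeys * cacheFraction);
        int hits = 0;
        for (int i = 0; i < samples; i++)
            if (skewedKey(rnd, nKeys, shape) < hot)
                hits++;
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        // With shape=3, roughly 46% of accesses hit the top 10% of keys
        // (P(u^3 < 0.1) = 0.1^(1/3) ~= 0.464); with shape=1 only ~10% do.
        System.out.printf("skewed:  %.2f%n", hitFraction(42L, 2000, 3.0, 0.1, 100_000));
        System.out.printf("uniform: %.2f%n", hitFraction(42L, 2000, 1.0, 0.1, 100_000));
    }
}
```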
[jira] [Commented] (CASSANDRA-7016) can't map/reduce over subset of rows with cql
[ https://issues.apache.org/jira/browse/CASSANDRA-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263185#comment-14263185 ] Tyler Hobbs commented on CASSANDRA-7016: Overall the patch looks good. There's one bug, though: if the two ends of a token range are equal, you'll get an error like the following: {noformat} SELECT * FROM system.schema_columns WHERE token(keyspace_name) > token('abc') AND token(keyspace_name) < token('abc') AND keyspace_name IN ('system', 'system_traces'); {noformat} {noformat} ERROR 18:59:53 Unexpected exception during request java.lang.IllegalArgumentException: Invalid range: (-5434086359492102041‥-5434086359492102041) at com.google.common.collect.Range.init(Range.java:363) ~[guava-16.0.jar:na] at com.google.common.collect.Range.create(Range.java:153) ~[guava-16.0.jar:na] at com.google.common.collect.Range.range(Range.java:226) ~[guava-16.0.jar:na] at org.apache.cassandra.cql3.restrictions.TokenFilter.toRangSet(TokenFilter.java:197) ~[main/:na] at org.apache.cassandra.cql3.restrictions.TokenFilter.filter(TokenFilter.java:133) ~[main/:na] at org.apache.cassandra.cql3.restrictions.TokenFilter.values(TokenFilter.java:82) ~[main/:na] at org.apache.cassandra.cql3.restrictions.StatementRestrictions.getPartitionKeys(StatementRestrictions.java:361) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.getSliceCommands(SelectStatement.java:296) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.getPageableCommand(SelectStatement.java:205) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:165) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:72) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:239) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:261) ~[main/:na] at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:118) ~[main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [main/:na] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_40] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na] at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40] {noformat} Besides that, a few nits: * The params in the javadoc for {{TokenFilter.toRangeSet()}} are incorrect * Whitespace is off in {{TokenFilter.filter()}} * In {{TokenFilter.isOnToken()}}, I think the comment and logic could be a little clearer. It seems like the check could use {{==}} instead of {{!=}}, correct? Perhaps the comment should say something like: if all partition key columns have non-token restrictions, we can simply use the token range to filter the values produced by those restrictions and then ignore the token restriction. 
can't map/reduce over subset of rows with cql - Key: CASSANDRA-7016 URL: https://issues.apache.org/jira/browse/CASSANDRA-7016 Project: Cassandra Issue Type: Bug Components: Core, Hadoop Reporter: Jonathan Halliday Assignee: Benjamin Lerer Priority: Minor Labels: cql, docs Fix For: 3.0 Attachments: CASSANDRA-7016-V2.txt, CASSANDRA-7016-V3.txt, CASSANDRA-7016-V4-trunk.txt, CASSANDRA-7016.txt select ... where token(k) > x and token(k) <= y and k in (a,b) allow filtering; This fails on 2.0.6: can't restrict k by more than one relation. In the context of map/reduce (hence the token range) I want to map over only a subset of the keys (hence the 'in'). Pushing the 'in' filter down to cql is substantially cheaper than pulling all rows to the client and then discarding most of them. Currently this
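The crash in the review comment above comes from a Guava precondition: an open-open range must have distinct endpoints. A minimal sketch of the failure mode (hypothetical helper, not the actual fix in TokenFilter; written without the Guava dependency):

```java
// Guava's Range.range(t, OPEN, t, OPEN) throws IllegalArgumentException
// ("Invalid range: (t..t)") -- exactly the exception in the report. Code that
// builds ranges from user-supplied token bounds has to treat the equal-bounds,
// both-ends-exclusive case as an empty result instead of constructing it.
public class TokenRangeGuard {
    // An open-open interval (start..end) over the long token space is
    // constructible only when start < end; otherwise it is empty and the
    // range library must not be asked to build it.
    static boolean isConstructibleOpenOpen(long start, long end) {
        return start < end;
    }

    public static void main(String[] args) {
        long t = -5434086359492102041L; // token('abc') from the stack trace
        System.out.println(isConstructibleOpenOpen(t, t));     // false: return an empty set instead
        System.out.println(isConstructibleOpenOpen(t, t + 1)); // true
    }
}
```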
[jira] [Commented] (CASSANDRA-8522) Getting partial set of columns in a 'select *' query
[ https://issues.apache.org/jira/browse/CASSANDRA-8522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263163#comment-14263163 ] Fabiano C. Botelho commented on CASSANDRA-8522: --- any update on this? Thanks, Fabiano. Getting partial set of columns in a 'select *' query Key: CASSANDRA-8522 URL: https://issues.apache.org/jira/browse/CASSANDRA-8522 Project: Cassandra Issue Type: Bug Components: Core Reporter: Fabiano C. Botelho Assignee: Aleksey Yeschenko Fix For: 2.0.12 Attachments: systemlogs.zip Configuration: 3 node cluster, where two nodes are fine and just one sees the issue reported here. It is an in-memory state on the server that gets cleared with a cassandra restart on the problematic node. Problem: Scenario (this is a run-through on the problematic node at least 6 hours after the problem had surfaced): 1. After the schema had been installed, one can do a 'describe table events' and that shows all the columns in the table, see below:
{code}
Use HELP for help.
cqlsh:sd> DESCRIBE TABLE events

CREATE TABLE events (
  dayhour text,
  id text,
  event_info text,
  event_series_id text,
  event_type text,
  internal_timestamp bigint,
  is_read boolean,
  is_user_visible boolean,
  link text,
  node_id text,
  time timestamp,
  PRIMARY KEY ((dayhour), id)
) WITH
  bloom_filter_fp_chance=0.10 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

CREATE INDEX events_id_idx ON events (id);

CREATE INDEX events_event_series_id_idx ON events (event_series_id);
{code}
2. 
run a query selecting all columns on the same table above:
{code}
cqlsh:sd> select * from events limit 10;

 dayhour       | id                                   | event_series_id                      | is_user_visible
---------------+--------------------------------------+--------------------------------------+-----------------
 2014-12-19:12 | 3a70e8f8-0b04-4485-bf8f-c3d4031687ed | 7c129287-2b3d-4342-8f2b-f1eba61267f6 |           False
 2014-12-19:12 | 49a854fb-0e6c-43e9-830e-6f833689df0b | 1a130faf-d755-4e52-9f93-82a380d86f31 |           False
 2014-12-19:12 | 6df0b844-d810-423e-8e43-5b3d44213699 | 7c129287-2b3d-4342-8f2b-f1eba61267f6 |           False
 2014-12-19:12 | 92d55ff9-724a-4bc4-a57f-dfeee09e46a4 | 1a130faf-d755-4e52-9f93-82a380d86f31 |           False
 2014-12-19:17 | 2e0ea98c-4d5a-4ad2-b386-bc181e2e7cec | a9cf80e9-b8de-4154-9a37-13ed95459a91 |           False
 2014-12-19:17 | 8837dc3f-abae-45e6-80cb-c3dffd3f08aa | cb0e4867-0f27-47e3-acde-26b105e0fdc9 |           False
 2014-12-19:17 | b36baa5b-b084-4596-a8a5-d85671952313 | cb0e4867-0f27-47e3-acde-26b105e0fdc9 |           False
 2014-12-19:17 | f73f9438-cba7-4961-880e-77e134175390 | a9cf80e9-b8de-4154-9a37-13ed95459a91 |           False
 2014-12-19:16 | 47b47745-c4f6-496b-a976-381a545f7326 | 4bc7979f-2c68-4d65-91a1-e1999a3bbc7a |           False
 2014-12-19:16 | 5708098f-0c0a-4372-be03-ea7057a3bd44 | 10ac9312-9487-4de9-b706-0d0af18bf9fd |           False
{code}
Note that not all columns show up in the result. 3. Try a query that refers to at least one of the missing columns in the result above, but of course one that is in the schema:
{code}
cqlsh:sd> select dayhour, id, event_info from events ... ;
Bad Request: Undefined name event_info in selection clause
{code}
Note that it failed saying that 'event_info' was not defined. This problem goes away with a restart of cassandra on the problematic node. This does not seem to be the java-320 bug, for which the fix is supposed to be in driver 2.0.2. We are using driver version 2.0.1. Note that this issue surfaces both with the driver as well as with cqlsh, which points to a problem in the cassandra server. Would appreciate some help with a fix or a quick workaround that is not simply restarting the server. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: cqlsh: handle schema mismatch on startup
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 4e1e92b31 - aeb7d3f2e cqlsh: handle schema mismatch on startup Patch by Tyler Hobbs; reviewed by Aleksey Yeschenko for CASSANDRA-8512 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/aeb7d3f2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/aeb7d3f2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/aeb7d3f2 Branch: refs/heads/cassandra-2.1 Commit: aeb7d3f2eefcd5aa452012c048341deb814cf0b0 Parents: 4e1e92b Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 12:14:24 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 12:14:24 2015 -0600 -- CHANGES.txt | 1 + bin/cqlsh | 25 + 2 files changed, 22 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/aeb7d3f2/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index c24..1c1bfe2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.3 + * (cqlsh) Handle a schema mismatch being detected on startup (CASSANDRA-8512) * Properly calculate expected write size during compaction (CASSANDRA-8532) * Invalidate affected prepared statements when a table's columns are altered (CASSANDRA-7910) http://git-wip-us.apache.org/repos/asf/cassandra/blob/aeb7d3f2/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index a714f48..363a4f6 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -562,15 +562,32 @@ class Shell(cmd.Cmd): self.session = self.conn.connect(keyspace) else: self.session = self.conn.connect() + +self.color = color +self.display_time_format = display_time_format +self.display_float_precision = display_float_precision + +# Workaround for CASSANDRA-8521 until PYTHON-205 is resolved. +# If there is no schema metadata present (due to a schema mismatch), +# get rid of the code that checks for a schema mismatch and force +# the schema metadata to be built. 
+        if not self.conn.metadata.keyspaces:
+            self.printerr("Warning: schema version mismatch detected; check the schema versions of your "
+                          "nodes in system.local and system.peers.")
+            original_method = self.conn.control_connection._get_schema_mismatches
+            try:
+                self.conn.control_connection._get_schema_mismatches = lambda *args, **kwargs: None
+                future = self.conn.submit_schema_refresh()
+                future.result(timeout=10)
+            finally:
+                self.conn.control_connection._get_schema_mismatches = original_method
+
         self.session.default_timeout = client_timeout
         self.session.row_factory = ordered_dict_factory
         self.get_connection_versions()
         self.current_keyspace = keyspace
-        self.color = color
-        self.display_time_format = display_time_format
-        self.display_float_precision = display_float_precision
         self.max_trace_wait = max_trace_wait
         self.session.max_trace_wait = max_trace_wait
         if encoding is None:
@@ -980,7 +997,7 @@ class Shell(cmd.Cmd):
             rows = self.session.execute(statement, trace=self.tracing_enabled)
             break
         except CQL_ERRORS, err:
-            self.printerr(str(err))
+            self.printerr(str(err.__class__.__name__) + ": " + str(err))
             return False
         except Exception, err:
             import traceback
[jira] [Commented] (CASSANDRA-8457) nio MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263096#comment-14263096 ] Ariel Weisberg commented on CASSANDRA-8457: --- I can't get performance counters for cache behavior on EC2 as far as I can tell, and I don't have a good answer for why I get the performance numbers I am seeing. I ran the measurements with CL.QUORUM, ONE, and ALL against trunk and my branch, with and without rpc_max_threads increased to 1024. This was prompted by measurements on a 15 node cluster where CL.ONE was 10x faster than CL.ALL. I measured the full matrix on a 9 node cluster and CL.ONE was 5x faster than CL.ALL, which with RF=5 is the expected performance delta. I definitely see under-utilization. With CL.ONE the cores run right at 1600%, and with CL.ALL they don't make it up that high, although trunk does better in that respect. The under-utilization is worse with the modified code that uses SEPExecutor. I may have to run with 15 nodes again to see if the jump from 9 to 15 is what causes CL.ALL to perform worse, or if the difference is that I was using a placement group and 14.04 in the 9 node cluster. The change to use SEPExecutor for writes was slightly slower to a lot slower in the QUORUM and ALL cases at 9 nodes. I think that is a dead end, but I do wonder if that is because SEPExecutor might not have the same cache-friendly behavior that running dedicated threads does. Dedicated threads require signaling and context switching, but thread scheduling policies could result in the threads servicing each socket always running in the same spot. I am going to try again with netty. I should at least be able to match the performance of trunk with a non-blocking approach, so I think it is still worth digging. 
nio MessagingService Key: CASSANDRA-8457 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Assignee: Ariel Weisberg Labels: performance Fix For: 3.0 Thread-per-peer (actually two per peer: one incoming and one outbound) is a big contributor to context switching, especially for larger clusters. Let's look at switching to nio, possibly via Netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-8555) Immediate sequential update of column sometimes not immediately applied (OS X only?)
Andy Tolbert created CASSANDRA-8555: --- Summary: Immediate sequential update of column sometimes not immediately applied (OS X only?) Key: CASSANDRA-8555 URL: https://issues.apache.org/jira/browse/CASSANDRA-8555 Project: Cassandra Issue Type: Bug Components: Core Environment: OS X, Oracle JDK 1.7.0_71-b14, cassandra-2.0 HEAD, 1.2.19, 2.0.11, 2.1.2. 1 node cluster. Reporter: Andy Tolbert Priority: Minor There was [a question on stack overflow|http://stackoverflow.com/questions/27707081/cassandra-writes-after-setting-a-column-to-null-are-lost-randomly-is-this-a-bu] from a user where they had a problem when doing the following:
{code:java}
@Test
public void testWriteUpdateRead() throws Exception {
    Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")
        .build();
    Session cs = cluster.connect();
    cs.execute("DROP KEYSPACE if exists readtest;");
    cs.execute("CREATE KEYSPACE readtest WITH replication"
        + " = {'class':'SimpleStrategy', 'replication_factor':1};");
    cs.execute("create table readtest.sessions("
        + "id text primary key,"
        + "passwordHash text,"
        + ");");
    for (int i = 0; i < 1000; i++) {
        String sessionID = UUID.randomUUID().toString();
        cs.execute("insert into readtest.sessions (id, passwordHash) values('" + sessionID + "', null)");
        cs.execute("update readtest.sessions set passwordHash='" + sessionID + "' where id = '" + sessionID + "'");
        ResultSet rs = cs.execute("select * from readtest.sessions where id = '" + sessionID + "'");
        Row row = rs.one();
        assertThat("failed ith time=" + i, row.getString("passwordHash"), equalTo(sessionID));
    }
    cs.close();
    cluster.close();
}
{code}
Running this test, there are times where the 'passwordHash' column was null, making it seem like the update statement was never applied. I can only reproduce this on OS X for some reason. I suspect this may be a duplicate or was resolved coincidentally by a recent change, since it appears to be resolved in the cassandra-2.1 and trunk branches, but I can reproduce the issue against cassandra-2.1.2. 
The problem appears to still exist in cassandra-2.0 HEAD. I went through CHANGES.txt for 2.1.3 and no fix stuck out so I figured I'd create an issue just in case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8554) Node where gossip is disabled still shows as UP on that node; other nodes show it as DN
[ https://issues.apache.org/jira/browse/CASSANDRA-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Curtis updated CASSANDRA-8554: --- Summary: Node where gossip is disabled still shows as UP on that node; other nodes show it as DN (was: Nodetool drain shows all nodes as UP on the drained node) Node where gossip is disabled still shows as UP on that node; other nodes show it as DN Key: CASSANDRA-8554 URL: https://issues.apache.org/jira/browse/CASSANDRA-8554 Project: Cassandra Issue Type: Bug Environment: Centos 6.5, DSE4.5.1 tarball install Reporter: Mark Curtis Priority: Minor When running nodetool drain, the drained node will still show the status of itself as UP in nodetool status even after the drain has finished. For example using a 3 node cluster on one of the nodes that is still operating and not drained we see this:
{code}
$ ./dse-4.5.1/bin/nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Central
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  192.168.56.21  210.78 KB  256     32.1%  82eb2fca-4f57-467b-a972-93096ec5d69f  RAC1
DN  192.168.56.23  2.22 GB    256     33.5%  a11bfac1-fad0-440b-bd68-7562a89ce3c7  RAC1
UN  192.168.56.22  2.22 GB    256     34.4%  4250cb05-97be-4bac-887a-acc307d1bc0c  RAC1
{code}
While on the drained node we see this:
{code}
[datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool drain
[datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Central
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  192.168.56.21  210.78 KB  256     32.1%  82eb2fca-4f57-467b-a972-93096ec5d69f  RAC1
UN  192.168.56.23  2.22 GB    256     33.5%  a11bfac1-fad0-440b-bd68-7562a89ce3c7  RAC1
UN  192.168.56.22  2.22 GB    256     34.4%  4250cb05-97be-4bac-887a-acc307d1bc0c  RAC1
{code}
Netstat shows outgoing connections 
from the drained node to other nodes as still established on port 7000 but the node is no longer listening on port 7000 which I believe is expected. However the output of nodetool status on the drained node could be interpreted as misleading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7910) wildcard prepared statements are incorrect after a column is added to the table
[ https://issues.apache.org/jira/browse/CASSANDRA-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7910: --- Labels: client-impacting (was: ) wildcard prepared statements are incorrect after a column is added to the table --- Key: CASSANDRA-7910 URL: https://issues.apache.org/jira/browse/CASSANDRA-7910 Project: Cassandra Issue Type: Bug Components: Core Reporter: Oded Peer Assignee: Tyler Hobbs Priority: Minor Labels: client-impacting Fix For: 2.1.3 Attachments: 7910-2.1.txt, 7910-trunk.txt, PreparedStatementAfterAddColumnTest.java 1. Prepare a statement with a wildcard in the select clause. 2. Alter the table - add a column 3. execute the prepared statement Expected result - get all the columns including the new column Actual result - get the columns except the new column Attached a test using cassandra-unit -- This message was sent by Atlassian JIRA (v6.3.4#6332)
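The mechanism behind the bug can be sketched as follows (hypothetical code, not Cassandra internals): if {{*}} is expanded into a concrete column list at prepare time, that cached list goes stale once the table gains a column. Resolving the wildcard at execution time, or invalidating the cached statement on ALTER (the approach the related CASSANDRA-7910/8532-era fixes take), avoids this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of a wildcard prepared statement going stale after ALTER TABLE.
public class WildcardStaleness {
    // Simulates preparing "SELECT *": snapshots the column list eagerly.
    static List<String> prepareWildcard(List<String> tableColumns) {
        return new ArrayList<>(tableColumns);
    }

    public static void main(String[] args) {
        List<String> columns = new ArrayList<>(Arrays.asList("id", "name"));
        List<String> prepared = prepareWildcard(columns); // '*' expanded at prepare time
        columns.add("email");                             // ALTER TABLE ... ADD email
        System.out.println(prepared); // [id, name] -- stale: the new column is missing
        System.out.println(columns);  // [id, name, email] -- what '*' should now mean
    }
}
```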
[jira] [Updated] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brent Haines updated CASSANDRA-8552: Reproduced In: 2.1.2, 2.1.1 (was: 2.1.1, 2.1.2) Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Java build 1.7.0_55-b13 and build 1.8.0_25-b17 was: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Attachment: system.log I added the full system.log and clarified the versions of Java that we have run this over. The system.log will reflect many restarts, including a hard reboot done last night when the system became unresponsive. Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Java build 1.7.0_55-b13 and build 1.8.0_25-b17 Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 Attachments: system.log We have a large table storing, effectively, event logs, and a pair of denormalized tables for indexing. When updating from 2.0 to 2.1 we saw performance improvements, but also some random and silent crashes during nightly repairs. We lost a node (totally corrupted) and replaced it. That node has never stabilized -- it simply can't finish the compactions. Smaller compactions finish. Larger compactions, like these two, never finish:
{code}
pending tasks: 48
   compaction type   keyspace   table             completed     total         unit    progress
        Compaction   data       stories           16532973358   75977993784   bytes   21.76%
        Compaction   data       stories_by_text   10593780658   38555048812   bytes   27.48%
Active compaction remaining time : 0h10m51s
{code}
We are not getting exceptions and are not running out of heap space. The Ubuntu OOM killer is reaping the process after all of the memory is consumed. 
We watch memory in the opscenter console and it will grow. If we turn off the OOM killer for the process, it will run until everything else is killed instead and then the kernel panics. We have the following settings configured: 2G Heap 512M New {code} memtable_heap_space_in_mb: 1024 memtable_offheap_space_in_mb: 1024 memtable_allocation_type: heap_buffers commitlog_total_space_in_mb: 2048 concurrent_compactors: 1 compaction_throughput_mb_per_sec: 128 {code} The compaction strategy is leveled (these are read-intensive tables that are rarely updated) I have tried every setting, every option and I have the system where the MTBF is about an hour now, but we never finish compacting because there are some large compactions pending. None of the GC tools or settings help because it is not a GC problem. It is an off-heap memory problem. We are getting these messages in our syslog {code} Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in process java pte:0320 pmd:2d6fa5067 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 vm_flags:0870 anon_vma: (null) mapping: (null) index:7fb820be3 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559] 880028510e40 88020d43da98 81715ac4 7fb820be3000 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565] 88020d43dae0 81174183 0320 0007fb820be3 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568] 8802d6fa5f18 0320 7fb820be3000 7fb820be4000 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace: Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584] [81715ac4] dump_stack+0x45/0x56 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591] [81174183] print_bad_pte+0x1a3/0x250 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594] [81175439] vm_normal_page+0x69/0x80 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598] [8117580b] unmap_page_range+0x3bb/0x7f0 Jan 2 
07:06:00 ip-10-0-2-226 kernel: [49801151.219602] [81175cc1] unmap_single_vma+0x81/0xf0 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605] [81176d39] unmap_vmas+0x49/0x90 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610] [8117feec] exit_mmap+0x9c/0x170 Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617] [8110fcf3] ? __delayacct_add_tsk+0x153/0x170 Jan 2 07:06:00
[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14262922#comment-14262922 ] Benedict commented on CASSANDRA-8552: - Are you certain that system.log is accurate? There are multiple messages from system startup within a few microseconds of each other, which seems very suspicious, and compaction shouldn't be possible at that stage. The startup messages are also incomplete. Are you running multiple Cassandra daemons in one Java process somehow? Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 We have a large table of storing, effectively event logs and a pair of denormalized tables for indexing. When updating from 2.0 to 2.1 we saw performance improvements, but some random and silent crashes during nightly repairs. We lost a node (totally corrupted) and replaced it. That node has never stabilized -- it simply can't finish the compactions. Smaller compactions finish. Larger compactions, like these two never finish - {code} pending tasks: 48 compaction type keyspace table completed total unit progress Compaction data stories 16532973358 75977993784 bytes 21.76% Compaction data stories_by_text 10593780658 38555048812 bytes 27.48% Active compaction remaining time : 0h10m51s {code} We are not getting exceptions and are not running out of heap space. The Ubuntu OOM killer is reaping the process after all of the memory is consumed. We watch memory in the opscenter console and it will grow. If we turn off the OOM killer for the process, it will run until everything else is killed instead and then the kernel panics. 
[jira] [Updated] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brent Haines updated CASSANDRA-8552: Attachment: Screen Shot 2015-01-02 at 9.36.11 PM.png This is how the system looks after about 15 minutes of compacting Large compactions run out of off-heap RAM - Key: CASSANDRA-8552 URL: https://issues.apache.org/jira/browse/CASSANDRA-8552 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 14.4 AWS EC2 12 m1.xlarge nodes [4 cores, 16GB RAM, 1TB storage (251GB Used)] Java build 1.7.0_55-b13 and build 1.8.0_25-b17 Reporter: Brent Haines Assignee: Marcus Eriksson Priority: Blocker Fix For: 2.1.3 Attachments: Screen Shot 2015-01-02 at 9.36.11 PM.png, system.log We have a large table of storing, effectively event logs and a pair of denormalized tables for indexing. When updating from 2.0 to 2.1 we saw performance improvements, but some random and silent crashes during nightly repairs. We lost a node (totally corrupted) and replaced it. That node has never stabilized -- it simply can't finish the compactions. Smaller compactions finish. Larger compactions, like these two never finish - {code} pending tasks: 48 compaction type keyspace table completed total unit progress Compaction data stories 16532973358 75977993784 bytes 21.76% Compaction data stories_by_text 10593780658 38555048812 bytes 27.48% Active compaction remaining time : 0h10m51s {code} We are not getting exceptions and are not running out of heap space. The Ubuntu OOM killer is reaping the process after all of the memory is consumed. We watch memory in the opscenter console and it will grow. If we turn off the OOM killer for the process, it will run until everything else is killed instead and then the kernel panics. 
We have the following settings configured: 2G heap, 512M new generation.
{code}
memtable_heap_space_in_mb: 1024
memtable_offheap_space_in_mb: 1024
memtable_allocation_type: heap_buffers
commitlog_total_space_in_mb: 2048
concurrent_compactors: 1
compaction_throughput_mb_per_sec: 128
{code}
The compaction strategy is leveled (these are read-intensive tables that are rarely updated). I have tried every setting and every option, and the system now has an MTBF of about an hour, but we never finish compacting because there are some large compactions pending. None of the GC tools or settings help, because it is not a GC problem; it is an off-heap memory problem. We are getting these messages in our syslog:
{code}
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219527] BUG: Bad page map in process java pte:0320 pmd:2d6fa5067
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219545] addr:7fb820be3000 vm_flags:0870 anon_vma: (null) mapping: (null) index:7fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219556] CPU: 3 PID: 27344 Comm: java Tainted: GB3.13.0-24-generic #47-Ubuntu
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219559] 880028510e40 88020d43da98 81715ac4 7fb820be3000
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219565] 88020d43dae0 81174183 0320 0007fb820be3
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219568] 8802d6fa5f18 0320 7fb820be3000 7fb820be4000
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219572] Call Trace:
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219584] [81715ac4] dump_stack+0x45/0x56
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219591] [81174183] print_bad_pte+0x1a3/0x250
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219594] [81175439] vm_normal_page+0x69/0x80
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219598] [8117580b] unmap_page_range+0x3bb/0x7f0
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219602] [81175cc1] unmap_single_vma+0x81/0xf0
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219605] [81176d39] unmap_vmas+0x49/0x90
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219610] [8117feec] exit_mmap+0x9c/0x170
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219617] [8110fcf3] ? __delayacct_add_tsk+0x153/0x170
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219621] [8106482c] mmput+0x5c/0x120
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219625] [81069bbc] do_exit+0x26c/0xa50
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219631] [810d7591] ? __unqueue_futex+0x31/0x60
Jan 2 07:06:00 ip-10-0-2-226 kernel: [49801151.219634] [810d83b6] ?
{code}
[jira] [Commented] (CASSANDRA-8399) Reference Counter exception when dropping user type
[ https://issues.apache.org/jira/browse/CASSANDRA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263265#comment-14263265 ] Benedict commented on CASSANDRA-8399: - Sure. I'll take a look soon. Reference Counter exception when dropping user type --- Key: CASSANDRA-8399 URL: https://issues.apache.org/jira/browse/CASSANDRA-8399 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Joshua McKenzie Fix For: 2.1.3 Attachments: 8399_fix_empty_results.txt, 8399_v2.txt, node2.log, ubuntu-8399.log When running the dtest {{user_types_test.py:TestUserTypes.test_type_keyspace_permission_isolation}} with the current 2.1-HEAD code, very frequently, but not always, when dropping a type, the following exception is seen:
{code}
ERROR [MigrationStage:1] 2014-12-01 13:54:54,824 CassandraDaemon.java:170 - Exception in thread Thread[MigrationStage:1,5,main]
java.lang.AssertionError: Reference counter -1 for /var/folders/v3/z4wf_34n1q506_xjdy49gb78gn/T/dtest-eW2RXj/test/node2/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-14-Data.db
	at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1662) ~[main/:na]
	at org.apache.cassandra.io.sstable.SSTableScanner.close(SSTableScanner.java:164) ~[main/:na]
	at org.apache.cassandra.utils.MergeIterator.close(MergeIterator.java:62) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore$8.close(ColumnFamilyStore.java:1943) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:2116) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2029) ~[main/:na]
	at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1963) ~[main/:na]
	at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:744) ~[main/:na]
	at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:731) ~[main/:na]
	at org.apache.cassandra.config.Schema.updateVersion(Schema.java:374) ~[main/:na]
	at org.apache.cassandra.config.Schema.updateVersionAndAnnounce(Schema.java:399) ~[main/:na]
	at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:167) ~[main/:na]
	at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:49) ~[main/:na]
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_67]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_67]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_67]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_67]
	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
{code}
Log of the node with the error is attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-8399) Reference Counter exception when dropping user type
[ https://issues.apache.org/jira/browse/CASSANDRA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict reassigned CASSANDRA-8399: --- Assignee: Benedict (was: Joshua McKenzie) Reference Counter exception when dropping user type --- Key: CASSANDRA-8399 URL: https://issues.apache.org/jira/browse/CASSANDRA-8399 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Benedict Fix For: 2.1.3 Attachments: 8399_fix_empty_results.txt, 8399_v2.txt, node2.log, ubuntu-8399.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-8550) Internal pagination in CQL3 index queries creating substantial overhead
[ https://issues.apache.org/jira/browse/CASSANDRA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs reassigned CASSANDRA-8550: -- Assignee: Tyler Hobbs Internal pagination in CQL3 index queries creating substantial overhead --- Key: CASSANDRA-8550 URL: https://issues.apache.org/jira/browse/CASSANDRA-8550 Project: Cassandra Issue Type: Bug Components: Core Reporter: Samuel Klock Assignee: Tyler Hobbs Fix For: 2.1.3 While benchmarking CQL3 secondary indexes in 2.1.2, we've noticed substantial performance degradation as the volume of indexed data increases. In trying to figure out what's going on, we found that a major factor contributing to this degradation appears to be logic in {{o.a.c.db.index.composites.CompositesSearcher}} used to paginate scans of index tables. In particular, in the use cases we've explored, this short algorithm used to select a page size appears to be the culprit: {code:java} private int meanColumns = Math.max(index.getIndexCfs().getMeanColumns(), 1); // We shouldn't fetch only 1 row as this provides buggy paging in case the first row doesn't satisfy all clauses private int rowsPerQuery = Math.max(Math.min(filter.maxRows(), filter.maxColumns() / meanColumns), 2); {code} In indexes where the cardinality doesn't scale linearly with the volume of data indexed, it seems likely that the value of {{meanColumns}} will steadily rise in write-heavy workloads. In the cases we've explored, {{filter.maxColumns()}} returns a small enough number (related to the lesser of the native-protocol page size or the user-specified limit for the query) that, after {{meanColumns}} reaches a few thousand, {{rowsPerQuery}} (the page size) is consistently set to 2. The resulting overhead is severe. In our environment, if we fix {{rowsPerQuery}} to some reasonably large constant (e.g., 5,000), queries that with the existing logic would require over two minutes to complete can run in under ten seconds. 
Using a constant clearly seems like the wrong answer, but the overhead the existing algorithm seems to introduce suggests that it isn't the right answer either. An intuitive solution might be to use the minimum of {{filter.maxRows()}} and {{filter.maxColumns()}} (or 2 if both of those are 1), but it's not immediately clear whether there are safety considerations that the current algorithm accounts for and this strategy would not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
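For concreteness, the degradation described in this issue and the intuitive alternative can be sketched as a pair of pure functions. This is a hedged illustration, not the actual Cassandra patch; the helper names (`currentHeuristic`, `proposedHeuristic`) are hypothetical:

```java
// Sketch of the page-size selection discussed above. Hypothetical helper
// names; only the arithmetic mirrors the snippet quoted from
// CompositesSearcher.
public final class IndexPageSize {
    // 2.1.2 behavior: once meanColumns exceeds maxColumns/2, the page size
    // collapses to the floor of 2 rows per internal index query.
    static int currentHeuristic(int maxRows, int maxColumns, int meanColumns) {
        int mean = Math.max(meanColumns, 1);
        return Math.max(Math.min(maxRows, maxColumns / mean), 2);
    }

    // The intuitive alternative from the report: min(maxRows, maxColumns),
    // still with a floor of 2 so paging works when the first row fails a clause.
    static int proposedHeuristic(int maxRows, int maxColumns) {
        return Math.max(Math.min(maxRows, maxColumns), 2);
    }

    public static void main(String[] args) {
        // With a 5000-column native-protocol page and meanColumns grown to
        // 4000, the current heuristic scans the index 2 rows at a time.
        System.out.println(currentHeuristic(5000, 5000, 4000)); // prints 2
        System.out.println(proposedHeuristic(5000, 5000));      // prints 5000
    }
}
```

The sketch makes the failure mode visible: the page size depends inversely on `meanColumns`, so it shrinks as the index grows, while the alternative depends only on the query's own limits.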
[jira] [Commented] (CASSANDRA-8554) Node where gossip is disabled still shows as UP on that node; other nodes show it as DN
[ https://issues.apache.org/jira/browse/CASSANDRA-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263305#comment-14263305 ] Thanh commented on CASSANDRA-8554: -- I asked Mark to change the title of this jira from "Nodetool drain shows all nodes as UP on the drained node" to "Node where gossip is disabled still shows as UP on that node; other nodes show it as DN", because the behavior he describes above is not specific to DRAIN. You'll see the same thing if you do nodetool disablegossip on nodeX: nodetool status run from nodeX (after a nodetool disablegossip is done on nodeX) will show nodeX and all other nodes as UP (assuming that all the other nodes are indeed up), while nodetool status run from any other cluster node will show nodeX as DN. I didn't think this was a bug, but comments in COSS seem to indicate that it is, which led to the creation of this jira. Node where gossip is disabled still shows as UP on that node; other nodes show it as DN Key: CASSANDRA-8554 URL: https://issues.apache.org/jira/browse/CASSANDRA-8554 Project: Cassandra Issue Type: Bug Environment: Centos 6.5, DSE4.5.1 tarball install Reporter: Mark Curtis Priority: Minor When running nodetool drain, the drained node will still show its own status as UP in nodetool status, even after the drain has finished.
For example, using a 3 node cluster, on one of the nodes that is still operating and not drained we see this:
{code}
$ ./dse-4.5.1/bin/nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Central
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  192.168.56.21  210.78 KB  256     32.1%  82eb2fca-4f57-467b-a972-93096ec5d69f  RAC1
DN  192.168.56.23  2.22 GB    256     33.5%  a11bfac1-fad0-440b-bd68-7562a89ce3c7  RAC1
UN  192.168.56.22  2.22 GB    256     34.4%  4250cb05-97be-4bac-887a-acc307d1bc0c  RAC1
{code}
While on the drained node we see this:
{code}
[datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool drain
[datastax@DSE4 ~]$ ./dse-4.5.1/bin/nodetool status
Note: Ownership information does not include topology; for complete information, specify a keyspace
Datacenter: Central
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns   Host ID                               Rack
UN  192.168.56.21  210.78 KB  256     32.1%  82eb2fca-4f57-467b-a972-93096ec5d69f  RAC1
UN  192.168.56.23  2.22 GB    256     33.5%  a11bfac1-fad0-440b-bd68-7562a89ce3c7  RAC1
UN  192.168.56.22  2.22 GB    256     34.4%  4250cb05-97be-4bac-887a-acc307d1bc0c  RAC1
{code}
Netstat shows outgoing connections from the drained node to other nodes as still established on port 7000, but the node is no longer listening on port 7000, which I believe is expected. However, the output of nodetool status on the drained node could be interpreted as misleading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8555) Immediate sequential update of column sometimes not immediately applied (OS X only?)
[ https://issues.apache.org/jira/browse/CASSANDRA-8555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Tolbert updated CASSANDRA-8555: Environment: OS X 10.10.1, Oracle JDK 1.7.0_71-b14, cassandra-2.0 HEAD, 1.2.19, 2.0.11, 2.1.2. 1 node cluster. (was: OS X, Oracle JDK 1.7.0_71-b14, cassandra-2.0 HEAD, 1.2.19, 2.0.11, 2.1.2. 1 node cluster.) Immediate sequential update of column sometimes not immediately applied (OS X only?) Key: CASSANDRA-8555 URL: https://issues.apache.org/jira/browse/CASSANDRA-8555 Project: Cassandra Issue Type: Bug Components: Core Environment: OS X 10.10.1, Oracle JDK 1.7.0_71-b14, cassandra-2.0 HEAD, 1.2.19, 2.0.11, 2.1.2. 1 node cluster. Reporter: Andy Tolbert Priority: Minor There was [a question on stack overflow|http://stackoverflow.com/questions/27707081/cassandra-writes-after-setting-a-column-to-null-are-lost-randomly-is-this-a-bu] from a user where they had a problem when doing the following:
{code:java}
@Test
public void testWriteUpdateRead() throws Exception {
    Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1")
            .build();
    Session cs = cluster.connect();
    cs.execute("DROP KEYSPACE if exists readtest;");
    cs.execute("CREATE KEYSPACE readtest WITH replication "
            + "= {'class':'SimpleStrategy', 'replication_factor':1};");
    cs.execute("create table readtest.sessions("
            + "id text primary key,"
            + "passwordHash text,"
            + ");");
    for (int i = 0; i < 1000; i++) {
        String sessionID = UUID.randomUUID().toString();
        cs.execute("insert into readtest.sessions (id, passwordHash) values('" + sessionID + "', null)");
        cs.execute("update readtest.sessions set passwordHash='" + sessionID + "' where id = '" + sessionID + "'");
        ResultSet rs = cs.execute("select * from readtest.sessions where id = '" + sessionID + "'");
        Row row = rs.one();
        assertThat("failed ith time=" + i, row.getString("passwordHash"), equalTo(sessionID));
    }
    cs.close();
    cluster.close();
}
{code}
Running this test, there are times where the 'passwordHash' column was null, making it seem like the
update statement was never applied. I can only reproduce this on OS X for some reason. I suspect this may be a duplicate or was resolved coincidentally by a recent change, since it appears to be resolved in the cassandra-2.1 and trunk branches, but I can reproduce the issue against cassandra-2.1.2. The problem appears to still exist in cassandra-2.0 HEAD. I went through CHANGES.txt for 2.1.3 and no fix stuck out so I figured I'd create an issue just in case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8554) Node where gossip is disabled still shows as UP on that node; other nodes show it as DN
[ https://issues.apache.org/jira/browse/CASSANDRA-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263309#comment-14263309 ] Thanh commented on CASSANDRA-8554: -- I meant to add in my initial comment: nodetool drain disables both Thrift and gossip. Since we see the same behavior with nodetool disablegossip, the common denominator is that gossip is disabled (not listening on port 7000 anymore). Node where gossip is disabled still shows as UP on that node; other nodes show it as DN Key: CASSANDRA-8554 URL: https://issues.apache.org/jira/browse/CASSANDRA-8554 Project: Cassandra Issue Type: Bug Environment: Centos 6.5, DSE4.5.1 tarball install Reporter: Mark Curtis Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8494) incremental bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-8494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263304#comment-14263304 ] T Jake Luciani commented on CASSANDRA-8494: --- Rather than add rich state management to bootstrap, why don't we consider joining nodes part of the ring right away and proxy non-streamed ranges to a known replica until all the data is streamed? If the node dies, nothing bad happens. We already send extra writes to joining nodes, so we would only need to add the ability for a joining node to track what data has been streamed so far. incremental bootstrap - Key: CASSANDRA-8494 URL: https://issues.apache.org/jira/browse/CASSANDRA-8494 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jon Haddad Assignee: Yuki Morishita Priority: Minor Labels: density Fix For: 3.0 Current bootstrapping involves (to my knowledge) picking tokens and streaming data before the node is available for requests. This can be problematic with fat nodes, since it may require 20TB of data to be streamed over before the machine can be useful, resulting in a massive window of time before the machine can do anything. As a potential approach to mitigate this, I suggest modifying the bootstrap process to only acquire a single initial token before being marked UP. This would likely be a configuration parameter, {{incremental_bootstrap}} or something similar. After the node is bootstrapped with this one token, it could go into UP state, and could then acquire additional tokens (one or a handful at a time), which would be streamed over while the node is active and serving requests. The benefit here is that with the default 256 tokens a node could become an active part of the cluster with less than 1% of its final data streamed over. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
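The proxy-until-streamed idea in the comment above boils down to the joining node tracking which token ranges have landed locally and proxying reads for everything else. A minimal sketch, with all names hypothetical (the real bootstrap and streaming code paths are far more involved):

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the proxy-until-streamed idea: a joining node records
// which token ranges have been streamed and serves only those locally,
// proxying everything else to a known replica. All names here are
// hypothetical; this is not Cassandra's actual bootstrap code.
public final class StreamedRangeTracker {
    // A half-open token range (start, end], simplified to long tokens.
    private static final class Range {
        final long start, end;
        Range(long start, long end) { this.start = start; this.end = end; }
        boolean contains(long token) { return token > start && token <= end; }
    }

    private final List<Range> streamed = new ArrayList<>();

    // Called as each range finishes streaming; persisting this set is what
    // would let a restarted bootstrap resume instead of starting over.
    public void markStreamed(long start, long end) {
        streamed.add(new Range(start, end));
    }

    // True if the joining node can answer reads for this token itself;
    // false means the request should be proxied to an existing replica.
    public boolean canServeLocally(long token) {
        for (Range r : streamed)
            if (r.contains(token)) return true;
        return false;
    }

    public static void main(String[] args) {
        StreamedRangeTracker tracker = new StreamedRangeTracker();
        tracker.markStreamed(0, 100); // first range arrives
        System.out.println(tracker.canServeLocally(50));  // prints true
        System.out.println(tracker.canServeLocally(200)); // prints false
    }
}
```

The appeal of this design, as the comment notes, is that failure is benign: if the joining node dies, the tracked set is simply discarded and no other node's view of the ring is corrupted.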
[1/2] cassandra git commit: Invalidate prepared stmts when table is altered
Repository: cassandra Updated Branches: refs/heads/trunk dcc3bb054 - c11e1a9d8 Invalidate prepared stmts when table is altered Patch by Tyler Hobbs; reviewed by Aleksey Yeschenko for CASSANDRA-7910 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9f613ab4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9f613ab4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9f613ab4 Branch: refs/heads/trunk Commit: 9f613ab42c76783191af7d20f50d716309e4aa5c Parents: 6124a73 Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 11:19:57 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 11:19:57 2015 -0600 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/auth/Auth.java| 34 ++-- .../org/apache/cassandra/config/CFMetaData.java | 15 +++-- .../apache/cassandra/cql3/QueryProcessor.java | 22 +++-- .../org/apache/cassandra/db/DefsTables.java | 4 +-- .../cassandra/service/IMigrationListener.java | 33 --- .../cassandra/service/MigrationListener.java| 33 +++ .../cassandra/service/MigrationManager.java | 28 .../org/apache/cassandra/transport/Server.java | 6 ++-- 9 files changed, 80 insertions(+), 97 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index ec64aa9..f69a3fc 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,6 @@ 2.1.3 + * Invalidate affected prepared statements when a table's columns + are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) * Fix regression in SSTableRewriter causing some rows to become unreadable during compaction (CASSANDRA-8429) http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/src/java/org/apache/cassandra/auth/Auth.java -- diff --git a/src/java/org/apache/cassandra/auth/Auth.java b/src/java/org/apache/cassandra/auth/Auth.java index ed7aa87..0c3b0fe 100644 --- 
a/src/java/org/apache/cassandra/auth/Auth.java +++ b/src/java/org/apache/cassandra/auth/Auth.java @@ -185,7 +185,7 @@ public class Auth implements AuthMBean DatabaseDescriptor.getAuthorizer().setup(); // register a custom MigrationListener for permissions cleanup after dropped keyspaces/cfs. -MigrationManager.instance.register(new MigrationListener()); +MigrationManager.instance.register(new AuthMigrationListener()); // the delay is here to give the node some time to see its peers - to reduce // Skipped default superuser setup: some nodes were not ready log spam. @@ -318,9 +318,9 @@ public class Auth implements AuthMBean } /** - * IMigrationListener implementation that cleans up permissions on dropped resources. + * MigrationListener implementation that cleans up permissions on dropped resources. */ -public static class MigrationListener implements IMigrationListener +public static class AuthMigrationListener extends MigrationListener { public void onDropKeyspace(String ksName) { @@ -331,33 +331,5 @@ public class Auth implements AuthMBean { DatabaseDescriptor.getAuthorizer().revokeAll(DataResource.columnFamily(ksName, cfName)); } - -public void onDropUserType(String ksName, String userType) -{ -} - -public void onCreateKeyspace(String ksName) -{ -} - -public void onCreateColumnFamily(String ksName, String cfName) -{ -} - -public void onCreateUserType(String ksName, String userType) -{ -} - -public void onUpdateKeyspace(String ksName) -{ -} - -public void onUpdateColumnFamily(String ksName, String cfName) -{ -} - -public void onUpdateUserType(String ksName, String userType) -{ -} } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/9f613ab4/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 74bd5f8..e75abb7 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java 
@@ -,7 +,11 @@ public final class CFMetaData return m; } -public void reload() +/** + * Updates this object in place to match the definition in the
[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c11e1a9d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c11e1a9d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c11e1a9d Branch: refs/heads/trunk Commit: c11e1a9d8a2cf64afa04c6dadb47881f464d9eb0 Parents: dcc3bb0 9f613ab Author: Tyler Hobbs ty...@datastax.com Authored: Fri Jan 2 11:21:57 2015 -0600 Committer: Tyler Hobbs ty...@datastax.com Committed: Fri Jan 2 11:21:57 2015 -0600 -- CHANGES.txt | 2 ++ src/java/org/apache/cassandra/config/CFMetaData.java | 15 --- src/java/org/apache/cassandra/config/Schema.java | 4 ++-- .../org/apache/cassandra/cql3/QueryProcessor.java| 9 + .../apache/cassandra/service/MigrationListener.java | 2 +- .../apache/cassandra/service/MigrationManager.java | 4 ++-- src/java/org/apache/cassandra/transport/Server.java | 2 +- 7 files changed, 29 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c11e1a9d/CHANGES.txt -- diff --cc CHANGES.txt index ac63fb3,f69a3fc..82f1d20 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,48 -1,6 +1,50 @@@ +3.0 + * Support index key/value entries on map collections (CASSANDRA-8473) + * Modernize schema tables (CASSANDRA-8261) + * Support for user-defined aggregation functions (CASSANDRA-8053) + * Fix NPE in SelectStatement with empty IN values (CASSANDRA-8419) + * Refactor SelectStatement, return IN results in natural order instead + of IN value list order (CASSANDRA-7981) + * Support UDTs, tuples, and collections in user-defined + functions (CASSANDRA-7563) + * Fix aggregate fn results on empty selection, result column name, + and cqlsh parsing (CASSANDRA-8229) + * Mark sstables as repaired after full repair (CASSANDRA-7586) + * Extend Descriptor to include a format value and refactor reader/writer + APIs (CASSANDRA-7443) + * Integrate JMH for microbenchmarks (CASSANDRA-8151) + * Keep sstable 
levels when bootstrapping (CASSANDRA-7460) + * Add Sigar library and perform basic OS settings check on startup (CASSANDRA-7838) + * Support for aggregation functions (CASSANDRA-4914) + * Remove cassandra-cli (CASSANDRA-7920) + * Accept dollar quoted strings in CQL (CASSANDRA-7769) + * Make assassinate a first class command (CASSANDRA-7935) + * Support IN clause on any clustering column (CASSANDRA-4762) + * Improve compaction logging (CASSANDRA-7818) + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917) + * Do anticompaction in groups (CASSANDRA-6851) + * Support user-defined functions (CASSANDRA-7395, 7526, 7562, 7740, 7781, 7929, + 7924, 7812, 8063, 7813, 7708) + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416) + * Move sstable RandomAccessReader to nio2, which allows using the + FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050) + * Remove CQL2 (CASSANDRA-5918) + * Add Thrift get_multi_slice call (CASSANDRA-6757) + * Optimize fetching multiple cells by name (CASSANDRA-6933) + * Allow compilation in java 8 (CASSANDRA-7028) + * Make incremental repair default (CASSANDRA-7250) + * Enable code coverage thru JaCoCo (CASSANDRA-7226) + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369) + * Shorten SSTable path (CASSANDRA-6962) + * Use unsafe mutations for most unit tests (CASSANDRA-6969) + * Fix race condition during calculation of pending ranges (CASSANDRA-7390) + * Fail on very large batch sizes (CASSANDRA-8011) + * Improve concurrency of repair (CASSANDRA-6455, 8208) + + 2.1.3 + * Invalidate affected prepared statements when a table's columns +are altered (CASSANDRA-7910) * Stress - user defined writes should populate sequentally (CASSANDRA-8524) * Fix regression in SSTableRewriter causing some rows to become unreadable during compaction (CASSANDRA-8429) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c11e1a9d/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --cc 
src/java/org/apache/cassandra/config/CFMetaData.java index 0730ba7,e75abb7..cb176f2 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java @@@ -724,11 -938,193 +724,15 @@@ public final class CFMetaDat return def == null ? defaultValidator : def.type; } - public void reload() -/** applies implicit defaults to cf definition. useful in updates */ -private static void applyImplicitDefaults(org.apache.cassandra.thrift.CfDef cf_def) -{ -if (!cf_def.isSetComment()) -
[jira] [Commented] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263428#comment-14263428 ]

Brent Haines commented on CASSANDRA-8552:
-----------------------------------------

Here is the syslog after upgrading the kernel to fix the bad pte bug:

{code}
Jan 3 05:15:01 ip-10-0-2-226 CRON[20245]: (ubuntu) CMD (/home/ubuntu/checkcassandra.sh)
Jan 3 05:15:01 ip-10-0-2-226 CRON[20246]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then munin-run apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then munin-run apt update 7200 12 >/dev/null; fi)
Jan 3 05:15:01 ip-10-0-2-226 CRON[20247]: (root) CMD (command -v debian-sa1 >/dev/null && debian-sa1 1 1)
Jan 3 05:15:02 ip-10-0-2-226 postfix/pickup[1360]: 4FC6E805D4: uid=1000 from=<ubuntu>
Jan 3 05:15:02 ip-10-0-2-226 postfix/cleanup[20292]: 4FC6E805D4: message-id=<20150103051502.4FC6E805D4@ip-10-0-2-226.ec2.internal>
Jan 3 05:15:02 ip-10-0-2-226 postfix/qmgr[1362]: 4FC6E805D4: from=<ubuntu@ip-10-0-2-226.ec2.internal>, size=621, nrcpt=1 (queue active)
Jan 3 05:15:02 ip-10-0-2-226 postfix/local[20294]: 4FC6E805D4: to=<ubuntu@ip-10-0-2-226.ec2.internal>, orig_to=<ubuntu>, relay=local, delay=0.05, delays=0.03/0.01/0/0.01, dsn=2.0.0, status=sent (delivered to mailbox)
Jan 3 05:15:02 ip-10-0-2-226 postfix/qmgr[1362]: 4FC6E805D4: removed
Jan 3 05:17:01 ip-10-0-2-226 CRON[21023]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906482] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906490] java cpuset=/ mems_allowed=0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906495] CPU: 0 PID: 21373 Comm: java Not tainted 3.13.0-43-generic #72-Ubuntu
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906497] 8800053cd980 81720bf6 8802bbdf4800
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906503] 8800053cda08 8171b4b1 003ac2e4
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906506] 8173310e 8803a572 003ac2e4
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906510] Call Trace:
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906521] [81720bf6] dump_stack+0x45/0x56
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906527] [8171b4b1] dump_header+0x7f/0x1f1
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906533] [8173310e] ? xen_hypervisor_callback+0x1e/0x30
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906539] [811526de] oom_kill_process+0x1ce/0x330
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906545] [812d6ce5] ? security_capable_noaudit+0x15/0x20
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906548] [81152e14] out_of_memory+0x414/0x450
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906552] [81159180] __alloc_pages_nodemask+0xa60/0xb80
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906558] [811977a3] alloc_pages_current+0xa3/0x160
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906563] [8114f297] __page_cache_alloc+0x97/0xc0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906567] [81150ca5] filemap_fault+0x185/0x410
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906572] [81175b4f] __do_fault+0x6f/0x530
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906577] [81005f0d] ? pte_mfn_to_pfn.part.13+0x7d/0x100
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906581] [81179d12] handle_mm_fault+0x482/0xf00
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906585] [81151778] ? generic_file_aio_read+0x598/0x700
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906590] [8172cc14] __do_page_fault+0x184/0x560
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906594] [81004e32] ? xen_mc_flush+0x182/0x1b0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906598] [81004e32] ? xen_mc_flush+0x182/0x1b0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906601] [8172d00a] do_page_fault+0x1a/0x70
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906605] [81729fc5] ? do_device_not_available+0x35/0x50
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906608] [81729468] page_fault+0x28/0x30
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906611] Mem-Info:
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906613] Node 0 DMA per-cpu:
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906616] CPU0: hi:0, btch: 1 usd: 0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906618] CPU1: hi:0, btch: 1 usd: 0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906620] CPU2: hi:0, btch: 1 usd: 0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906622] CPU3: hi:0, btch: 1 usd: 0
Jan 3 05:18:22 ip-10-0-2-226 kernel: [49881091.906623] Node
{code}
[jira] [Comment Edited] (CASSANDRA-8552) Large compactions run out of off-heap RAM
[ https://issues.apache.org/jira/browse/CASSANDRA-8552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263428#comment-14263428 ]

Brent Haines edited comment on CASSANDRA-8552 at 1/3/15 5:45 AM:
-----------------------------------------------------------------

Here is the syslog after upgrading the kernel to fix the bad pte bug (after the OOM killer killed Cassandra):

(syslog excerpt identical to the previous comment; omitted here)