[jira] [Commented] (CASSANDRA-8568) Impose new API on data tracker modifications that makes correct usage obvious and imposes safety
[ https://issues.apache.org/jira/browse/CASSANDRA-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381637#comment-14381637 ]

Marcus Eriksson commented on CASSANDRA-8568:

[~benedict] could you rebase? And there seem to be some merge problems in build.xml

Impose new API on data tracker modifications that makes correct usage obvious and imposes safety

Key: CASSANDRA-8568
URL: https://issues.apache.org/jira/browse/CASSANDRA-8568
Project: Cassandra
Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Fix For: 3.0

DataTracker has become a bit of a quagmire, and is not at all obvious to interface with, with many subtly different modifiers. I suspect it is still subtly broken, especially around error recovery. I propose piggy-backing on CASSANDRA-7705 to offer RAII (and GC-enforced, for those situations where a try/finally block isn't possible) objects that have transactional behaviour, with a few simple declarative methods that can be composed to provide all of the functionality we currently need. See CASSANDRA-8399 for context.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9023) 2.0.13 write timeouts on driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381660#comment-14381660 ]

anishek commented on CASSANDRA-9023:

I have retested it and I can recreate it with the same configuration as above. There are no exceptions in the log files, though. One thing I noticed: if I change memtable_flush_writers to 2, the error does not occur for the run.

2.0.13 write timeouts on driver

Key: CASSANDRA-9023
URL: https://issues.apache.org/jira/browse/CASSANDRA-9023
Project: Cassandra
Issue Type: Bug
Environment: For testing, using only a single node. Hardware configuration as follows:
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU MHz: 2000.174
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-15
OS: Linux version 2.6.32-504.8.1.el6.x86_64 (mockbu...@c6b9.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC))
Disk: a single disk in RAID; total space is about 500 GB, of which 5 GB is used
Reporter: anishek
Attachments: out_system.log

Initially asked at http://www.mail-archive.com/user@cassandra.apache.org/msg41621.html and was suggested to post here. If any more details are required please let me know.
[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381674#comment-14381674 ]

Sylvain Lebresne commented on CASSANDRA-8180:

bq. I found it much easier to understand

Glad that it's the case.

bq. I think it might make sense if I implement this change directly on a branch based on {{8099_engine_refactor}}

I wouldn't be the one to blame you for that.

bq. I cannot find a way to implement this unless we iterate twice, the first time to count until the limit has been reached in {{SinglePartitionSliceCommand}} and the second time to return the data

You actually don't have to care about the limit (in SinglePartitionSliceCommand at least). The way to do this would be to return an iterator that first queries and returns the results of the first sstable, and once it has returned all results, transparently queries the 2nd sstable and starts returning those results, etc.

That being said, I do suspect doing this at the merging level (in MergeIterator) would be better. The idea would be to specialize the merge iterator to take specific iterators that expose some {{lowerBound()}} method. That method would be allowed to return a value that is not returned by the iterator but is lower than anything it will return. The merge iterator would use those lower bounds as initial {{Candidate}}s for the iterators, but know that when it consumes those candidates it should just discard them (and get the actual next value of the iterator). Basically, we'd add a way for the iterator to say "don't bother using me until you've at least reached value X". The sstable iterators would typically implement that {{lowerBound}} method by returning the sstable min column name. Provided we make sure the sstable iterators don't do any work unless their {{hasNext/next}} methods are called, we wouldn't actually use a sstable until we've reached its min column name.
Doing it that way would have 2 advantages over doing it at the collation level:
# it is more general, as it would work even if the sstable min/max column names intersect (it's harder/uglier to do the same at the collation level imo)
# it would work for range queries too

We may want to build that on top of CASSANDRA-8915 however.

Optimize disk seek using min/max column name meta data when the LIMIT clause is used

Key: CASSANDRA-8180
URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Cassandra 2.0.10
Reporter: DOAN DuyHai
Assignee: Stefania
Priority: Minor
Fix For: 3.0

I was working on an example of sensor data table (timeseries) and faced a use case where C* does not optimize reads on disk.

{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 10, '10');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 20, '20');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 30, '30');
... nodetool flush test test
{code}

After that, I activate request tracing:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;

 activity | timestamp | source | source_elapsed
---------------------------------------------+--------------+-----------+----------------
 execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
 Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
 Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
 Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
 Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
 Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
 Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
 Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
 Seeking to partition beginning in data file | 23:48:46,500 |
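The lower-bound idea in this comment can be sketched in isolation. Everything below (the Source class, the int values standing in for clustering columns, the merge method) is illustrative rather than Cassandra's actual MergeIterator code: the heap is seeded with each source's cheap lowerBound(), and a source is only "opened" (i.e. a disk seek happens) when its lower bound is actually reached by the merge.

```java
import java.util.*;
import java.util.function.Supplier;

// Illustrative sketch (not Cassandra's MergeIterator): seed the merge heap
// with each source's cheap lowerBound(), and only open a source -- i.e. seek
// into an sstable -- when its lower bound is actually reached.
class LowerBoundMerge {
    static class Source {
        final int lowerBound;                    // e.g. the sstable min column name
        final Supplier<Iterator<Integer>> open;  // the expensive part: a disk seek
        Iterator<Integer> it;
        boolean opened;
        Source(int lowerBound, Supplier<Iterator<Integer>> open) {
            this.lowerBound = lowerBound;
            this.open = open;
        }
    }

    // Return up to `limit` smallest values across all (sorted) sources.
    static List<Integer> merge(List<Source> sources, int limit) {
        // heap entry: { value, source index, 1 if this is a lower-bound placeholder }
        PriorityQueue<int[]> heap = new PriorityQueue<>(Comparator.comparingInt(e -> e[0]));
        for (int i = 0; i < sources.size(); i++)
            heap.add(new int[]{ sources.get(i).lowerBound, i, 1 });

        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty() && out.size() < limit) {
            int[] e = heap.poll();
            Source s = sources.get(e[1]);
            if (e[2] == 1) {
                // Placeholder consumed: discard it and open the real iterator now.
                s.it = s.open.get();
                s.opened = true;
            } else {
                out.add(e[0]);
            }
            if (s.it.hasNext())
                heap.add(new int[]{ s.it.next(), e[1], 0 });
        }
        return out;
    }
}
```

With three "sstables" whose lower bounds are 10, 20 and 30, a LIMIT 1 merge only ever opens the first one, which is exactly the skipping behaviour the comment describes.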
[jira] [Assigned] (CASSANDRA-9036) disk full when running cleanup (on a far from full disk)
[ https://issues.apache.org/jira/browse/CASSANDRA-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson reassigned CASSANDRA-9036:

Assignee: Robert Stupp (was: Marcus Eriksson)

[~snazy] could you have a look? I think it could be related to CASSANDRA-7386

disk full when running cleanup (on a far from full disk)

Key: CASSANDRA-9036
URL: https://issues.apache.org/jira/browse/CASSANDRA-9036
Project: Cassandra
Issue Type: Bug
Reporter: Erik Forsberg
Assignee: Robert Stupp

I'm trying to run cleanup, but get this:

{noformat}
INFO [CompactionExecutor:18] 2015-03-25 10:29:16,355 CompactionManager.java (line 564) Cleaning up SSTableReader(path='/cassandra/production/Data_daily/production-Data_daily-jb-4345750-Data.db')
ERROR [CompactionExecutor:18] 2015-03-25 10:29:16,664 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:18,1,main]
java.io.IOException: disk full
    at org.apache.cassandra.db.compaction.CompactionManager.doCleanupCompaction(CompactionManager.java:567)
    at org.apache.cassandra.db.compaction.CompactionManager.access$400(CompactionManager.java:63)
    at org.apache.cassandra.db.compaction.CompactionManager$5.perform(CompactionManager.java:281)
    at org.apache.cassandra.db.compaction.CompactionManager$2.call(CompactionManager.java:225)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

Now that's odd, since:
* Disk has some 680G left
* The sstable it's trying to clean up is far less than 680G:

{noformat}
# ls -lh *4345750*
-rw-r--r-- 1 cassandra cassandra  64M Mar 21 04:42 production-Data_daily-jb-4345750-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 219G Mar 21 04:42 production-Data_daily-jb-4345750-Data.db
-rw-r--r-- 1 cassandra cassandra 503M Mar 21 04:42 production-Data_daily-jb-4345750-Filter.db
-rw-r--r-- 1 cassandra cassandra  42G Mar 21 04:42 production-Data_daily-jb-4345750-Index.db
-rw-r--r-- 1 cassandra cassandra 5.9K Mar 21 04:42 production-Data_daily-jb-4345750-Statistics.db
-rw-r--r-- 1 cassandra cassandra  81M Mar 21 04:42 production-Data_daily-jb-4345750-Summary.db
-rw-r--r-- 1 cassandra cassandra   79 Mar 21 04:42 production-Data_daily-jb-4345750-TOC.txt
{noformat}

Sure, it's large, but it's not 680G. No other compactions are running on that server. I'm getting this on 12 / 56 servers right now. Could it be some bug in the calculation of the expected size of the new sstable, perhaps?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
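A hypothetical sketch of the kind of pre-flight estimate that could produce a spurious "disk full" here. None of these names are Cassandra's actual code; the point is only that if the estimated output size is computed with a wrong keep-ratio (e.g. taken as 1.0, or inflated), a 219G sstable can be rejected even with 680G free:

```java
// Hypothetical sketch, NOT Cassandra's CompactionManager: a proportional
// expected-size estimate for cleanup, which drops keys the node no longer
// owns, followed by a simple free-space check.
class CleanupEstimate {
    // Output should be roughly onDiskLength * keptKeys / totalKeys.
    static long expectedWriteSize(long onDiskLength, long keptKeys, long totalKeys) {
        return (long) Math.ceil(onDiskLength * (double) keptKeys / totalKeys);
    }

    static boolean hasRoom(long expectedWriteSize, long freeBytes) {
        return expectedWriteSize <= freeBytes;
    }
}
```

With a sane ratio the 219G file easily fits in 680G; an estimate inflated a few times over would fail the check exactly as reported.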
[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381624#comment-14381624 ]

Stefania commented on CASSANDRA-8180:

[~slebresne], [~thobbs], [~iamaleksey]: I think it might make sense if I implement this change directly on a branch based on {{8099_engine_refactor}}? First of all I found it *much easier* to understand, and secondly I don't particularly want to rebase or merge later on once 8099 is merged into trunk. Any concerns?

I've been looking at the code on 8099 today, and I cannot find a way to implement this unless we iterate twice: the first time to count until the limit has been reached in {{SinglePartitionSliceCommand}}, and the second time to return the data. Or have I missed something? If not, I think we need to store the data in memory via an {{ArrayBackedPartition}}, is this correct?

Here is a very inefficient and ugly way to do this; may I have some pointers on how to improve it? https://github.com/stef1927/cassandra/commits/8180-8099 Specifically in {{querySSTablesByClustering()}} at line 254 of {{SinglePartitionSliceCommand.java}}.

Optimize disk seek using min/max column name meta data when the LIMIT clause is used

Key: CASSANDRA-8180
URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Cassandra 2.0.10
Reporter: DOAN DuyHai
Assignee: Stefania
Priority: Minor
Fix For: 3.0

I was working on an example of sensor data table (timeseries) and faced a use case where C* does not optimize reads on disk.

{code}
cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) WITH CLUSTERING ORDER BY (col DESC);
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 10, '10');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 20, '20');
... nodetool flush test test ...
cqlsh:test> INSERT INTO test(id, col, val) VALUES (1, 30, '30');
...
nodetool flush test test
{code}

After that, I activate request tracing:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;

 activity | timestamp | source | source_elapsed
---------------------------------------------+--------------+-----------+----------------
 execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
 Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
 Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
 Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
 Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
 Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
 Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
 Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
 Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
 Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
 Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
 Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 | 2768
 Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 | 2784
 Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 | 2976
 Request complete | 23:48:46,501 | 127.0.0.1 | 3551
{code}

We can clearly see that C* hits 3 SSTables on disk instead of just one, although it has the min/max column meta data to decide which SSTable contains the most recent data. Funnily enough, if we add a clause on the clustering column to the select, this time C* optimizes the read path:

{code}
cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;

 activity
[jira] [Commented] (CASSANDRA-8899) cqlsh - not able to get row count with select(*) for large table
[ https://issues.apache.org/jira/browse/CASSANDRA-8899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381582#comment-14381582 ]

Benjamin Lerer commented on CASSANDRA-8899:

[~jeffl] Could you check the effect of increasing the timeout? By default the page size is 10,000 rows and the count is performed on the coordinator node. This means that if the data is not on your coordinator node, it will have to send at least 5 queries (more if the data is distributed over several nodes) to the other nodes. Depending on how far your nodes are from the coordinator, the latency can add up pretty quickly. The best way for you to verify this theory would be to use request tracing: http://www.datastax.com/dev/blog/tracing-in-cassandra-1-2

cqlsh - not able to get row count with select(*) for large table

Key: CASSANDRA-8899
URL: https://issues.apache.org/jira/browse/CASSANDRA-8899
Project: Cassandra
Issue Type: Bug
Environment: Cassandra 2.1.2, Ubuntu 12.04
Reporter: Jeff Liu
Assignee: Benjamin Lerer

I'm getting errors when running a query that looks at a large number of rows.

{noformat}
cqlsh:events> select count(*) from catalog;

 count
-------
     1

(1 rows)

cqlsh:events> select count(*) from catalog limit 11000;

 count
-------
 11000

(1 rows)

cqlsh:events> select count(*) from catalog limit 5;
errors={}, last_host=127.0.0.1
cqlsh:events>
{noformat}

We are not able to make the select * query to get the row count.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
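The arithmetic behind "at least 5 queries" is a ceiling division over pages: the coordinator keeps fetching 10,000-row pages until the table is exhausted, so count(*) latency scales with row count. A trivial illustrative sketch (class and method names are ours, not the driver's):

```java
// Back-of-envelope sketch of the round trips behind a paged count(*):
// with a 10,000-row page size, counting N rows needs ceil(N / 10000)
// coordinator round trips, each adding inter-node latency.
class PagedCount {
    static long roundTrips(long totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize; // ceiling division
    }
}
```

So a ~50,000-row count is at least 5 round trips, and each hop to a remote replica multiplies the wall-clock cost, which is why increasing the client timeout (or tracing the request) is the first thing to check.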
[jira] [Updated] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-9033:

Priority: Major (was: Blocker)

Lowering prio, as the actual problem is that you have that many tiny files on your node. The question is how you ended up with that many files. Did you run repairs prior to the number of files exploding? Do you have graphs of how many files you have on the node? Is there a gradual increase over time, or did it happen overnight?

Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive

Key: CASSANDRA-9033
URL: https://issues.apache.org/jira/browse/CASSANDRA-9033
Project: Cassandra
Issue Type: Bug
Environment:
* Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
* EC2 m2-xlarge instances [4 CPU, 16GB RAM, 1TB storage on 3 platters]
* 12 nodes running a mix of 2.1.1 and 2.1.3
* 8GB stack size with offheap objects
Reporter: Brent Haines
Assignee: Marcus Eriksson
Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip

We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB each. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It had been running great, though, until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well.
Our schema:

{code}
cqlsh> describe columnfamily data.stories

CREATE TABLE data.stories (
    id timeuuid PRIMARY KEY,
    action_data timeuuid,
    action_name text,
    app_id timeuuid,
    app_instance_id timeuuid,
    data map<text, text>,
    objects set<timeuuid>,
    time_stamp timestamp,
    user_id timeuuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see'
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

cqlsh>
{code}

There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that shows the situation after the upgrade, and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data.

On another note, we see a lot of this during repair now (on all the nodes):

{code}
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
[jira] [Commented] (CASSANDRA-9028) Optimize LIMIT execution to mitigate need for a full partition scan
[ https://issues.apache.org/jira/browse/CASSANDRA-9028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381688#comment-14381688 ]

Sylvain Lebresne commented on CASSANDRA-9028:

Well, the trace does say that all sstables have been touched, as you said, and they have, but touching a sstable is a world away from reading the entire partition into memory. The reason your first query does touch 2 sstables is that the code does not know which sstable will have results for the query, how much it will have, nor which results will sort first. This is not particularly abnormal; there is only so much the storage engine can deduce without reading any data, but this doesn't change the fact that as little as possible is read from each sstable, and we certainly don't retrieve entire partitions unless we have to. The reason the 2nd request actually only hits a single sstable is that this request is more restricted, and the engine is able to use that additional restriction to eliminate one of the sstables.

For completeness' sake, I'll note that there is actually some optimization we're contemplating in CASSANDRA-8180 to avoid touching sstables in some cases. This might or might not help your first query; I honestly haven't looked closely enough at the example to say. It won't make a terribly huge difference in any case.

Optimize LIMIT execution to mitigate need for a full partition scan

Key: CASSANDRA-9028
URL: https://issues.apache.org/jira/browse/CASSANDRA-9028
Project: Cassandra
Issue Type: Improvement
Components: API, Core
Reporter: jonathan lacefield
Attachments: Data.1.json, Data.2.json, Data.3.json, test.ddl, tracing.out

Currently, a SELECT statement for a single partition key that contains a LIMIT X clause will fetch an entire partition from a node and place the partition into memory prior to applying the limit clause and returning results to be served to the client via the coordinator.
This JIRA is to request an optimization for the CQL LIMIT clause to avoid the entire-partition retrieval step, and instead only retrieve the components needed to satisfy the LIMIT condition. Ideally, any LIMIT X would avoid the need to retrieve a full partition. This may not be possible, though. As a compromise, it would still be incredibly beneficial if a LIMIT 1 clause could be optimized to only retrieve the latest item. Ideally a LIMIT 1 would operationally behave the same way as a clustering-key WHERE clause where the latest (i.e. LIMIT 1) col value was specified. We can supply some trace results to help show the difference between 2 different queries that perform the same logical function, if desired: for example, a query that returns the latest value for a clustering col, where QUERY 1 uses a LIMIT 1 clause and QUERY 2 uses a WHERE clustering col = latest value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
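The requested behaviour amounts to applying the limit while pulling rows from a lazily-reading iterator, rather than after materialising the whole partition. A generic illustrative sketch (not the storage engine's real code; the names are ours):

```java
import java.util.*;

// Sketch of limit-during-iteration: stop pulling from the row iterator as
// soon as LIMIT rows have been produced, so a lazy source never has to
// surface the rest of the partition.
class LimitedRead {
    static <T> List<T> limited(Iterator<T> rows, int limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && rows.hasNext())
            out.add(rows.next()); // stop as soon as LIMIT is satisfied
        return out;
    }
}
```

If the underlying iterator reads from disk on demand, a LIMIT 1 query pulls exactly one row; whether the engine can be that lazy end to end is the crux of the ticket.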
[jira] [Resolved] (CASSANDRA-9028) Optimize LIMIT execution to mitigate need for a full partition scan
[ https://issues.apache.org/jira/browse/CASSANDRA-9028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-9028.

Resolution: Not a Problem

Optimize LIMIT execution to mitigate need for a full partition scan

Key: CASSANDRA-9028
URL: https://issues.apache.org/jira/browse/CASSANDRA-9028
Project: Cassandra
Issue Type: Improvement
Components: API, Core
Reporter: jonathan lacefield
Attachments: Data.1.json, Data.2.json, Data.3.json, test.ddl, tracing.out

Currently, a SELECT statement for a single partition key that contains a LIMIT X clause will fetch an entire partition from a node and place the partition into memory prior to applying the limit clause and returning results to be served to the client via the coordinator. This JIRA is to request an optimization for the CQL LIMIT clause to avoid the entire-partition retrieval step, and instead only retrieve the components needed to satisfy the LIMIT condition. Ideally, any LIMIT X would avoid the need to retrieve a full partition. This may not be possible, though. As a compromise, it would still be incredibly beneficial if a LIMIT 1 clause could be optimized to only retrieve the latest item. Ideally a LIMIT 1 would operationally behave the same way as a clustering-key WHERE clause where the latest (i.e. LIMIT 1) col value was specified. We can supply some trace results to help show the difference between 2 different queries that perform the same logical function, if desired: for example, a query that returns the latest value for a clustering col, where QUERY 1 uses a LIMIT 1 clause and QUERY 2 uses a WHERE clustering col = latest value.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381722#comment-14381722 ]

Benedict commented on CASSANDRA-8670:

bq. What tool are you using to review?

I like to navigate in IntelliJ, and on the command line, so having a clean run of commits helps a lot.

After a bit of consideration, I think there's a good justification for introducing a whole new class if we intend to fully replace DataStreamOutputAndChannel, largely because the two write paths are not at all clear, and appear to be different (the old versions of the write paths being hard to actually pin down the location of in the VM source). Having a solid handle on how it behaves, and ensuring fewer code paths are executed, seems a good thing. As such, I think this patch should replace DSOaC entirely, and remove it from the codebase. I also think this is a good opportunity to share its code with DataOutputByteBuffer, and in doing so hopefully make that faster, potentially improving performance of CL append (it doesn't need to extend AbstractDataOutput, and would share most of its implementation with NIODataOutputStream if it did not).

A few comments on NIODataInputStream:
* readNext() should assert it is never shuffling more than 7 bytes; in fact, ideally this would be done by readMinimum() to make it clearer
* readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity(), it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements)
* readUnsignedShort() could simply be: {{return readShort() & 0xFFFF;}}
* available() should return at least the bytes in the buffer
* ensureMinimum() isn't clearly named, since it is more intrinsically linked to primitive reads than it suggests, consuming the bytes and throwing EOF if it cannot read.
Something like preparePrimitiveRead() (no fixed idea myself, just think it is more than ensureMinimum).

A few comments on NIODataOutputStreamPlus:
* close() should flush
* close() should clean the buffer
* why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? It would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer.
* we should either extend our AbstractDataOutput, or make our writeUTF method public static, so we can share it

Finally, it would be nice if we didn't need to stash the OutputStream version separately. Perhaps we can reorganise the class hierarchy, so that DataOutputStreamPlus doesn't wrap an internal OutputStream; it just is a light abstract class merge of the types OutputStream and DataOutputPlus. We can introduce a WrappedDataOutputStreamPlus in its place, and AbstractDataOutput could extend our new DataOutputStreamPlus instead of the other way around (with Wrapped... extending _it_). Then we can just stash a DataOutputStreamPlus in all cases. Sound reasonable?

Large columns + NIO memory pooling causes excessive direct memory usage

Key: CASSANDRA-8670
URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg
Fix For: 3.0
Attachments: largecolumn_test.py

If you provide a large byte array to NIO and ask it to populate the byte array from a socket, it will allocate a thread-local byte buffer that is the size of the requested read, no matter how large it is.
Old IO wraps new IO for sockets (but not files), so old IO is affected as well. Even if you are using Buffered{Input | Output}Stream you can end up passing a large byte array to NIO: the byte array read method will pass the array to NIO directly if it is larger than the internal buffer. Passing large cells between nodes as part of intra-cluster messaging can cause the NIO pooled buffers to quickly reach a high watermark and stay there. This ends up costing 2x the largest cell size, because there is a buffer each for input and output, since they are different threads. This is further multiplied by the number of nodes in the cluster - 1, since each has a dedicated thread pair with separate thread locals. Anecdotally it appears that the cost is doubled beyond that, although it isn't clear why. Possibly the control
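The BufferedInputStream pass-through described above is observable with plain JDK classes. This illustrative demo (class and method names are ours) records the read sizes that reach the underlying stream: small requests are served through the 8 KB internal buffer, while a request larger than the buffer is handed straight down, which is exactly how a large byte[] would reach an NIO-backed socket stream and trigger the full-size thread-local buffer:

```java
import java.io.*;

// Demonstrates BufferedInputStream's pass-through: once the caller's request
// exceeds the internal buffer, the caller's byte[] goes directly to the
// underlying stream instead of through the buffer.
class PassThroughDemo {
    static class RecordingStream extends FilterInputStream {
        int largestRequest = 0;
        RecordingStream(InputStream in) { super(in); }
        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            largestRequest = Math.max(largestRequest, len); // record request size
            return super.read(b, off, len);
        }
    }

    // Largest read issued against the underlying stream when the caller asks
    // for `arraySize` bytes through an 8 KB BufferedInputStream.
    static int largestRequestFor(int arraySize) {
        try {
            RecordingStream rec = new RecordingStream(new ByteArrayInputStream(new byte[1 << 20]));
            new BufferedInputStream(rec, 8192).read(new byte[arraySize]);
            return rec.largestRequest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```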
[jira] [Commented] (CASSANDRA-8568) Impose new API on data tracker modifications that makes correct usage obvious and imposes safety
[ https://issues.apache.org/jira/browse/CASSANDRA-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381773#comment-14381773 ]

Benedict commented on CASSANDRA-8568:

Yep, will do. I'm rebasing CASSANDRA-8984 onto trunk, and then will rebase this onto that, since it's likely that will be committed soon(ish).

Impose new API on data tracker modifications that makes correct usage obvious and imposes safety

Key: CASSANDRA-8568
URL: https://issues.apache.org/jira/browse/CASSANDRA-8568
Project: Cassandra
Issue Type: Bug
Reporter: Benedict
Assignee: Benedict
Fix For: 3.0

DataTracker has become a bit of a quagmire, and is not at all obvious to interface with, with many subtly different modifiers. I suspect it is still subtly broken, especially around error recovery. I propose piggy-backing on CASSANDRA-7705 to offer RAII (and GC-enforced, for those situations where a try/finally block isn't possible) objects that have transactional behaviour, with a few simple declarative methods that can be composed to provide all of the functionality we currently need. See CASSANDRA-8399 for context.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381708#comment-14381708 ]

Benedict commented on CASSANDRA-8984:

bq. in a stable release.

Well, our release page doesn't quite agree with this implicit assertion (that 2.1 is stable) - but like I say, we can accept the risk as it stands and just try to patch it up as necessary. I'm more keen to fix these than others since I've taken the heat of the failures, but I'm comfortable so long as I've put my version of the future out there and highlighted my concerns.

[~JoshuaMcKenzie]: I've pushed a small update that I expect fixes the Windows issue (though I'm looking forward to automated branch testing so I can corroborate against Windows directly).

Introduce Transactional API for behaviours that can corrupt system state

Key: CASSANDRA-8984
URL: https://issues.apache.org/jira/browse/CASSANDRA-8984
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Assignee: Benedict
Fix For: 2.1.4
Attachments: 8984_windows_timeout.txt

As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function):
* prepareToCommit
* commit
* abort
* cleanup

In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point anything fails, abort is called. In _either_ case, cleanup is called at the very last.
These transactional objects are all AutoCloseable, with the behaviour being to roll back any changes unless commit has completed successfully. The changes are actually less invasive than they might sound, since we recently introduced abort in some places, and already had commit-like methods. This simply formalises the behaviour and makes it consistent between all objects that interact in this way. Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. The data tracker API changes and compaction leftover cleanups should finish the job of making this much easier to reason about, but I think this change is worth considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
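The four-action lifecycle described above can be sketched as a minimal interface. This is an illustrative reading of the ticket, not the actual Cassandra code: the names mirror the proposal, but the classes are simplified stand-ins.

```java
// Sketch of the proposed Transactional lifecycle; illustrative only.
public class TransactionalSketch {

    interface Transactional extends AutoCloseable {
        void prepareToCommit();
        void commit();
        void abort();
        void cleanup();
    }

    // close() rolls back unless commit() has completed successfully,
    // and cleanup() runs in either case, as the ticket describes.
    static abstract class AbstractTransactional implements Transactional {
        private boolean committed = false;

        public void commit() { committed = true; }
        public void abort() { /* undo partial state changes */ }
        public void cleanup() { /* release resources in either case */ }

        @Override
        public void close() {
            if (!committed)
                abort();
            cleanup();
        }
    }

    static class Writer extends AbstractTransactional {
        public void prepareToCommit() { /* flush/sync pending state */ }
    }

    // Normal flow: prepareToCommit, then commit; on any failure the
    // try-with-resources close() performs abort() + cleanup().
    public static boolean runOnce(boolean fail) {
        try (Writer w = new Writer()) {
            w.prepareToCommit();
            if (fail)
                throw new RuntimeException("simulated failure");
            w.commit();
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

The GC-enforced variant mentioned in CASSANDRA-8568 would add a finalization safety net for cases where a try block is impossible; that is omitted here.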
[jira] [Commented] (CASSANDRA-9037) Terminal UDFs evaluated at prepare time throw protocol version error
[ https://issues.apache.org/jira/browse/CASSANDRA-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381759#comment-14381759 ] Sam Tunnicliffe commented on CASSANDRA-9037: Thanks Tyler, I've pushed another commit to the branch with additional tests as requested. The changes to CqlRecordReader aren't unrelated, its inner WrappedRow class implements com.datastax.driver.core.Row which has been extended since version 2.1.2 of the driver (in [5c1e121f|https://github.com/datastax/java-driver/commit/5c1e121f0cc6e39e4d1349bb30f409ae486b3d97#diff-3b8ffce5c217f9096226305ecfd5a49a] [a0c42dd2|https://github.com/datastax/java-driver/commit/a0c42dd24f65d3b6e7c558dce68ae1b48c6da7f7]) Terminal UDFs evaluated at prepare time throw protocol version error Key: CASSANDRA-9037 URL: https://issues.apache.org/jira/browse/CASSANDRA-9037 Project: Cassandra Issue Type: Bug Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Fix For: 3.0 When a pure function with only terminal arguments (or with no arguments) is used in a where clause, it's executed at prepare time and {{Server.CURRENT_VERSION}} passed as the protocol version for serialization purposes. For native functions, this isn't a problem, but UDFs use classes in the bundled java-driver-core jar for (de)serialization of args and return values. 
When {{Server.CURRENT_VERSION}} is greater than the highest version supported by the bundled java driver the execution fails with the following exception: {noformat} ERROR [SharedPool-Worker-1] 2015-03-24 18:10:59,391 QueryMessage.java:132 - Unexpected error during query org.apache.cassandra.exceptions.FunctionExecutionException: execution of 'ks.overloaded[text]' failed: java.lang.IllegalArgumentException: No protocol version matching integer version 4 at org.apache.cassandra.exceptions.FunctionExecutionException.create(FunctionExecutionException.java:35) ~[main/:na] at org.apache.cassandra.cql3.udf.gen.Cksoverloaded_1.execute(Cksoverloaded_1.java) ~[na:na] at org.apache.cassandra.cql3.functions.FunctionCall.executeInternal(FunctionCall.java:78) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall.access$200(FunctionCall.java:34) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.execute(FunctionCall.java:176) ~[main/:na] at org.apache.cassandra.cql3.functions.FunctionCall$Raw.prepare(FunctionCall.java:161) ~[main/:na] at org.apache.cassandra.cql3.SingleColumnRelation.toTerm(SingleColumnRelation.java:108) ~[main/:na] at org.apache.cassandra.cql3.SingleColumnRelation.newEQRestriction(SingleColumnRelation.java:143) ~[main/:na] at org.apache.cassandra.cql3.Relation.toRestriction(Relation.java:127) ~[main/:na] at org.apache.cassandra.cql3.restrictions.StatementRestrictions.init(StatementRestrictions.java:126) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepareRestrictions(SelectStatement.java:787) ~[main/:na] at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:740) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:488) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:252) ~[main/:na] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:246) ~[main/:na] at 
org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:119) ~[main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:475) [main/:na] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:371) [main/:na] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final] at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_71] at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [main/:na] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na] at
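One natural mitigation for the failure above is to clamp the protocol version handed to the driver classes to the highest version the bundled driver supports. This is an assumption about the fix direction, not the committed patch, and the constant names below are invented for illustration:

```java
// Hypothetical sketch: never pass the bundled java-driver a protocol
// version it cannot map. Constant values are illustrative (the ticket's
// failure is "No protocol version matching integer version 4").
public class ProtocolClamp {
    static final int SERVER_CURRENT_VERSION = 4; // what the server speaks
    static final int DRIVER_MAX_SUPPORTED = 3;   // what the bundled driver knows

    // Version to use when (de)serializing UDF args and return values.
    static int versionForUdfSerialization() {
        return Math.min(SERVER_CURRENT_VERSION, DRIVER_MAX_SUPPORTED);
    }
}
```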
[jira] [Commented] (CASSANDRA-8917) Upgrading from 2.0.9 to 2.1.3 with 3 nodes, CL = quorum causes exceptions
[ https://issues.apache.org/jira/browse/CASSANDRA-8917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381768#comment-14381768 ] Gary Ogden commented on CASSANDRA-8917: --- We haven't attempted the upgrade again since we ran into this issue. Upgrading from 2.0.9 to 2.1.3 with 3 nodes, CL = quorum causes exceptions - Key: CASSANDRA-8917 URL: https://issues.apache.org/jira/browse/CASSANDRA-8917 Project: Cassandra Issue Type: Bug Environment: C* 2.0.9, Centos 6.5, Java 1.7.0_72, spring data cassandra 1.1.1, cassandra java driver 2.0.9 Reporter: Gary Ogden Fix For: 2.1.4 Attachments: b_output.log, jersey_error.log, node1-cassandra.yaml, node1-system.log, node2-cassandra.yaml, node2-system.log, node3-cassandra.yaml, node3-system.log We have Java apps running on GlassFish that read/write to our 3-node cluster running on 2.0.9. We have the CL set to quorum for all reads and writes. When we started to upgrade the first node and ran the sstable upgrade on that node, we started getting this error on reads and writes: com.datastax.driver.core.exceptions.UnavailableException: Not enough replica available for query at consistency QUORUM (2 required but only 1 alive) How is that possible when we have 3 nodes total and 2 of them were up, yet it's saying we can't meet the required CL? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9023) 2.0.13 write timeouts on driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9023: --- Fix Version/s: 2.0.14 2.0.13 write timeouts on driver --- Key: CASSANDRA-9023 URL: https://issues.apache.org/jira/browse/CASSANDRA-9023 Project: Cassandra Issue Type: Bug Reporter: anishek Fix For: 2.0.14 Attachments: out_system.log Initially asked @ http://www.mail-archive.com/user@cassandra.apache.org/msg41621.html Was suggested to post here. If any more details are required please let me know -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6477) Global indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-6477: Reviewer: Sam Tunnicliffe Global indexes -- Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8150: -- Assignee: Ryan McGuire (was: Brandon Williams) Revaluate Default JVM tuning parameters --- Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Ryan McGuire Attachments: upload.png It's been found that the old Twitter recommendation of 100m per core up to 800m is harmful and should no longer be used. Instead, the formula should be 1/3 or 1/4 of max heap, with a cap of 2G. Whether 1/3 or 1/4 is debatable and I'm open to suggestions. If I were to hazard a guess, 1/3 is probably better for releases greater than 2.1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
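Reading the ticket as young-generation sizing (the old 100m-per-core-up-to-800m rule was the classic HEAP_NEWSIZE guidance), the proposal reduces to a one-line calculation. The method name is invented; the divisor and 2G cap come straight from the ticket:

```java
// Sketch of the proposed rule: young gen = maxHeap / 3 (or / 4), capped
// at 2048 MB, replacing "100 MB per core up to 800 MB". Which divisor to
// use is still open in the ticket.
public class NewGenSizing {
    static long youngGenMB(long maxHeapMB, int divisor) {
        return Math.min(maxHeapMB / divisor, 2048);
    }
}
```

For example, a 4 GB heap with divisor 4 yields a 1 GB young generation, while an 8 GB heap hits the 2 GB cap.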
[jira] [Assigned] (CASSANDRA-8893) RandomAccessReader should share its FileChannel with all instances (via SegmentedFile)
[ https://issues.apache.org/jira/browse/CASSANDRA-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-8893: - Assignee: Stefania (was: Benedict) Stefania, can you take a stab at this? RandomAccessReader should share its FileChannel with all instances (via SegmentedFile) -- Key: CASSANDRA-8893 URL: https://issues.apache.org/jira/browse/CASSANDRA-8893 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Fix For: 3.0 There's no good reason to open a FileChannel for each (Compressed)?RandomAccessReader, and this would simplify RandomAccessReader to just a thin wrapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
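The sharing the ticket proposes can be sketched with plain NIO: one FileChannel owned by a SegmentedFile-like holder, each reader doing positional reads so no per-reader channel (or channel position) is needed. The class names are illustrative simplifications, not Cassandra's actual types:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: a single shared FileChannel; readers are thin wrappers.
public class SharedChannelSketch {
    static class SegmentedFile implements AutoCloseable {
        final FileChannel channel;
        SegmentedFile(Path path) throws IOException {
            channel = FileChannel.open(path, StandardOpenOption.READ);
        }
        Reader createReader() { return new Reader(channel); }
        public void close() throws IOException { channel.close(); }
    }

    // Positional read(dst, position) never touches the shared channel's
    // file position, so concurrent readers stay safe.
    static class Reader {
        private final FileChannel channel;
        Reader(FileChannel channel) { this.channel = channel; }
        int read(ByteBuffer dst, long position) throws IOException {
            return channel.read(dst, position);
        }
    }

    static String readAt(Path p, long pos, int len) throws IOException {
        try (SegmentedFile sf = new SegmentedFile(p)) {
            ByteBuffer buf = ByteBuffer.allocate(len);
            sf.createReader().read(buf, pos);
            buf.flip();
            return new String(buf.array(), 0, buf.remaining(), StandardCharsets.UTF_8);
        }
    }

    // Self-contained demo: write a temp file, read a slice positionally.
    public static String demo() {
        try {
            Path p = Files.createTempFile("segmented", ".db");
            Files.write(p, "hello world".getBytes(StandardCharsets.UTF_8));
            String s = readAt(p, 6, 5);
            Files.delete(p);
            return s;
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }
}
```

FileChannel's positional read is documented as usable concurrently, which is what makes sharing one channel per sstable viable.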
[jira] [Commented] (CASSANDRA-8150) Revaluate Default JVM tuning parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383232#comment-14383232 ] Hans van der Linde commented on CASSANDRA-8150: --- (out-of-office auto-reply; contact details and standard confidentiality notice omitted) Revaluate Default JVM tuning parameters --- Key: CASSANDRA-8150 URL: https://issues.apache.org/jira/browse/CASSANDRA-8150 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Matt Stump Assignee: Ryan McGuire Attachments: upload.png -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6680) Clock skew detection via gossip
[ https://issues.apache.org/jira/browse/CASSANDRA-6680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6680: -- Assignee: Stefania (was: Brandon Williams) Clock skew detection via gossip --- Key: CASSANDRA-6680 URL: https://issues.apache.org/jira/browse/CASSANDRA-6680 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Stefania Priority: Minor Fix For: 3.0 Gossip's HeartbeatState keeps the generation (local timestamp the node was started) and version (monotonically increasing per gossip interval) which could be used to roughly calculate the node's current time, enabling detection of gossip messages too far in the future for the clocks to be synced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
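The calculation hinted at above can be sketched directly: generation is the node's start time (epoch seconds) and version ticks roughly once per gossip interval, so generation + version approximates the remote clock. The interval constant and tolerance parameter below are assumptions for illustration, not values from the ticket:

```java
// Sketch of clock-skew detection from gossip HeartbeatState, per the
// ticket's description. GOSSIP_INTERVAL_SECONDS and the tolerance are
// assumed values; the estimate is deliberately rough.
public class ClockSkewSketch {
    static final long GOSSIP_INTERVAL_SECONDS = 1;

    static long estimatedRemoteEpochSeconds(long generation, long version) {
        return generation + version * GOSSIP_INTERVAL_SECONDS;
    }

    // Flag gossip whose implied clock is too far from ours to be synced.
    static boolean skewSuspected(long generation, long version,
                                 long localEpochSeconds, long toleranceSeconds) {
        long remote = estimatedRemoteEpochSeconds(generation, version);
        return Math.abs(remote - localEpochSeconds) > toleranceSeconds;
    }
}
```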
[jira] [Updated] (CASSANDRA-5969) Allow JVM_OPTS to be passed to sstablescrub
[ https://issues.apache.org/jira/browse/CASSANDRA-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-5969: -- Assignee: Stefania (was: Brandon Williams) Allow JVM_OPTS to be passed to sstablescrub --- Key: CASSANDRA-5969 URL: https://issues.apache.org/jira/browse/CASSANDRA-5969 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Adam Hattrell Assignee: Stefania Labels: lhf Can you add a feature request to pass JVM_OPTS to the sstablescrub script -- and other places where java is being called? (Among other things, this lets us run java stuff with -Djava.awt.headless=true on OS X so that Java processes don't pop up into the foreground -- i.e. we have a script that loops over all CFs and runs sstablescrub, and without that flag being passed in the OS X machine becomes pretty much unusable as it keeps switching focus to the java processes as they start.) --- a/resources/cassandra/bin/sstablescrub +++ b/resources/cassandra/bin/sstablescrub @@ -70,7 +70,7 @@ if [ x$MAX_HEAP_SIZE = x ]; then MAX_HEAP_SIZE=256M fi -$JAVA -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \ +$JAVA $JVM_OPTS -ea -cp $CLASSPATH -Xmx$MAX_HEAP_SIZE \ -Dlog4j.configuration=log4j-tools.properties \ org.apache.cassandra.tools.StandaloneScrubber $@ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
Roman Tkachenko created CASSANDRA-9045: -- Summary: Deleted columns are resurrected after repair in wide rows Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Priority: Critical Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. 
Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
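For context, the usual mechanism by which deletes resurrect is a tombstone being purged once it is older than gc_grace_seconds while some replica missed the delete; repair then streams the live column back. The reporter's gc_grace experiment argues against that being the cause here, but the timing rule is worth stating as a sketch (illustrative code, not Cassandra's compaction logic):

```java
// Sketch of the tombstone purge rule: a tombstone becomes purgeable during
// compaction once gc_grace_seconds have elapsed since the deletion. If a
// replica never saw the delete, repair after that point resurrects the data.
public class TombstonePurge {
    static boolean purgeable(long deletionEpochSeconds, long nowEpochSeconds,
                             long gcGraceSeconds) {
        return deletionEpochSeconds + gcGraceSeconds < nowEpochSeconds;
    }
}
```

This is why the standard guidance is to complete a full repair on every node within each gc_grace window.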
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382307#comment-14382307 ] Joshua McKenzie commented on CASSANDRA-8984: Tests on Windows are working after that last push. Given our other discussions about a new release cycle, I think the debate over whether we consider 2.1 a stable release will have a short shelf-life. I'm on the fence w/this change as it's largely a refactor of existing flow into codified objects, but we're also late in the 2.1 release cycle for changes that touch this much of the code-base in this fashion. Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 Attachments: 8984_windows_timeout.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382304#comment-14382304 ] Philip Thompson commented on CASSANDRA-9045: I'm very interested in the cqlsh traces for the delete and select queries. It doesn't seem like a repair issue, so I'm unassigning Yuki Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9033. Resolution: Not a Problem Yes, it is always OK to change compaction strategy, if doing that corrupts your data, that is of course an actual issue (if you have logs or can reproduce it, please file a new ticket) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh> describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh> {code} There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55] at
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Summary: Allow Cassandra config to be updated to restart Daemon without unloading classes (was: Allow Cassandra config to be updated to restart Deaemon without unloading classes) Allow Cassandra config to be updated to restart Daemon without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
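The pattern the attached patch enables can be illustrated generically: configuration applied once in a static initializer cannot be changed without unloading the class, whereas a public applyConfig(...) entry point can be re-invoked when an embedding application restarts the daemon. The classes below are illustrative stand-ins, not Cassandra's DatabaseDescriptor:

```java
// Sketch of the re-appliable-config pattern the patch exposes; the real
// change makes DatabaseDescriptor.applyConfig public, this is a generic
// illustration only.
public class ReapplyConfigSketch {
    static class Config {
        final int port;
        Config(int port) { this.port = port; }
    }

    static class Descriptor {
        private static Config current = new Config(0);
        // Public, re-invokable entry point instead of a one-shot static block.
        public static void applyConfig(Config c) { current = c; }
        public static int port() { return current.port; }
    }

    // Embedded restart: re-apply config without unloading Descriptor.
    public static int restartWith(int newPort) {
        Descriptor.applyConfig(new Config(newPort));
        return Descriptor.port();
    }
}
```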
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: (was: cassandra-2.0-8979-validator_patch.txt) MerkleTree mismatch for deleted and non-existing rows - Key: CASSANDRA-8979 URL: https://issues.apache.org/jira/browse/CASSANDRA-8979 Project: Cassandra Issue Type: Bug Components: Core Reporter: Stefan Podkowinski Assignee: Yuki Morishita Attachments: cassandra-2.0-8979-lazyrow_patch.txt, cassandra-2.0-8979-validator_patch.txt, cassandra-2.0-8979-validatortest_patch.txt, cassandra-2.1-8979-lazyrow_patch.txt, cassandra-2.1-8979-validator_patch.txt Validation compaction will currently create different hashes for rows that have been deleted compared to nodes that have not seen the rows at all or have already compacted them away. In case this sounds familiar to you, see CASSANDRA-4905 which was supposed to prevent hashing of expired tombstones. This still seems to be in place, but does not address the issue completely. Or there was a change in 2.0 that rendered the patch ineffective. The problem is that rowHash() in the Validator will return a new hash in any case, whether the PrecompactedRow did actually update the digest or not. This will lead to the case that a purged, PrecompactedRow will not change the digest, but we end up with a different tree compared to not having rowHash called at all (such as in case the row already doesn't exist). As an implication, repair jobs will constantly detect mismatches between older sstables containing purgable rows and nodes that have already compacted these rows. After transfering the reported ranges, the newly created sstables will immediately get deleted again during the following compaction. This will happen for each repair run over again until the sstable with the purgable row finally gets compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
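The fix direction implied by the description is to fold a row's hash into the Merkle tree only when the row actually contributed bytes to the digest, so a fully purged row hashes the same as a row that never existed. The counting wrapper below is an illustrative sketch, not Cassandra's Validator:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: track bytes fed to the digest; if a "row" turns out to be entirely
// purgable tombstones, report that nothing should be mixed into the tree.
public class RowHashSketch {
    static class CountingDigest {
        final MessageDigest digest;
        long bytesWritten = 0;
        CountingDigest() {
            try {
                digest = MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new AssertionError(e); // MD5 is always present
            }
        }
        void update(byte[] b) { digest.update(b); bytesWritten += b.length; }
    }

    // Returns null when the row contributed nothing: the caller must then
    // leave the tree untouched rather than mixing in a spurious hash.
    static byte[] rowHash(CountingDigest d, byte[]... cells) {
        for (byte[] cell : cells)
            if (cell != null)           // null models a purged tombstone
                d.update(cell);
        return d.bytesWritten == 0 ? null : d.digest.digest();
    }
}
```

With the unconditional rowHash() the ticket describes, the purged-row case would still perturb the tree, producing the perpetual repair mismatches reported.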
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: cassandra-2.0-8979-validatortest_patch.txt cassandra-2.0-8979-validator_patch.txt cassandra-2.0-8979-lazyrow_patch.txt cassandra-2.1-8979-validator_patch.txt cassandra-2.1-8979-lazyrow_patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson resolved CASSANDRA-9033. Resolution: Duplicate Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh> describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh> {code} There were no log entries that stood out. It pretty much consisted of "x is down" / "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55] at
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Assignee: Yuki Morishita Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Yuki Morishita Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. 
Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
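For context, the textbook resurrection mechanism that irregular repairs can trigger (which may or may not be what the reporter is hitting) can be sketched in a few lines. This is a toy model, not Cassandra code; replica, compaction, and repair semantics are drastically simplified: if a tombstone misses a replica and is purged on the others after gc_grace_seconds, a later repair streams the still-live column back to everyone.

```python
GC_GRACE_SECONDS = 864000  # 10 days, as in the bounces schema above

class Replica:
    def __init__(self):
        self.cells = {}        # key -> (value, write_ts)
        self.tombstones = {}   # key -> deletion_ts

    def delete(self, key, ts):
        self.cells.pop(key, None)
        self.tombstones[key] = ts

    def compact(self, now):
        # Tombstones older than gc_grace are purged for good.
        self.tombstones = {k: ts for k, ts in self.tombstones.items()
                           if now - ts < GC_GRACE_SECONDS}

def repair(replicas):
    # Grossly simplified: union of live cells, minus still-known tombstones.
    merged = {}
    for r in replicas:
        merged.update(r.cells)
    for r in replicas:
        for k, del_ts in r.tombstones.items():
            if k in merged and merged[k][1] <= del_ts:
                merged.pop(k)
    for r in replicas:
        r.cells = dict(merged)

r1, r2, r3 = Replica(), Replica(), Replica()
for r in (r1, r2, r3):
    r.cells["alice@example.com"] = ("bounce", 100)

now = 1000
r1.delete("alice@example.com", now)   # the delete reaches only 2 of 3 replicas
r2.delete("alice@example.com", now)

much_later = now + GC_GRACE_SECONDS + 1
for r in (r1, r2, r3):
    r.compact(much_later)             # tombstones purged on r1 and r2

repair((r1, r2, r3))
print("alice@example.com" in r1.cells)  # True: the deleted column is back
```

Note that the reporter's test deletes and repairs within gc_grace, so this simple model does not fully explain the behaviour; it only shows why repair cadence and gc_grace_seconds interact.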
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson commented on CASSANDRA-9045: What CL are you deleting at? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson edited comment on CASSANDRA-9045 at 3/26/15 5:31 PM: - To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? was (Author: philipthompson): What CL are you deleting at? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382278#comment-14382278 ] Philip Thompson edited comment on CASSANDRA-9045 at 3/26/15 5:33 PM: - To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? How long are the repairs taking? Is it within gc_grace? Can you attach traces of the delete query, and then the select query that returns the deleted entry? was (Author: philipthompson): To confirm, the delete is at LOCAL_QUORUM? Can you attach a system log of a node undergoing the repair? Possibly at DEBUG? Do you see the issue if you delete at ALL? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382298#comment-14382298 ] Roman Tkachenko commented on CASSANDRA-9045: Hi Philip - thanks for the quick response. Yes, normally the delete is LOCAL_QUORUM, but in my tests I was using ALL as well, with the same results. Let me see if I can enable DEBUG logging and run repair again. That's gonna be a lot of logs, I imagine... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382217#comment-14382217 ] Stefan Podkowinski commented on CASSANDRA-8979: --- I think the main problem here is that digest.update() is always called with the top-level row tombstone even if no tombstone exists. In this case the digest is updated with DeletionTime.LIVE, which doesn't seem to be correct. Patches have been updated, but I couldn't test 2.1 yet. I've also created a test cluster [bootstrap script|https://github.com/spodkowinski/phantom-cabinet-cases/tree/master/CASSANDRA-8979] to reproduce the problem for 2.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
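The digest.update()/DeletionTime.LIVE observation in the comment above can be paraphrased with a toy digest (plain Python; the field layout and function names are invented, not the actual Validator/AbstractCompactedRow code): mixing the LIVE sentinel into the digest produces a different row hash than omitting it, so the two code paths that compute a row's digest must agree on whether a live deletion time is included.

```python
import hashlib

# Hypothetical stand-in for DeletionTime.LIVE:
# (localDeletionTime = Integer.MAX_VALUE, markedForDeleteAt = Long.MIN_VALUE)
LIVE = (0x7FFFFFFF, -9223372036854775808)

def digest_row(cells, deletion_time, skip_live_deletion):
    """Digest a row's cells, optionally skipping a LIVE top-level deletion."""
    d = hashlib.sha256()
    for name, value in cells:
        d.update(name)
        d.update(value)
    # The suspected bug: the top-level deletion is mixed in even when LIVE.
    if not skip_live_deletion or deletion_time != LIVE:
        for field in deletion_time:
            d.update(str(field).encode())
    return d.hexdigest()

cells = [(b"address", b"alice@example.com")]
buggy = digest_row(cells, LIVE, skip_live_deletion=False)
fixed = digest_row(cells, LIVE, skip_live_deletion=True)

cells_only = hashlib.sha256()
for n, v in cells:
    cells_only.update(n)
    cells_only.update(v)
print(buggy == cells_only.hexdigest(), fixed == cells_only.hexdigest())  # False True
```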
[jira] [Updated] (CASSANDRA-8845) sorted CQLSSTableWriter accept unsorted clustering keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8845: -- Attachment: 8845-2.1.txt There was a change in ArrayBackedSortedColumns which makes sure that the rows are properly sorted when cached. The partition keys still need to be in sorted order, so only the clustering columns can change. Attached is a patch which changes the javadoc to reflect this change. sorted CQLSSTableWriter accept unsorted clustering keys --- Key: CASSANDRA-8845 URL: https://issues.apache.org/jira/browse/CASSANDRA-8845 Project: Cassandra Issue Type: Bug Reporter: Pierre N. Assignee: Carl Yeksigian Fix For: 2.1.4 Attachments: 8845-2.1.txt, TestSorted.java The javadoc says: {quote} The SSTable sorted order means that rows are added such that their partition key respect the partitioner order and for a given partition, that *the rows respect the clustering columns order*. public Builder sorted() {quote} It throws an exception when partition keys are in incorrect order; however, it doesn't throw when rows are inserted with clustering keys in incorrect order. It buffers them and sorts them into the correct order. {code} writer.addRow(1, 3); writer.addRow(1, 1); writer.addRow(1, 2); {code} {code} $ sstable2json sorted/ks/t1/ks-t1-ka-1-Data.db [ {"key": "1", "cells": [["\u0000\u0000\u0000\u0001:","",1424524149557000], ["\u0000\u0000\u0000\u0002:","",1424524149557000], ["\u0000\u0000\u0000\u0003:","",142452414955]]} ] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
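The behaviour the reporter observed, and that the javadoc patch documents, can be modelled with a toy writer (a hypothetical sketch, not the real CQLSSTableWriter API): partition keys must arrive in partitioner order or an error is raised, while clustering keys within a partition may arrive unsorted and are buffered and sorted before the partition is written out.

```python
class SortedWriterSketch:
    """Toy model of the documented contract: partition keys must be added in
    order; clustering keys within a partition are sorted on flush."""

    def __init__(self):
        self.last_pk = None   # last partition key seen
        self.current = []     # buffered clustering keys for current partition
        self.flushed = []     # (partition_key, sorted clustering keys) pairs

    def add_row(self, pk, ck):
        if self.last_pk is not None and pk < self.last_pk:
            raise ValueError("partition keys must respect partitioner order")
        if pk != self.last_pk:
            self.flush()      # starting a new partition: flush the old one
            self.last_pk = pk
        self.current.append(ck)

    def flush(self):
        if self.current:
            self.flushed.append((self.last_pk, sorted(self.current)))
            self.current = []

w = SortedWriterSketch()
w.add_row(1, 3)
w.add_row(1, 1)
w.add_row(1, 2)
w.flush()
print(w.flushed)  # [(1, [1, 2, 3])] -- clustering keys sorted, no error
```

This mirrors the reporter's sstable2json output: the three rows added as 3, 1, 2 end up on disk in clustering order 1, 2, 3.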
[jira] [Reopened] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reopened CASSANDRA-9033: reopening to close as duplicate Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema:

{code}
cqlsh> describe columnfamily data.stories

CREATE TABLE data.stories (
    id timeuuid PRIMARY KEY,
    action_data timeuuid,
    action_name text,
    app_id timeuuid,
    app_instance_id timeuuid,
    data map<text, text>,
    objects set<timeuuid>,
    time_stamp timestamp,
    user_id timeuuid
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see'
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{code}

There were no log entries that stood out. It pretty much consisted of x is down x is up repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data.

On another note, we see a lot of this during repair now (on all the nodes):

{code}
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_55]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_55]
    at
[jira] [Assigned] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson reassigned CASSANDRA-9045: -- Assignee: Philip Thompson (was: Yuki Morishita) Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL.

h5. Problem

We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL

I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair.

h5. Other steps I've taken so far

Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test: updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related, I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks!
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
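For context on the failure mode described above: whether a replica may purge a tombstone is simple arithmetic on its deletion time and gc_grace_seconds, and resurrection happens when some replicas purge the tombstone while another replica never received the delete, so repair streams the "live" column back. A minimal sketch of that rule, with illustrative names rather than Cassandra's actual code:

```java
// Sketch of the tombstone-purge rule referenced above: a tombstone may be
// dropped by compaction once (localDeletionTime + gcGraceSeconds) has passed.
// If some replicas compact it away while another never saw the delete, repair
// streams the deleted column back. Names here are illustrative.
public class TombstonePurge {
    static boolean purgeable(long localDeletionTime, long gcGraceSeconds, long now) {
        return localDeletionTime + gcGraceSeconds < now;
    }

    public static void main(String[] args) {
        long gcGrace = 864000; // 10 days, the table's gc_grace_seconds
        long deletedAt = 1_000_000L;
        System.out.println(purgeable(deletedAt, gcGrace, deletedAt + 863_999)); // false
        System.out.println(purgeable(deletedAt, gcGrace, deletedAt + 864_001)); // true
    }
}
```

This is why repairs must complete within gc_grace on every node: once that window lapses on any replica, the tombstone can vanish there while the pre-delete data survives elsewhere.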
[jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
[ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382302#comment-14382302 ] Tyler Hobbs commented on CASSANDRA-8993: I'll try to explain a bit about how downsampling works overall so that more people besides myself understand how it works :) I can put whatever info is useful into comments for posterity. bq. If I print out the original indices and effective intervals, it seems that at the first downsampling level (64) The sampling level after minimal downsampling is 127, not 64. The sampling level can be anywhere between 0 and BASE_SAMPLING_LEVEL. When a summary moves from sampling level 128 to level 127, it will drop one summary entry with an index between \[0, 127\], one entry between \[127, 255\], and so on for the rest of the summary. The index to drop is determined by {{Downsampling.getSamplingPattern()}}. The list of integers returned from {{Downsampling.getSamplingPattern(BASE_SAMPLING_LEVEL)}} are the indexes that we'll drop for each round of downsampling. As an example, suppose BASE_SAMPLING_LEVEL is 16 instead of 128. {{Downsampling.getSamplingPattern(16)}} returns the following pattern: {noformat} 15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0 {noformat} So, when we move from sampling level 16 to 15, we'll drop the entry at index 15 (and repeat that for indexes 15 + (16 * 1), 15 + (16 * 2), 15 + (16 * 3), etc). When we move from sampling level 15 to 14, we'll drop the entry at index 7 (and repeat as before, but take into account the fact that we've already dropped the entry at index 15). This pattern of dropping minimizes the maximum distance between remaining summary entries. Now, in practice, we will never move from sampling level 128 directly to level 127 because of IndexSummaryManager's {{DOWNSAMPLE_THRESHOLD}}. However, an index summary could go through multiple rounds of down and upsampling and arrive at level 127, so we need to be able to handle that. 
bq. Further confusion to understanding Downsampling as a whole stems from the permission of a -1 index into getEffectiveIndexIntervalAfterIndex without explanation Hmm, yeah, looking at the code, I don't think we actually need to handle that. I believe it is leftover logic from earlier in the development of the code when downsampling would remove the 0th index in an earlier round. With the current code, the 0th index entry should always be present. I'll make some changes to remove that. bq. and the fact that every effective interval is the same despite there being multiple avenues for calculating it I'm not sure what you mean here. EffectiveIndexInterval calculation is incorrect --- Key: CASSANDRA-8993 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Blocker Fix For: 2.1.4 Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt I'm not familiar enough with the calculation itself to understand why this is happening, but see discussion on CASSANDRA-8851 for the background. I've introduced a test case to look for this during downsampling, but it seems to pass just fine, so it may be an artefact of upgrading. The problem was, unfortunately, not manifesting directly because it would simply result in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does not use the effective interval, and provided the result to getPosition(). I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs] can follow up with an investigation as to how it actually went wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
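The drop pattern Tyler quotes for BASE_SAMPLING_LEVEL = 16 can be reproduced with a short recursive sketch (an illustrative reimplementation, not the actual Downsampling class): odd indexes are dropped before even ones, and each half is ordered by recursing on samplingLevel / 2, which is what minimizes the maximum distance between the summary entries that remain.

```java
import java.util.*;

// Illustrative reimplementation of the sampling pattern described above
// (not the actual org.apache.cassandra.io.sstable.Downsampling source):
// the entry dropped in each downsampling round interleaves odds before evens,
// ordering each half by recursing on samplingLevel / 2.
public class SamplingPattern {
    static List<Integer> getSamplingPattern(int samplingLevel) {
        if (samplingLevel <= 1)
            return Collections.singletonList(0);
        List<Integer> half = getSamplingPattern(samplingLevel / 2);
        List<Integer> pattern = new ArrayList<>(samplingLevel);
        for (int i : half) pattern.add(2 * i + 1); // odd indexes first
        for (int i : half) pattern.add(2 * i);     // then the evens
        return pattern;
    }

    public static void main(String[] args) {
        // Matches the pattern quoted in the comment for BASE_SAMPLING_LEVEL = 16:
        System.out.println(getSamplingPattern(16));
        // [15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0]
    }
}
```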
[jira] [Created] (CASSANDRA-9046) Allow Cassandra config to be updated to allow restarting without unloading classes
Emmanuel Hugonnet created CASSANDRA-9046: Summary: Allow Cassandra config to be updated to allow restarting without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382301#comment-14382301 ] Roman Tkachenko commented on CASSANDRA-9045: Repairs are definitely within gc_grace, which is 10 days. A repair of a single node (nodetool repair blackbook bounce) takes about 1.5 hours. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Yuki Morishita Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382330#comment-14382330 ] Roman Tkachenko commented on CASSANDRA-9045: I'll run the test and try to get them to you. Not so sure about the logs though. I've enabled DEBUG and the node hasn't finished starting yet but has already produced ~1GB of logs. If you know how to enable debug mode just for the repair/compaction components, let me know. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
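On enabling DEBUG only for the repair/compaction components: Cassandra 2.0 is configured through log4j, so per-package log levels can be set in conf/log4j-server.properties rather than raising the root logger. The package names below are the ones visible in the stack traces in this thread (org.apache.cassandra.repair, org.apache.cassandra.db.compaction); treat the fragment as a sketch:

```properties
# conf/log4j-server.properties (Cassandra 2.0.x) -- enable DEBUG only for the
# repair and compaction packages instead of turning it on via the root logger.
log4j.logger.org.apache.cassandra.repair=DEBUG
log4j.logger.org.apache.cassandra.db.compaction=DEBUG
```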
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Summary: Allow Cassandra config to be updated to restart Daemon without unloading classes (was: Allow Cassandra config to be updated to allow restarting without unloading classes) Allow Cassandra config to be updated to restart Daemon without unloading classes - Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8979) MerkleTree mismatch for deleted and non-existing rows
[ https://issues.apache.org/jira/browse/CASSANDRA-8979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-8979: -- Attachment: (was: cassandra-2.0-8979-test.txt) MerkleTree mismatch for deleted and non-existing rows - Key: CASSANDRA-8979 URL: https://issues.apache.org/jira/browse/CASSANDRA-8979 Project: Cassandra Issue Type: Bug Components: Core Reporter: Stefan Podkowinski Assignee: Yuki Morishita Attachments: cassandra-2.0-8979-lazyrow_patch.txt, cassandra-2.0-8979-validator_patch.txt, cassandra-2.0-8979-validatortest_patch.txt, cassandra-2.1-8979-lazyrow_patch.txt, cassandra-2.1-8979-validator_patch.txt Validation compaction will currently create different hashes for rows that have been deleted compared to nodes that have not seen the rows at all or have already compacted them away. In case this sounds familiar to you, see CASSANDRA-4905, which was supposed to prevent hashing of expired tombstones. This still seems to be in place, but it does not address the issue completely; or there was a change in 2.0 that rendered the patch ineffective. The problem is that rowHash() in the Validator will return a new hash in any case, whether the PrecompactedRow did actually update the digest or not. This leads to the case where a purged PrecompactedRow will not change the digest, but we end up with a different tree compared to not having rowHash called at all (such as when the row already doesn't exist). As an implication, repair jobs will constantly detect mismatches between older sstables containing purgeable rows and nodes that have already compacted these rows. After transferring the reported ranges, the newly created sstables will immediately get deleted again during the following compaction. This will happen again on each repair run until the sstable with the purgeable row finally gets compacted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
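The mismatch described above can be modelled in a few lines: treat each range's Merkle tree hash as a digest over per-row contributions. If the validator mixes a row's token into the digest even when the row itself contributed no bytes, a node still holding a purgeable tombstone and a node that already compacted it produce different trees. A sketch with MD5 (illustrative, not the Validator's actual hashing):

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Model of the Validator issue described above: mixing a purged row's token
// into the range digest -- even though the row contributes no data bytes --
// yields a different tree than not visiting the row at all. Illustrative only.
public class MerkleMismatch {
    static byte[] rangeDigest(boolean includePurgedRow) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update("token:1|row:alice".getBytes("UTF-8")); // a live row both nodes share
        if (includePurgedRow)
            md.update("token:2".getBytes("UTF-8")); // purged row: token mixed in, no row bytes
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] withTombstone = rangeDigest(true);   // node that still has the purgeable row
        byte[] compactedAway = rangeDigest(false);  // node that already compacted it away
        System.out.println(Arrays.equals(withTombstone, compactedAway)); // false -> repair mismatch
    }
}
```

The spurious mismatch then drives exactly the loop the report describes: the ranges are streamed, the streamed sstables are purged by the next compaction, and the next repair starts over.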
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382231#comment-14382231 ] T Jake Luciani commented on CASSANDRA-8085: --- Technically it was [~slebresne]; mine were just bumps from releases. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
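To see why the round count dominates authentication latency: the work factor scales the per-attempt cost directly. bcrypt is not in the JDK, so as a stdlib-only illustration of the same cost knob this sketch times PBKDF2 via javax.crypto at two iteration counts (the password, salt, and counts are all illustrative, and PBKDF2 here stands in for bcrypt's rounds parameter):

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Illustration of the cost/latency trade-off discussed above, using PBKDF2
// from the JDK (bcrypt itself is not in the standard library). Raising the
// iteration count raises the time spent per authentication attempt, which is
// exactly what a configurable rounds setting trades off for short-lived
// connections.
public class HashingCost {
    static long timeHashNanos(int iterations) throws Exception {
        PBEKeySpec spec = new PBEKeySpec("password".toCharArray(),
                                         "0123456789abcdef".getBytes("UTF-8"),
                                         iterations, 256);
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        long start = System.nanoTime();
        f.generateSecret(spec).getEncoded();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("1024 iterations:  %d us%n", timeHashNanos(1 << 10) / 1000);
        System.out.printf("65536 iterations: %d us%n", timeHashNanos(1 << 16) / 1000);
    }
}
```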
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Reproduced In: 2.0.13, 2.0.10 (was: 2.0.10, 2.0.13) Fix Version/s: 2.0.14 Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to allow restarting without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Hugonnet updated CASSANDRA-9046: - Attachment: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Allow Cassandra config to be updated to allow restarting without unloading classes -- Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382150#comment-14382150 ] Brent Haines commented on CASSANDRA-9033: - Holy shit. You're right. I apologize. Here is what happened: the stories table *was* LCS before we failed last week. When I built the schema for the replacement cluster, I stuck with STCS because the original failure made me nervous. It was stuck in my head that this was an LCS table, so I didn't actually review the results of the describe columnfamily. Or look for STCS-related bugs... I'm stupid, sorry. I will try the workaround, thanks for that. The initial failure was using LCS (I swear it), but the replacement cluster obviously failed for the reasons you gave. I'll set up the work-around you gave and wait for 2.1.4. One final question: is it generally ok to change compaction strategies on a large table? Before we restarted this, I tried to change from LCS to STCS and the key store was corrupted. Thanks for the help. Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382242#comment-14382242 ] Sylvain Lebresne commented on CASSANDRA-8085: - Almost surely due to a release version bump on my part too. This is why we should only set a single fix version before commit (the committer can update it to whatever version was actually committed to once the ticket is resolved); otherwise there is no simple way to bump versions, and that is what happened. TL;DR, the removal of 2.0 from the fix versions was an accident. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8989) Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-8989: -- Attachment: 8989-2.0.txt Backported the patch from CASSANDRA-6863 for 2.0. I tested with a mixed cluster of 2.0.12 and patched nodes and there are no additional read repair requests; the 2.0.12 node continues to exhibit this behavior. Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas --- Key: CASSANDRA-8989 URL: https://issues.apache.org/jira/browse/CASSANDRA-8989 Project: Cassandra Issue Type: Bug Components: Core Reporter: Miroslaw Partyka Assignee: Carl Yeksigian Priority: Critical Attachments: 8989-2.0.txt, trace.txt When reading from a table under the aforementioned conditions, each read from a replica also causes a write to the replica. Confirmed in versions 2.0.12 and 2.0.13; version 2.1.3 seems ok. To reproduce: {code}
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2};
USE test;
CREATE TABLE bug(id int PRIMARY KEY, val map<int, int>);
INSERT INTO bug(id, val) VALUES (1, {2: 3});
CONSISTENCY LOCAL_QUORUM;
TRACING ON;
SELECT * FROM bug WHERE token(id) = 0;
{code} The trace contains "Appending to commitlog" and "Adding to bug memtable" twice each. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382531#comment-14382531 ] Roman Tkachenko commented on CASSANDRA-9045: I have attached an excerpt from cqlsh session showing select - delete - select - repair - select with tracing on. The very last select was issued after repair was done. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue:
* delete an entry
* verify it's not returned even with CL=ALL
* run repair on nodes that own this row's key
* the columns reappear and are returned even with CL=ALL
I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so we can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382524#comment-14382524 ] Ariel Weisberg edited comment on CASSANDRA-8670 at 3/26/15 7:52 PM: NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. Users of buffered input stream get this behavior and I didn't want to change it. DataInput and company pull bytes out one at a time even for multi-byte types. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. 
The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. was (Author: aweisberg): NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. 
The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. Large columns + NIO memory pooling causes excessive direct memory usage --- Key: CASSANDRA-8670 URL: https://issues.apache.org/jira/browse/CASSANDRA-8670 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: largecolumn_test.py If you provide a large byte array to NIO
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382579#comment-14382579 ] Carl Yeksigian commented on CASSANDRA-9048: --- While I think this is really useful, I don't see why this would live in-tree, especially given part of the worry here is that cqlsh only works with a single version of Cassandra at a time -- I would imagine this would live in much the same way. Since it doesn't utilize anything in-tree, it would make sense to keep this as a separate repository. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much closer matches the format of the data more often than the SSTable format itself, and a tool that loads from delimited files is very useful. 
In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum:
- support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader)
- supply the CQL schema for the input data
- support all data types other than collections (collections are a stretch goal/need)
- an option to specify the delimiter
- an option to specify comma as the decimal delimiter (for international use cases)
- an option to specify how NULL values are represented in the file (e.g., the empty string or the string NULL)
- an option to specify how BOOLEAN values are represented in the file (e.g., TRUE/FALSE or 0/1)
- an option to specify the Date and Time format
- an option to skip some number of rows at the beginning of the file
- an option to only read in some number of rows from the file
- an option to indicate how many parse errors to tolerate
- an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors)
- an option to specify the CQL port to connect to (with 9042 as the default)
Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH, which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
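A couple of the proposed options (the field delimiter and the NULL marker) can be illustrated with a minimal parsing sketch. This is hypothetical code, not taken from the attached patch; the class and method names are invented for illustration, and it deliberately ignores harder cases such as quoted fields and escape characters:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: how per-line parsing could honor a configurable
// delimiter and NULL marker. Names are illustrative, not from the patch.
public class DelimitedLineParser {
    private final char delimiter;
    private final String nullString;

    public DelimitedLineParser(char delimiter, String nullString) {
        this.delimiter = delimiter;
        this.nullString = nullString;
    }

    /** Splits one line on the delimiter; fields equal to the NULL marker become null. */
    public List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= line.length(); i++) {
            if (i == line.length() || line.charAt(i) == delimiter) {
                String field = line.substring(start, i);
                fields.add(field.equals(nullString) ? null : field);
                start = i + 1;
            }
        }
        return fields;
    }
}
```

A real loader would layer the remaining options (type conversion, error tolerance, bad-line file) on top of a split step like this before binding the fields to a prepared CQL statement.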
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382617#comment-14382617 ] Aleksey Yeschenko commented on CASSANDRA-9048: -- We already have plans for a Spark-based, multiple-format data import/export tool. CSV files will be the first supported format, with other Cassandra tables supported too (see CASSANDRA-8234). That tool, once done, will go in the tree, and supersede CQLSH's COPY, among other things. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382507#comment-14382507 ] Philip Thompson commented on CASSANDRA-9048: The comment in StringParser needs to be fixed; it does not reflect what the method does. You also don't follow the code style everywhere [1]. [1] http://wiki.apache.org/cassandra/CodeStyle Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Tkachenko updated CASSANDRA-9045: --- Attachment: cqlsh.txt Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage
[ https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382524#comment-14382524 ] Ariel Weisberg commented on CASSANDRA-8670: --- NIODataInputStream bq. readNext() should assert it is never shuffling more than 7 bytes; in fact ideally this would be done by readMinimum() to make it clearer By assert you mean an assert that compiles out or a precondition? bq. readNext() should IMO never shuffle unless it's at the end of its capacity; if it hasRemaining() and limit() != capacity() it should read on from its current limit (readMinimum can ensure there is room to fully meet its requirements) I guess I don't get when this optimization will help. I could see it hurting. You could stream through the buffer not returning to the beginning on a regular basis and end up issuing smaller than desired reads. NIODataOutputStreamPlus bq. available() should return the bytes in the buffer at least I duplicated the JDK behavior for NIO. DataInputStream for a socket returns 0, for a file it returns the bytes remaining to read from the file. I think it makes sense for the API when you don't have a real answer. bq. why the use of hollowBuffer? For clarity in case of restoring the cursor position during exceptions? Would be helpful to clarify with a comment. It seems like perhaps this should only be used for the first branch, though, since the second should have no risk of throwing an exception, so we can safely restore the position. It seems like it might be best to make hollowBuffer default to null, and instantiate it only if it is larger than our buffer size, otherwise first flushing our internal buffer if we haven't got enough room. This way we should rarely need the hollowBuffer. The contract of the API requires that the incoming buffer not be modified. For thread safety reasons I don't modify the original buffer's position and then reset it in a finally block. 
I am not sure what you mean by hollow buffer larger than our buffer. It's hollow so it has no size. We also use it to copy things into our buffer while preserving the original position. The rest is reasonable. Large columns + NIO memory pooling causes excessive direct memory usage --- Key: CASSANDRA-8670 URL: https://issues.apache.org/jira/browse/CASSANDRA-8670 Project: Cassandra Issue Type: Bug Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 Attachments: largecolumn_test.py If you provide a large byte array to NIO and ask it to populate the byte array from a socket it will allocate a thread local byte buffer that is the size of the requested read no matter how large it is. Old IO wraps new IO for sockets (but not files) so old IO is affected as well. Even if you are using Buffered{Input | Output}Stream you can end up passing a large byte array to NIO. The byte array read method will pass the array to NIO directly if it is larger than the internal buffer. Passing large cells between nodes as part of intra-cluster messaging can cause the NIO pooled buffers to quickly reach a high watermark and stay there. This ends up costing 2x the largest cell size because there is a buffer for input and output since they are different threads. This is further multiplied by the number of nodes in the cluster - 1 since each has a dedicated thread pair with separate thread locals. Anecdotally it appears that the cost is doubled beyond that although it isn't clear why. Possibly the control connections or possibly there is some way in which multiple Need a workload in CI that tests the advertised limits of cells on a cluster. It would be reasonable to ratchet down the max direct memory for the test to trigger failures if a memory pooling issue is introduced. I don't think we need to test concurrently pulling in a lot of them, but it should at least work serially. 
The obvious fix to address this issue would be to read in smaller chunks when dealing with large values. I think small should still be relatively large (4 megabytes) so that code that is reading from a disk can amortize the cost of a seek. It can be hard to tell what the underlying thing being read from is going to be in some of the contexts where we might choose to implement switching to reading chunks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
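The chunked-read fix described above can be sketched in isolation. The helper below is illustrative, not Cassandra's actual stream classes, and the chunk size is kept tiny purely for demonstration (the discussion above suggests ~4 MB in practice so disk reads still amortize seek cost):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedReader
{
    // Illustrative chunk size; kept tiny only for demonstration.
    public static final int CHUNK_SIZE = 4;

    // Fill dst completely by issuing reads of at most CHUNK_SIZE bytes, so the
    // underlying layer never sees one huge destination array (and so NIO's
    // per-thread temporary direct buffer stays bounded at CHUNK_SIZE).
    public static void readFully(InputStream in, byte[] dst) throws IOException
    {
        int offset = 0;
        while (offset < dst.length)
        {
            int n = in.read(dst, offset, Math.min(CHUNK_SIZE, dst.length - offset));
            if (n < 0)
                throw new IOException("unexpected EOF after " + offset + " bytes");
            offset += n;
        }
    }

    public static void main(String[] args) throws IOException
    {
        byte[] src = "a large value read in small chunks".getBytes(StandardCharsets.UTF_8);
        byte[] dst = new byte[src.length];
        readFully(new ByteArrayInputStream(src), dst);
        System.out.println(new String(dst, StandardCharsets.UTF_8));
    }
}
```

Because each read() call presents at most CHUNK_SIZE bytes of destination at once, the temporary buffer NIO allocates per thread stays bounded regardless of how large the target array is.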
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382534#comment-14382534 ] Roman Tkachenko commented on CASSANDRA-9045: Forgot to mention that before the test I restored the original in memory compaction limit to the default 64MB so the row does not fit into this limit. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB so can't increase this parameter forever, if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382532#comment-14382532 ] Philip Thompson commented on CASSANDRA-9045: [~thobbs], this will be most meaningful to you. The Digest Mismatch seems interesting to me, how could that happen at CL=ALL for all operations? Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Attachments: cqlsh.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Reproduced In: 2.0.13, 2.0.10 (was: 2.0.10, 2.0.13) Tester: Philip Thompson Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] cassandra git commit: Backport CASSANDRA-8085 to cassandra-2.0
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 14327e4b9 - 93156d761 Backport CASSANDRA-8085 to cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1b1acae9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1b1acae9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1b1acae9 Branch: refs/heads/cassandra-2.1 Commit: 1b1acae9afbd6faf2f628d5ae7ba0763aaac1e86 Parents: 9625910 Author: Tyler Hobbs ty...@datastax.com Authored: Thu Mar 26 13:22:09 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Thu Mar 26 13:22:09 2015 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/auth/PasswordAuthenticator.java | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 293dc55..adc0d59 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.14: + * Make PasswordAuthenticator number of hashing rounds configurable (CASSANDRA-8085) * Lower logging level from ERROR to DEBUG when a scheduled schema pull cannot be completed due to a node being down (CASSANDRA-9032) * Fix MOVED_NODE client event (CASSANDRA-8516) http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java -- diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java index e4c00b7..3c6d1af 100644 --- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java +++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java @@ -53,8 +53,19 @@ public class PasswordAuthenticator implements ISaslAwareAuthenticator { private static final Logger logger = LoggerFactory.getLogger(PasswordAuthenticator.class); -// 2 ** GENSALT_LOG2_ROUNS rounds of hashing will be performed. 
-private static final int GENSALT_LOG2_ROUNDS = 10;
+// 2 ** GENSALT_LOG2_ROUNDS rounds of hashing will be performed.
+private static final String GENSALT_LOG2_ROUNDS_PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";
+private static final int GENSALT_LOG2_ROUNDS = getGensaltLogRounds();
+
+static int getGensaltLogRounds()
+{
+    int rounds = Integer.getInteger(GENSALT_LOG2_ROUNDS_PROPERTY, 10);
+    if (rounds < 4 || rounds > 31)
+        throw new RuntimeException(new ConfigurationException(String.format("Bad value for system property -D%s."
+                                                                            + " Please use a value between 4 and 31",
+                                                                            GENSALT_LOG2_ROUNDS_PROPERTY)));
+    return rounds;
+}

 // name of the hash column.
 private static final String SALTED_HASH = "salted_hash";
cassandra git commit: Backport CASSANDRA-8085 to cassandra-2.0
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 9625910a5 - 1b1acae9a Backport CASSANDRA-8085 to cassandra-2.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1b1acae9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1b1acae9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1b1acae9 Branch: refs/heads/cassandra-2.0 Commit: 1b1acae9afbd6faf2f628d5ae7ba0763aaac1e86 Parents: 9625910 Author: Tyler Hobbs ty...@datastax.com Authored: Thu Mar 26 13:22:09 2015 -0500 Committer: Tyler Hobbs ty...@datastax.com Committed: Thu Mar 26 13:22:09 2015 -0500 -- CHANGES.txt | 1 + .../apache/cassandra/auth/PasswordAuthenticator.java | 15 +-- 2 files changed, 14 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 293dc55..adc0d59 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.14: + * Make PasswordAuthenticator number of hashing rounds configurable (CASSANDRA-8085) * Lower logging level from ERROR to DEBUG when a scheduled schema pull cannot be completed due to a node being down (CASSANDRA-9032) * Fix MOVED_NODE client event (CASSANDRA-8516) http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1acae9/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java -- diff --git a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java index e4c00b7..3c6d1af 100644 --- a/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java +++ b/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java @@ -53,8 +53,19 @@ public class PasswordAuthenticator implements ISaslAwareAuthenticator { private static final Logger logger = LoggerFactory.getLogger(PasswordAuthenticator.class); -// 2 ** GENSALT_LOG2_ROUNS rounds of hashing will be performed. 
-private static final int GENSALT_LOG2_ROUNDS = 10;
+// 2 ** GENSALT_LOG2_ROUNDS rounds of hashing will be performed.
+private static final String GENSALT_LOG2_ROUNDS_PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";
+private static final int GENSALT_LOG2_ROUNDS = getGensaltLogRounds();
+
+static int getGensaltLogRounds()
+{
+    int rounds = Integer.getInteger(GENSALT_LOG2_ROUNDS_PROPERTY, 10);
+    if (rounds < 4 || rounds > 31)
+        throw new RuntimeException(new ConfigurationException(String.format("Bad value for system property -D%s."
+                                                                            + " Please use a value between 4 and 31",
+                                                                            GENSALT_LOG2_ROUNDS_PROPERTY)));
+    return rounds;
+}

 // name of the hash column.
 private static final String SALTED_HASH = "salted_hash";
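The validation the patch adds can be exercised standalone. The sketch below mirrors getGensaltLogRounds() but substitutes IllegalArgumentException for Cassandra's ConfigurationException so it is self-contained:

```java
public class GensaltRounds
{
    public static final String PROPERTY = "cassandra.auth_bcrypt_gensalt_log2_rounds";

    // Mirrors the patched method: read the system property, default to 10,
    // and reject values outside bcrypt's supported log2-rounds range of 4..31.
    public static int getGensaltLogRounds()
    {
        int rounds = Integer.getInteger(PROPERTY, 10);
        if (rounds < 4 || rounds > 31)
            throw new IllegalArgumentException(
                String.format("Bad value for system property -D%s. Please use a value between 4 and 31", PROPERTY));
        return rounds;
    }

    public static void main(String[] args)
    {
        System.out.println(getGensaltLogRounds()); // default: 10
        System.setProperty(PROPERTY, "12");
        System.out.println(getGensaltLogRounds()); // now 12
    }
}
```

Operators would set the property on the server command line, e.g. `-Dcassandra.auth_bcrypt_gensalt_log2_rounds=12`; each increment doubles the bcrypt work factor.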
[jira] [Commented] (CASSANDRA-9033) Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive
[ https://issues.apache.org/jira/browse/CASSANDRA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382416#comment-14382416 ] Brent Haines commented on CASSANDRA-9033: - I did not maintain logs for the corruption while changing compaction strategies. I imagine it might have been caused by the many millions of sstable files. I am relieved to see them all shrink to just a handful within 30 minutes of running with the settings you prescribed. Thank you. Upgrading from 2.1.1 to 2.1.3 with LCS and many sstable files makes nodes unresponsive --- Key: CASSANDRA-9033 URL: https://issues.apache.org/jira/browse/CASSANDRA-9033 Project: Cassandra Issue Type: Bug Environment: * Ubuntu 14.04.2 - Linux ip-10-0-2-122 3.13.0-46-generic #79-Ubuntu SMP Tue Mar 10 20:06:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux * EC2 m2-xlarge instances [4cpu, 16GB RAM, 1TB storage on 3 platters] * 12 nodes running a mix of 2.1.1 and 2.1.3 * 8GB stack size with offheap objects Reporter: Brent Haines Assignee: Marcus Eriksson Attachments: cassandra-env.sh, cassandra.yaml, system.log.1.zip We have an Event Log table using LCS that has grown fast. There are more than 100K sstable files that are around 1KB. Increasing compactors and adjusting compaction throttling upward doesn't make a difference. It has been running great though until we upgraded to 2.1.3. Those nodes needed more RAM for the stack (12 GB) to even have a prayer of responding to queries. They bog down and become unresponsive. There are no GC messages that I can see, and no compaction either. The only work-around I have found is to decommission, blow away the big CF and rejoin. That happens in about 20 minutes and everything is freaking happy again. The size of the files is more like what I'd expect as well. 
Our schema: {code} cqlsh describe columnfamily data.stories CREATE TABLE data.stories ( id timeuuid PRIMARY KEY, action_data timeuuid, action_name text, app_id timeuuid, app_instance_id timeuuid, data map<text, text>, objects set<timeuuid>, time_stamp timestamp, user_id timeuuid ) WITH bloom_filter_fp_chance = 0.01 AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' AND comment = 'Stories represent the timeline and are placed in the dashboard for the brand manager to see' AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; cqlsh {code} There were no log entries that stood out. It pretty much consisted of "x is down" and "x is up" repeated ad infinitum. I have attached the zipped system.log that has the situation after the upgrade and then after I stopped, removed system, system_traces, OpsCenter, and data/stories-/* and restarted. It has rejoined the cluster now and is busy read-repairing to recover its data. On another note, we see a lot of this during repair now (on all the nodes): {code} ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,207 RepairSession.java:303 - [repair #c5043c40-d260-11e4-a2f2-8bb3e2bbdb35] session completed with the following error java.io.IOException: Failed during snapshot creation. 
at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:344) ~[apache-cassandra-2.1.3.jar:2.1.3] at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:146) ~[apache-cassandra-2.1.3.jar:2.1.3] at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55] ERROR [AntiEntropySessions:5] 2015-03-24 20:03:10,208 CassandraDaemon.java:167 - Exception in thread Thread[AntiEntropySessions:5,5,RMI Runtime] java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation. at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.3.jar:2.1.3] at
[jira] [Updated] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-9034: -- Attachment: 9034-trunk.txt Was able to replicate by starting with {{-Dcassandra.join_ring=false}}. Added a check to SizeEstimatesRecorder to make sure StorageService has started. AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI 
TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by operator request {code} The test is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
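The fix described (skipping the recorder until StorageService has started) amounts to a startup guard on a scheduled task. A generic sketch of that pattern, with illustrative names rather than Cassandra's actual fields:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class RecorderGuard
{
    // Illustrative stand-in for "StorageService has joined the ring".
    static final AtomicBoolean started = new AtomicBoolean(false);
    static final AtomicInteger recordings = new AtomicInteger();

    // The scheduled task: bail out quietly (instead of tripping an assertion
    // on missing local tokens) until the service reports it has started.
    static void run()
    {
        if (!started.get())
            return;
        recordings.incrementAndGet(); // stand-in for the real size-estimate work
    }

    public static void main(String[] args)
    {
        run();                 // fires before startup: silently skipped
        started.set(true);
        run();                 // fires after startup: does the work
        System.out.println(recordings.get()); // prints 1
    }
}
```

With `-Dcassandra.join_ring=false` the guard keeps the periodic task a no-op until the operator requests the join, at which point it resumes normally.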
[jira] [Commented] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382165#comment-14382165 ] Tyler Hobbs commented on CASSANDRA-8085: Is there a reason we shouldn't backport this to 2.0? It looks like [~tjake] set the fixver to 2.1 -- any particular reason for doing that? Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8984) Introduce Transactional API for behaviours that can corrupt system state
[ https://issues.apache.org/jira/browse/CASSANDRA-8984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382439#comment-14382439 ] Jonathan Ellis commented on CASSANDRA-8984: --- bq. our release page doesn't quite agree with this implicit assertion (that 2.1 is stable) It aspires to be stable. :) Let's keep the big changes to 3.x now. Introduce Transactional API for behaviours that can corrupt system state Key: CASSANDRA-8984 URL: https://issues.apache.org/jira/browse/CASSANDRA-8984 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.1.4 Attachments: 8984_windows_timeout.txt As a penultimate (and probably final for 2.1, if we agree to introduce it there) round of changes to the internals managing sstable writing, I've introduced a new API called Transactional that I hope will make it much easier to write correct behaviour. As things stand we conflate a lot of behaviours into methods like close - the recent changes unpicked some of these, but didn't go far enough. My proposal here introduces an interface designed to support four actions (on top of their normal function): * prepareToCommit * commit * abort * cleanup In normal operation, once we have finished constructing a state change we call prepareToCommit; once all such state changes are prepared, we call commit. If at any point everything fails, abort is called. In _either_ case, cleanup is called at the very last. These transactional objects are all AutoCloseable, with the behaviour being to rollback any changes unless commit has completed successfully. The changes are actually less invasive than it might sound, since we did recently introduce abort in some places, as well as have commit like methods. This simply formalises the behaviour, and makes it consistent between all objects that interact in this way. 
Much of the code change is boilerplate, such as moving an object into a try-declaration, although the change is still non-trivial. What it _does_ do is eliminate a _lot_ of special casing that we have had since 2.1 was released. The data tracker API changes and compaction leftover cleanups should finish the job with making this much easier to reason about, but this change I think is worthwhile considering for 2.1, since we've just overhauled this entire area (and not released these changes), and this change is essentially just the finishing touches, so the risk is minimal and the potential gains reasonably significant. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
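The four-action lifecycle described above, plus the AutoCloseable rollback-unless-committed behaviour, can be reduced to a small sketch. This is an illustration of the proposed contract, not the patch's actual interface:

```java
public abstract class Transactional implements AutoCloseable
{
    private enum State { IN_PROGRESS, PREPARED, COMMITTED, ABORTED }
    private State state = State.IN_PROGRESS;

    protected abstract void doPrepare();
    protected abstract void doCommit();
    protected abstract void doAbort();
    protected void doCleanup() {}

    public final void prepareToCommit()
    {
        if (state != State.IN_PROGRESS)
            throw new IllegalStateException("cannot prepare from " + state);
        doPrepare();
        state = State.PREPARED;
    }

    public final void commit()
    {
        if (state != State.PREPARED)
            throw new IllegalStateException("commit without prepare");
        doCommit();
        state = State.COMMITTED;
    }

    public final void abort()
    {
        if (state == State.COMMITTED || state == State.ABORTED)
            return;
        doAbort();
        state = State.ABORTED;
    }

    // AutoCloseable: roll back unless commit completed successfully,
    // and always run cleanup at the very last.
    @Override
    public final void close()
    {
        try
        {
            if (state != State.COMMITTED)
                abort();
        }
        finally
        {
            doCleanup();
        }
    }
}
```

Used in a try-with-resources block, any state change that throws before commit() completes is aborted and cleaned up automatically, which is the safety property the ticket is after.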
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382474#comment-14382474 ] Brian Hess commented on CASSANDRA-9048: I have created a version of this as a Java program via executeAsync(). Some testing has shown that for bulk writing to Cassandra, if you are starting with delimited files (not SSTables), Java's executeAsync() is more efficient/performant than creating SSTables and then calling sstableloader. This implementation provides for the options above, as well as a way to specify the parallelism of the asynchronous writing (the number of futures in flight). In addition to the Java implementation, I created a command-line utility a la cassandra-stress called cassandra-loader to invoke the Java classes with the appropriate CLASSPATH. As such, I also modified build.xml and tools/bin/cassandra.in.sh as appropriate. The patch is attached for review. The command-line usage statement is:
{code}
Usage: -f filename -host ipaddress -schema schema [OPTIONS]
OPTIONS:
  -delim delimiter              Delimiter to use [,]
  -delmInQuotes true            Set to 'true' if delimiter can be inside quoted fields [false]
  -dateFormat dateFormatString  Date format [default for Locale.ENGLISH]
  -nullString nullString        String that signifies NULL [none]
  -skipRows skipRows            Number of rows to skip [0]
  -maxRows maxRows              Maximum number of rows to read (-1 means all) [-1]
  -maxErrors maxErrors          Maximum errors to endure [10]
  -badFile badFilename          Filename for where to place badly parsed rows [none]
  -port portNumber              CQL Port Number [9042]
  -numFutures numFutures        Number of CQL futures to keep in flight [1000]
  -decimalDelim decimalDelim    Decimal delimiter [.] Other option is ','
  -boolStyle boolStyleString    Style for booleans [TRUE_FALSE]
{code}
Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much more closely matches the format of the data in most cases than the SSTable format itself, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or to read the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 as the default). Additional options would be useful, but this set of options/features is a start. 
A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
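The {{-numFutures}} throttle described in the comment above (bounding the number of asynchronous writes in flight) is commonly implemented with a semaphore around each submitted write. The sketch below illustrates that pattern only; it uses a plain executor as a stand-in for the driver's {{session.executeAsync()}}, and all names here are illustrative, not classes from the attached patch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Throttled async loading: at most maxInFlight "writes" are pending at once.
public class ThrottledLoader {
    public static int load(int rows, int maxInFlight) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger written = new AtomicInteger();
        for (int i = 0; i < rows; i++) {
            permits.acquire();                 // blocks once maxInFlight writes are pending
            pool.execute(() -> {
                try {
                    written.incrementAndGet(); // stand-in for one async INSERT completing
                } finally {
                    permits.release();         // frees a slot for the next row
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return written.get();
    }
}
```

With the real driver, the `release()` would go in the future's completion callback rather than a finally block, so the permit is held for the full lifetime of the in-flight write.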
[jira] [Updated] (CASSANDRA-9046) Allow Cassandra config to be updated to restart Daemon without unloading classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9046: --- Reviewer: Ariel Weisberg Allow Cassandra config to be updated to restart Daemon without unloading classes Key: CASSANDRA-9046 URL: https://issues.apache.org/jira/browse/CASSANDRA-9046 Project: Cassandra Issue Type: Improvement Components: Config Reporter: Emmanuel Hugonnet Fix For: 3.0 Attachments: 0001-CASSANDRA-9046-Making-applyConfig-public-so-it-may-b.patch Make applyConfig public in DatabaseDescriptor so that if we embed C* we can restart it after some configuration change without having to stop the whole application to unload the class which is configured once and for all in a static block. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
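The problem described in CASSANDRA-9046 is the static-initializer pattern: configuration applied in a static block runs once per classloader, so an embedding application cannot re-apply changed settings without unloading the class. A toy illustration of the shape of the fix (these are not Cassandra's real classes, just a sketch of making the apply step a public method):

```java
// Illustrative only: config captured in a static block vs. a re-applicable hook.
public class ConfigSketch {
    private static String setting;

    static { setting = "initial"; }   // runs exactly once per classloader

    // The shape of the requested change: a public method that can re-apply
    // configuration at any time, without reloading the class.
    public static void applyConfig(String newValue) { setting = newValue; }

    public static String current() { return setting; }
}
```

An embedded server could then call `applyConfig(...)` after editing its configuration and restart the daemon in-process.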
[jira] [Updated] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8085: --- Attachment: 8085-2.0.txt Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4, 2.0.14 Attachments: 8085-2.0.txt, 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
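The "rounds" knob here is bcrypt's log-rounds parameter: the work factor is 2^logRounds key-expansion iterations, so each increment roughly doubles hashing time. That is why the default of 10 (2^10 = 1024 rounds) hurts short-lived connections, and why lowering it is a meaningful trade-off. A small sketch of the arithmetic (jBCrypt, which Cassandra's PasswordAuthenticator uses, exposes this exponent via {{BCrypt.gensalt(logRounds)}}):

```java
// The bcrypt cost parameter is an exponent: iterations = 2^logRounds.
public class BcryptCost {
    static long iterations(int logRounds) { return 1L << logRounds; }

    public static void main(String[] args) {
        System.out.println(iterations(10)); // 1024 iterations: the current default
        System.out.println(iterations(4));  // 16 iterations: far cheaper per auth
    }
}
```

So dropping from 10 to 4 log-rounds cuts per-authentication work by a factor of 64, at the cost of weaker resistance to offline brute force of stolen hashes.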
[jira] [Assigned] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian reassigned CASSANDRA-9034: - Assignee: Carl Yeksigian AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by operator request {code} The test 
is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9048) Delimited File Bulk Loader
Brian Hess created CASSANDRA-9048: -- Summary: Delimited File Bulk Loader Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files matches the format of the data far more often than the SSTable format does, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. 
COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382365#comment-14382365 ] Tyler Hobbs commented on CASSANDRA-9045: It sounds to me like the incremental compaction is not processing range tombstones correctly, and it's purging the tombstone without purging the shadowed data. It also sounds like the range tombstone is being dropped before gc_grace has passed, so something is going pretty wrong. It seems like we should be able to reproduce this with a similar schema and similar deletes on a row that's above the in-memory compaction threshold. Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. 
Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again.
The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8085) Make PasswordAuthenticator number of hashing rounds configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-8085: --- Fix Version/s: 2.0.14 Okay, I've backported the patch to 2.0 and committed it as {{1b1acae}}. Make PasswordAuthenticator number of hashing rounds configurable Key: CASSANDRA-8085 URL: https://issues.apache.org/jira/browse/CASSANDRA-8085 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Sam Tunnicliffe Fix For: 3.0, 2.1.4, 2.0.14 Attachments: 8085-2.1.txt, 8085-3.0.txt Running 2^10 rounds of bcrypt can take a while. In environments (like PHP) where connections are not typically long-lived, authenticating can add substantial overhead. On IRC, one user saw the time to connect, authenticate, and execute a query jump from 5ms to 150ms with authentication enabled ([debug logs|http://pastebin.com/bSUufbr0]). CASSANDRA-7715 is a more complete fix for this, but in the meantime (and even after 7715), this is a good option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hess updated CASSANDRA-9048: -- Attachment: CASSANDRA-9048.patch Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files matches the format of the data far more often than the SSTable format does, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or reading the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use cases) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 9042 
as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9045: --- Assignee: Marcus Eriksson (was: Philip Thompson) Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Marcus Eriksson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9045) Deleted columns are resurrected after repair in wide rows
[ https://issues.apache.org/jira/browse/CASSANDRA-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382359#comment-14382359 ] Philip Thompson commented on CASSANDRA-9045: After discussion with [~thobbs], seems like a problem with incremental compaction. Assigning to [~krummas] Deleted columns are resurrected after repair in wide rows - Key: CASSANDRA-9045 URL: https://issues.apache.org/jira/browse/CASSANDRA-9045 Project: Cassandra Issue Type: Bug Components: Core Reporter: Roman Tkachenko Assignee: Philip Thompson Priority: Critical Fix For: 2.0.14 Hey guys, After almost a week of researching the issue and trying out multiple things with (almost) no luck I was suggested (on the user@cass list) to file a report here. h5. Setup Cassandra 2.0.13 (we had the issue with 2.0.10 as well and upgraded to see if it goes away) Multi datacenter 12+6 nodes cluster. h5. Schema {code} cqlsh describe keyspace blackbook; CREATE KEYSPACE blackbook WITH replication = { 'class': 'NetworkTopologyStrategy', 'IAD': '3', 'ORD': '3' }; USE blackbook; CREATE TABLE bounces ( domainid text, address text, message text, timestamp bigint, PRIMARY KEY (domainid, address) ) WITH bloom_filter_fp_chance=0.10 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.10 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.00 AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='99.0PERCENTILE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'LeveledCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; {code} h5. Use case Each row (defined by a domainid) can have many many columns (bounce entries) so rows can get pretty wide. In practice, most of the rows are not that big but some of them contain hundreds of thousands and even millions of columns. 
Columns are not TTL'ed but can be deleted using the following CQL3 statement: {code} delete from bounces where domainid = 'domain.com' and address = 'al...@example.com'; {code} All queries are performed using LOCAL_QUORUM CL. h5. Problem We weren't very diligent about running repairs on the cluster initially, but shortly after we started doing it we noticed that some of the previously deleted columns (bounce entries) are there again, as if tombstones have disappeared. I have run this test multiple times via cqlsh, on the row of the customer who originally reported the issue: * delete an entry * verify it's not returned even with CL=ALL * run repair on nodes that own this row's key * the columns reappear and are returned even with CL=ALL I tried the same test on another row with much less data and everything was correctly deleted and didn't reappear after repair. h5. Other steps I've taken so far Made sure NTP is running on all servers and clocks are synchronized. Increased gc_grace_seconds to 100 days, ran full repair (on the affected keyspace) on all nodes, then changed it back to the default 10 days again. Didn't help. Performed one more test. Updated one of the resurrected columns, then deleted it and ran repair again. This time the updated version of the column reappeared. Finally, I noticed these log entries for the row in question: {code} INFO [ValidationExecutor:77] 2015-03-25 20:27:43,936 CompactionController.java (line 192) Compacting large row blackbook/bounces:4ed558feba8a483733001d6a (279067683 bytes) incrementally {code} Figuring it may be related I bumped in_memory_compaction_limit_in_mb to 512MB so the row fits into it, deleted the entry and ran repair once again. The log entry for this row was gone and the columns didn't reappear. We have a lot of rows much larger than 512MB, so we can't increase this parameter forever if that is the issue. Please let me know if you need more information on the case or if I can run more experiments. Thanks! 
Roman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8669) simple_repair test failing on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382379#comment-14382379 ] Yuki Morishita commented on CASSANDRA-8669: --- Bisected down to this commit: [871f0039c5bf89be343039478c64ce835b04b5cf|https://github.com/apache/cassandra/commit/871f0039c5bf89be343039478c64ce835b04b5cf] (CASSANDRA-8429) With the one commit before (bedd97f7abea417c0165721888458e62392875e9), I can run {{repair_test.py}} continuously without failure. As Philip commented before, somehow extra ranges are being repaired when test fails. I don't know if this relates to early open compaction since I modified {{repair_test.py}} and set {{sstable_preemptive_open_interval_in_mb: -1}} but test still fails with the same error. Will dig some more. simple_repair test failing on 2.1 - Key: CASSANDRA-8669 URL: https://issues.apache.org/jira/browse/CASSANDRA-8669 Project: Cassandra Issue Type: Bug Reporter: Philip Thompson Assignee: Yuki Morishita Fix For: 2.1.4 The dtest simple_repair_test began failing on 12/22 on 2.1 and trunk. The test fails intermittently both locally and on cassci. The test is here: https://github.com/riptano/cassandra-dtest/blob/master/repair_test.py#L32 The output is here: http://cassci.datastax.com/job/cassandra-2.1_dtest/661/testReport/repair_test/TestRepair/simple_repair_test/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9047) The FROZEN and TUPLE keywords should not be reserved in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9047: --- Reviewer: Benjamin Lerer The FROZEN and TUPLE keywords should not be reserved in CQL --- Key: CASSANDRA-9047 URL: https://issues.apache.org/jira/browse/CASSANDRA-9047 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Trivial Fix For: 2.1.4 Attachments: 9047-2.1.txt It looks like we accidentally forgot to add the FROZEN and TUPLE keywords to the list of unreserved keywords in Cql.g. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9047) The FROZEN and TUPLE keywords should not be reserved in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-9047: --- Attachment: 9047-2.1.txt The FROZEN and TUPLE keywords should not be reserved in CQL --- Key: CASSANDRA-9047 URL: https://issues.apache.org/jira/browse/CASSANDRA-9047 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Tyler Hobbs Priority: Trivial Fix For: 2.1.4 Attachments: 9047-2.1.txt It looks like we accidentally forgot to add the FROZEN and TUPLE keywords to the list of unreserved keywords in Cql.g. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8499) Ensure SSTableWriter cleans up properly after failure
[ https://issues.apache.org/jira/browse/CASSANDRA-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382492#comment-14382492 ] Benedict commented on CASSANDRA-8499: - Affected actions are: truncate, major compaction, cleanup, scrub, upgrade. Repair looks to be fine. Ensure SSTableWriter cleans up properly after failure - Key: CASSANDRA-8499 URL: https://issues.apache.org/jira/browse/CASSANDRA-8499 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Fix For: 2.0.12, 2.1.3 Attachments: 8499-20.txt, 8499-20v2, 8499-21.txt, 8499-21v2, 8499-21v3 In 2.0 we do not free a bloom filter, in 2.1 we do not free a small piece of offheap memory for writing compression metadata. In both we attempt to flush the BF despite having encountered an exception, making the exception slow to propagate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
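The contract the ticket describes is: on a write failure the writer must abort (freeing the bloom filter and offheap buffers) instead of attempting the normal flush-and-close path, which both leaks and delays exception propagation. A toy sketch of that shape (the `FakeWriter` here is illustrative, not Cassandra's SSTableWriter):

```java
// Failure path must abort, never flush; success path flushes and closes.
public class WriterCleanup {
    static class FakeWriter {
        boolean closed, aborted;
        void append(boolean fail) { if (fail) throw new RuntimeException("disk error"); }
        void closeAndFlush() { closed = true; }  // normal path: flush bloom filter, close
        void abort() { aborted = true; }         // failure path: free resources, no flush
    }

    static FakeWriter write(boolean fail) {
        FakeWriter w = new FakeWriter();
        try {
            w.append(fail);
            w.closeAndFlush();
        } catch (RuntimeException e) {
            w.abort();  // do NOT attempt to flush after a failure
        }
        return w;
    }
}
```

In real code the caught exception would be rethrown after abort(); it is swallowed here only so the test can inspect the writer's final state.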
[jira] [Updated] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9048: --- Fix Version/s: 3.0 Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch There is a strong need for bulk loading data from delimited files into Cassandra. Starting with delimited files means that the data is not currently in the SSTable format, and therefore cannot immediately leverage Cassandra's bulk loading tool, sstableloader, directly. A tool supporting delimited files much closer matches the format of the data more often than the SSTable format itself, and a tool that loads from delimited files is very useful. In order for this bulk loader to be more generally useful to customers, it should handle a number of options at a minimum: - support specifying the input file or to read the data from stdin (so other command-line programs can pipe into the loader) - supply the CQL schema for the input data - support all data types other than collections (collections is a stretch goal/need) - an option to specify the delimiter - an option to specify comma as the decimal delimiter (for international use casese) - an option to specify how NULL values are specified in the file (e.g., the empty string or the string NULL) - an option to specify how BOOLEAN values are specified in the file (e.g., TRUE/FALSE or 0/1) - an option to specify the Date and Time format - an option to skip some number of rows at the beginning of the file - an option to only read in some number of rows from the file - an option to indicate how many parse errors to tolerate - an option to specify a file that will contain all the lines that did not parse correctly (up to the maximum number of parse errors) - an option to specify the CQL port to connect to (with 
9042 as the default). Additional options would be useful, but this set of options/features is a start. A word on COPY. COPY comes via CQLSH which requires the client to be the same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, etc). This tool should be able to connect to any version of Cassandra (within reason). For example, it should be able to handle 2.0.x and 2.1.x. Moreover, CQLSH's COPY command does not support a number of the options above. Lastly, the performance of COPY in 2.0.x is not high enough to be considered a bulk ingest tool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8989) Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas
[ https://issues.apache.org/jira/browse/CASSANDRA-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-8989: --- Reviewer: Sam Tunnicliffe Reading from table which contains collection type using token function and with CL ONE causes overwhelming writes to replicas --- Key: CASSANDRA-8989 URL: https://issues.apache.org/jira/browse/CASSANDRA-8989 Project: Cassandra Issue Type: Bug Components: Core Reporter: Miroslaw Partyka Assignee: Carl Yeksigian Priority: Critical Fix For: 2.0.14 Attachments: 8989-2.0.txt, trace.txt When reading from a table under the conditions described in the summary, each read from a replica also causes a write to the replica. Confirmed in versions 2.0.12 and 2.0.13; version 2.1.3 seems OK. To reproduce: {code}CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2}; USE test; CREATE TABLE bug(id int PRIMARY KEY, val map<int,int>); INSERT INTO bug(id, val) VALUES (1, {2: 3}); CONSISTENCY LOCAL_QUORUM TRACING ON SELECT * FROM bug WHERE token(id) = 0;{code} The trace contains both "Appending to commitlog" and "Adding to bug memtable" twice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383026#comment-14383026 ] Stefania commented on CASSANDRA-9034: - Looks good, +1 AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt One of the dtests of CASSANDRA-8236 (https://github.com/stef1927/cassandra-dtest/tree/8236) raises the following exception unless I set {{-Dcassandra.size_recorder_interval=0}}: {code} ERROR [OptionalTasks:1] 2015-03-25 12:58:47,015 CassandraDaemon.java:179 - Exception in thread Thread[OptionalTasks:1,5,main] java.lang.AssertionError: null at org.apache.cassandra.service.StorageService.getLocalTokens(StorageService.java:2235) ~[main/:na] at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:61) ~[main/:na] at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:82) ~[main/:na] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_76] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_76] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_76] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_76] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_76] INFO [RMI TCP Connection(2)-127.0.0.1] 2015-03-25 12:59:23,189 StorageService.java:863 - Joining ring by 
operator request {code} The test is {{start_node_without_join_test}} in _pushed_notifications_test.py_ but starting a node that won't join the ring might be sufficient to reproduce the exception (I haven't tried though). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9034) AssertionError in SizeEstimatesRecorder
[ https://issues.apache.org/jira/browse/CASSANDRA-9034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383035#comment-14383035 ] Stefania commented on CASSANDRA-9034: - [~iamaleksey] if you are happy could you take care of committing please? AssertionError in SizeEstimatesRecorder --- Key: CASSANDRA-9034 URL: https://issues.apache.org/jira/browse/CASSANDRA-9034 Project: Cassandra Issue Type: Bug Environment: Trunk (52ddfe412a) Reporter: Stefania Assignee: Carl Yeksigian Priority: Minor Fix For: 3.0 Attachments: 9034-trunk.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383081#comment-14383081 ] Jonathan Ellis commented on CASSANDRA-9048: --- How performant is this compared to CASSANDRA-7405? Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7970) JSON support for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382768#comment-14382768 ] Tyler Hobbs commented on CASSANDRA-7970: I've pushed some new commits to my branch to address your comments. I also merged in the latest trunk and added support for the new date and time types. bq. So, AbstractType.fromJSONObject would return a Term Done (over a few commits). The only hangup was that with collections and tuples, we need to avoid serializing elements in {{fromJSONObject}} because this can happen at prepare-time, when we don't know the protocol version. Accordingly, those classes return a DelayedValue instead of a terminal Value. JSON support for CQL Key: CASSANDRA-7970 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Tyler Hobbs Labels: client-impacting, cql3.3, docs-impacting Fix For: 3.0 Attachments: 7970-trunk-v1.txt JSON is popular enough that not supporting it is becoming a competitive weakness. We can add JSON support in a way that is compatible with our performance goals by *mapping* JSON to an existing schema: one JSON document maps to one CQL row. Thus, it is NOT a goal to support schemaless documents, which is a misfeature [1] [2] [3]. Rather, it is to allow a convenient way to easily turn a JSON document from a service or a user into a CQL row, with all the validation that entails. Since we are not looking to support schemaless documents, we will not be adding a JSON data type (CASSANDRA-6833) a la postgresql. Rather, we will map the JSON to UDTs, collections, and primitive CQL types.
Here's how this might look: {code} CREATE TYPE address ( street text, city text, zip_code int, phones set<text> ); CREATE TABLE users ( id uuid PRIMARY KEY, name text, addresses map<text, address> ); INSERT INTO users JSON {'id': 4b856557-7153, 'name': 'jbellis', 'address': {"home": {"street": "123 Cassandra Dr", "city": "Austin", "zip_code": 78747, "phones": [2101234567]}}}; SELECT JSON id, address FROM users; {code} (We would also want to_json and from_json functions to allow mapping a single column's worth of data. These would not require extra syntax.) [1] http://rustyrazorblade.com/2014/07/the-myth-of-schema-less/ [2] https://blog.compose.io/schema-less-is-usually-a-lie/ [3] http://dl.acm.org/citation.cfm?id=2481247 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
[ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382505#comment-14382505 ] Benedict commented on CASSANDRA-8993: - OK, so it makes a lot more sense now that I realise the downsampling granularity can be so small - I was thrown by "the BSL must be a power of two" without an equivalent statement for the sampling itself, and in my head I just assumed it was all dealing with powers of 2 (so nothing technically wrong with the comments, just my interpretation of them). This also explains why the effective index intervals were always the same - with powers of 2 they would be. I wonder if we couldn't get a lot of the benefit of downsampling by sticking to powers of 2, as it might simplify the code significantly? The original indices, indices to skip, and effective intervals could each be implemented with approximately one simple statement. Not pushing for it, mind, just airing the question. Thanks for taking the time to explain, anyway, and with that clarification I am +1 on the patch as it stands. On the topic of the zero index always being present: I can vouch that this assumption breaks somewhere, because I assumed this to be the case when modifying IndexSummaryBuilder, and without a setNextSamplePosition(-minIndexInterval) it doesn't pass its test cases (i.e. initializing the first sample index deterministically to zero caused unit test failures). So we should perhaps track down where the logical flaw is, however minor it may be. EffectiveIndexInterval calculation is incorrect --- Key: CASSANDRA-8993 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Blocker Fix For: 2.1.4 Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt I'm not familiar enough with the calculation itself to understand why this is happening, but see discussion on CASSANDRA-8851 for the background.
I've introduced a test case to look for this during downsampling, but it seems to pass just fine, so it may be an artefact of upgrading. The problem was, unfortunately, not manifesting directly because it would simply result in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does not use the effective interval, and provided the result to getPosition(). I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs] can follow up with an investigation as to how it actually went wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
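The power-of-two simplification floated in the comment above can be sketched in a few lines. This is illustrative only: the class and method names are hypothetical, and only the base sampling level of 128 matches Cassandra's {{Downsampling.BASE_SAMPLING_LEVEL}} constant.

```java
// Illustrative sketch of the power-of-two idea: if the sampling level is
// restricted to powers of two, the effective index interval collapses to a
// single expression rather than a per-entry calculation. Hypothetical code.
public class PowerOfTwoSampling {
    static final int BASE_SAMPLING_LEVEL = 128; // power of two, as in Cassandra

    static int effectiveInterval(int samplingLevel, int minIndexInterval) {
        // Enforce the power-of-two restriction the comment proposes
        if (Integer.bitCount(samplingLevel) != 1)
            throw new IllegalArgumentException("sampling level must be a power of two");
        // At full sampling (samplingLevel == BASE_SAMPLING_LEVEL) the interval
        // is unchanged; halving the sampling level doubles the spacing.
        return (BASE_SAMPLING_LEVEL / samplingLevel) * minIndexInterval;
    }
}
```

With this restriction every retained entry has the same effective interval, which is consistent with the observation above that the intervals "were always the same" for powers of 2.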
[jira] [Commented] (CASSANDRA-9044) Build prototype for validation testing harness
[ https://issues.apache.org/jira/browse/CASSANDRA-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382514#comment-14382514 ] Philip Thompson commented on CASSANDRA-9044: Merged into dtest and running here: http://cassci.datastax.com/job/CTOOL_stress_validation/ Long term, this will probably need to move out of dtest, or at least into its own submodule. Build prototype for validation testing harness -- Key: CASSANDRA-9044 URL: https://issues.apache.org/jira/browse/CASSANDRA-9044 Project: Cassandra Issue Type: Sub-task Reporter: Philip Thompson Assignee: Philip Thompson Build a job and set it to run on jenkins for the basic stress validation described in CASSANDRA-9007. Currently only using CCM nodes and log parsing stress for errors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9049) Run validation harness against a real cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-9049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9049: --- Issue Type: Sub-task (was: Task) Parent: CASSANDRA-9007 Run validation harness against a real cluster - Key: CASSANDRA-9049 URL: https://issues.apache.org/jira/browse/CASSANDRA-9049 Project: Cassandra Issue Type: Sub-task Reporter: Philip Thompson Assignee: Philip Thompson Currently we run against CCM nodes. We will get more useful data and feedback if we run against real C* clusters, whether on dedicated hardware or provisioned on a cloud. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9049) Run validation harness against a real cluster
Philip Thompson created CASSANDRA-9049: -- Summary: Run validation harness against a real cluster Key: CASSANDRA-9049 URL: https://issues.apache.org/jira/browse/CASSANDRA-9049 Project: Cassandra Issue Type: Task Reporter: Philip Thompson Assignee: Philip Thompson -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382602#comment-14382602 ] Jeff Jirsa commented on CASSANDRA-9048: --- 2 cents: Agree with [~carlyeks]'s comment: seems useful, but don't see why it would go into the tree. Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7970) JSON support for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tyler Hobbs updated CASSANDRA-7970: --- Attachment: 7970-trunk-v2.txt JSON support for CQL Key: CASSANDRA-7970 URL: https://issues.apache.org/jira/browse/CASSANDRA-7970 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Tyler Hobbs Labels: client-impacting, cql3.3, docs-impacting Fix For: 3.0 Attachments: 7970-trunk-v1.txt, 7970-trunk-v2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7807) Push notification when tracing completes for an operation
[ https://issues.apache.org/jira/browse/CASSANDRA-7807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-7807: Attachment: 7807-v2.txt I've added functionality to {{SimpleClient}} to specify the requested protocol version (also added support for that to the {{Client}}/{{debug-cql}} tool). The code now * checks for protocol version 4 (added utest) * checks if the connection registered for the event (added negative utest) * ensures that the event is not sent when the tracing probability kicks in (added utest) * adds a {{minimumVersion}} field to the {{Event.Type}} enum * also enhances {{MessagePayloadTest}} to check behavior with native protocol 4 (CASSANDRA-8553) * also enhances {{debug-cql}} to specify the native protocol version + event display NB: {{debug-cql}} does not start when C* is running locally (port 7199 already in use), since it sources {{cassandra-env.sh}} Push notification when tracing completes for an operation - Key: CASSANDRA-7807 URL: https://issues.apache.org/jira/browse/CASSANDRA-7807 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Tyler Hobbs Assignee: Robert Stupp Priority: Minor Labels: client-impacting, protocolv4 Fix For: 3.0 Attachments: 7807-v2.txt, 7807.txt Tracing is an asynchronous operation, and drivers currently poll to determine when the trace is complete (in a loop with sleeps). Instead, the server could push a notification to the driver when the trace completes. I'm guessing that most of the work for this will be around pushing notifications to a single connection instead of all connections that have registered listeners for a particular event type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
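The {{minimumVersion}} idea described above can be sketched as an enum field that gates which connections receive an event. This is a hypothetical shape, not the actual {{org.apache.cassandra.transport.Event}} code: the trace-completion event name and the version numbers on the other entries are assumptions for illustration.

```java
// Hypothetical sketch: tag each push-event type with the minimum native
// protocol version allowed to carry it, and check it per connection.
// TRACE_COMPLETE and the version constants are illustrative only.
public class Events {
    enum EventType {
        TOPOLOGY_CHANGE(3), STATUS_CHANGE(3), SCHEMA_CHANGE(3), TRACE_COMPLETE(4);

        final int minimumVersion;
        EventType(int minimumVersion) { this.minimumVersion = minimumVersion; }

        // Only send this event on connections negotiated at or above minimumVersion
        boolean supportedBy(int connectionVersion) {
            return connectionVersion >= minimumVersion;
        }
    }
}
```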
[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader
[ https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14382777#comment-14382777 ] Jeremy Hanna commented on CASSANDRA-9048: - I wonder if a bulk loader, if it's not a thick-client thing, would need to come in different forms: small, medium, and large. Small: a single file for bootstrapping, not a huge amount of data; cqlsh COPY FROM would work for that. Medium: the tool that this ticket represents; you might have a bunch of files, but you don't want to have to fire up Spark to do a simple bulk load. Large/industrial: for a giant amount of data, perhaps on a regular basis, or if you were going to fire up Spark anyway. I think all three have their uses, and I see each of them being more favorable in different situations. If we have demand and people willing to maintain each of the three, why wouldn't we consider them? Delimited File Bulk Loader -- Key: CASSANDRA-9048 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Brian Hess Fix For: 3.0 Attachments: CASSANDRA-9048.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9050) Add debug level logging to Directories.getWriteableLocation()
[ https://issues.apache.org/jira/browse/CASSANDRA-9050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9050: Attachment: 9050-2.1.txt 9050-2.0.txt Add debug level logging to Directories.getWriteableLocation() - Key: CASSANDRA-9050 URL: https://issues.apache.org/jira/browse/CASSANDRA-9050 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 2.0.14 Attachments: 9050-2.0.txt, 9050-2.1.txt Add some debug level logging to log * blacklisted directories that are excluded * directories not matching requested size -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-6541) New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set.
[ https://issues.apache.org/jira/browse/CASSANDRA-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-6541. -- Resolution: Fixed New versions of Hotspot create new Class objects on every JMX connection causing the heap to fill up with them if CMSClassUnloadingEnabled isn't set. - Key: CASSANDRA-6541 URL: https://issues.apache.org/jira/browse/CASSANDRA-6541 Project: Cassandra Issue Type: Bug Components: Config Reporter: jonathan lacefield Assignee: Brandon Williams Priority: Minor Fix For: 2.1 beta2, 2.0.6, 1.2.16 Attachments: dse_systemlog Newer versions of Oracle's Hotspot JVM, post 6u43 (maybe earlier) and 7u25 (maybe earlier), are experiencing issues with GC and JMX where the heap slowly fills up over time until an OOM or a full GC event occurs, specifically when CMS is leveraged. Adding: {noformat} JVM_OPTS="$JVM_OPTS -XX:+CMSClassUnloadingEnabled" {noformat} to the options in cassandra-env.sh alleviates the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-8478) sstableloader NegativeArraySizeException when deserializing to build histograms
[ https://issues.apache.org/jira/browse/CASSANDRA-8478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko resolved CASSANDRA-8478. -- Resolution: Won't Fix Fix Version/s: (was: 1.2.15) 1.2.x C* releases will no longer happen. Feel free to reopen though if the issue can be reproduced in 2.0 or 2.1. sstableloader NegativeArraySizeException when deserializing to build histograms --- Key: CASSANDRA-8478 URL: https://issues.apache.org/jira/browse/CASSANDRA-8478 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Erick Ramirez When a customer attempts to load sstable data files copied from a production cluster, it returns the following exception: {code} $ sstableloader -d ip -p rpc_port -v KS/CF/ null java.lang.NegativeArraySizeException at org.apache.cassandra.utils.EstimatedHistogram$EstimatedHistogramSerializer.deserialize(EstimatedHistogram.java:266) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:292) at org.apache.cassandra.io.sstable.SSTableMetadata$SSTableMetadataSerializer.deserialize(SSTableMetadata.java:282) at org.apache.cassandra.io.sstable.SSTableReader.openMetadata(SSTableReader.java:234) at org.apache.cassandra.io.sstable.SSTableReader.openForBatch(SSTableReader.java:162) at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:100) at java.io.File.list(File.java:1155) at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:67) at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:121) at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:66) -pr,--principal kerberos principal -k,--keytab keytab location --ssl-keystore ssl keystore location --ssl-keystore-password ssl keystore password --ssl-keystore-type ssl keystore type --ssl-truststore ssl truststore location --ssl-truststore-password ssl truststore password --ssl-truststore-type ssl truststore type {code} It appears to be 
failing on this line of code: {code} public EstimatedHistogram deserialize(DataInput dis) throws IOException { int size = dis.readInt(); long[] offsets = new long[size - 1]; // <-- here {code} The same error is returned regardless of which data file is attempted. I suspect this may be due to corrupt data files, or to data written in a way that is not compatible with the sstableloader utility. NOTE: Both source and target clusters are DSE 3.2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
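The failure above is consistent with a corrupt or incompatible stream: {{readInt()}} returns a negative size, and {{new long[size - 1]}} then throws {{NegativeArraySizeException}}. One defensive pattern is to validate the length prefix before allocating. The sketch below is hypothetical code, not the actual {{EstimatedHistogramSerializer}}; it merely shows the guard.

```java
import java.io.DataInput;
import java.io.IOException;

// Hypothetical sketch of a guarded deserialization for a length-prefixed
// long[] like the histogram offsets above: a corrupt stream produces a
// clear IOException instead of a NegativeArraySizeException.
public class GuardedHistogramRead {
    static long[] readOffsets(DataInput in) throws IOException {
        int size = in.readInt();
        if (size < 1)
            throw new IOException("invalid histogram bucket count: " + size);
        long[] offsets = new long[size - 1]; // safe: size - 1 >= 0 here
        for (int i = 0; i < offsets.length; i++)
            offsets[i] = in.readLong();
        return offsets;
    }
}
```

The guard does not fix the underlying incompatibility between the DSE 3.2.5 files and the loader, but it turns a confusing runtime error into an actionable message.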
[jira] [Updated] (CASSANDRA-8919) cqlsh return error in querying of CompositeType data
[ https://issues.apache.org/jira/browse/CASSANDRA-8919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-8919: - Fix Version/s: (was: 2.1.2) 2.1.4 cqlsh return error in querying of CompositeType data Key: CASSANDRA-8919 URL: https://issues.apache.org/jira/browse/CASSANDRA-8919 Project: Cassandra Issue Type: Bug Components: Tools Environment: SUSE 11 SP3, C* 2.1.2 Reporter: Mark Assignee: Tyler Hobbs Priority: Minor Labels: cqlsh Fix For: 2.1.4 cqlsh returns the below error when querying CompositeType data. It seems deserialize_safe is undefined for this CompositeType. Is this an issue that needs to be fixed? {code} cassandra@cqlsh:up_data> select * from test_stand; Traceback (most recent call last): File "/home/mql/bin/cqlsh", line 986, in perform_simple_statement rows = self.session.execute(statement, trace=self.tracing_enabled) File "/home/mql/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/cluster.py", line 1294, in execute result = future.result(timeout) File "/home/mql/bin/../lib/cassandra-driver-internal-only-2.1.2.zip/cassandra-driver-2.1.2/cassandra/cluster.py", line 2788, in result raise self._final_exception AttributeError: type object 'CompositeType(UTF8Type, Int32Type)' has no attribute 'deserialize_safe' {code} Pre-condition (in cassandra-cli) {code} create keyspace up_data with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1}; use up_data; create column family test_stand with column_type = 'Standard' and comparator = 'UTF8Type' and default_validation_class = 'BytesType' and key_validation_class = 'UTF8Type' and column_metadata = [ {column_name : 'UTF8Typefield', validation_class : 'UTF8Type'}, {column_name : 'IntegerTypefield', validation_class : 'IntegerType'}, {column_name : 'CompositeTypefield', validation_class : 'CompositeType(org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.Int32Type)'} ] and
compression_options = null; set test_stand ['test_stand1']['UTF8Typefield']='utf8Type'; set test_stand ['test_stand1']['CompositeTypefield']='utf8Type,12'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-9050) Add debug level logging to Directories.getWriteableLocation()
Robert Stupp created CASSANDRA-9050: --- Summary: Add debug level logging to Directories.getWriteableLocation() Key: CASSANDRA-9050 URL: https://issues.apache.org/jira/browse/CASSANDRA-9050 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Add some debug level logging to log * blacklisted directories that are excluded * directories not matching requested size -- This message was sent by Atlassian JIRA (v6.3.4#6332)