[jira] [Commented] (CASSANDRA-3674) add nodetool explicitgc

2011-12-27 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176110#comment-13176110
 ] 

Peter Schuller commented on CASSANDRA-3674:
---

My argument is that for casual and/or new users, they are running Cassandra, 
not the JVM, and they are not JVM experts. (Along those lines, the heap usage 
printout of 'nodetool info' is duplicative, *and* even misleading, because it 
encourages people to interpret it in ways that don't really reflect reality.)

I see what you're saying; I just think it's a convenient thing to have. It's 
not that *I* want it, but I'd like to be able to tell people to use it and 
assume they have it available out of the box.

Not too fussed about it though :)



 add nodetool explicitgc
 ---

 Key: CASSANDRA-3674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3674
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Attachments: CASSANDRA-3674-trunk.txt


 So that you can easily ask people to run nodetool explicitgc and paste the 
 results.
 I'll file a separate JIRA suggesting that we ship with 
 -XX:+ExplicitGCInvokesConcurrent by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176126#comment-13176126
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


CASSANDRA-3610 needs a rebase to apply on the latest trunk. Also, I took a 
look at the doc you attached, and it looks like the test for 1 is broken 
because the stress command line shows that you used -S 3000 instead of 1.

{noformat}
Compressed Reads: *10,000* columnSize:
[vijay_tcasstest@vijay_tcass--1a-i-2801d94a ~]$ java -Xms2G -Xmx2G -Xmn1G 
-XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
-XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 
-XX:+UseCMSInitiatingOccupancyOnly -jar Stress.jar -p 7102 -d 10.87.81.75 -n 
50 -S *3000* -I SnappyCompressor -o read
{noformat}

Also, as I mentioned before, you test on different nodes of a working cluster, 
so there are side factors that could be affecting the test results. Can you 
please explain why testing performance on a working cluster is a good idea?


 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to 
 use the MMap, hence higher CPU on the nodes and higher latencies on 
 reads. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the Buffer will be better.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176221#comment-13176221
 ] 

Vijay commented on CASSANDRA-3623:
--

{quote}
Also, I took a look at the doc you attached, and it looks like the test for 1 is 
broken because the stress command line shows that you used -S 3000 instead of 1.
{quote}
I will fix it.

{quote}
Also, as I mentioned before, you test on different nodes of a working cluster, 
so there are side factors that could be affecting the test results. Can you 
please explain why testing performance on a working cluster is a good idea?
{quote}
How do you know it is a working cluster? They are individual machines, isolated, 
without any network access to any other machine. There isn't anything being 
shared between those machines (they are VMs on different servers from any 
results I have ever published). I created this test in a different environment 
just to get a clean environment with a cold cache (the other option is to reset 
the mmap, which I don't want to do).

I know you have your doubts, but I am not that bad ;)

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to 
 use the MMap, hence higher CPU on the nodes and higher latencies on 
 reads. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the Buffer will be better.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176226#comment-13176226
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


I ask because you mentioned previously that you had done tests on a 12-node 
cluster. Test results in the cloud depend on your neighbours, which is why I/O 
could differ as dramatically as it does in your tests. Let's settle 
CASSANDRA-3611 (and CASSANDRA-3610) and I will test it again on a real machine.

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to 
 use the MMap, hence higher CPU on the nodes and higher latencies on 
 reads. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the Buffer will be better.





[jira] [Updated] (CASSANDRA-3673) Allow reduced consistency in sstableloader utility

2011-12-27 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3673:
--

  Component/s: (was: Core)
   Tools
 Priority: Minor  (was: Major)
Affects Version/s: (was: 1.0.6)
   (was: 0.8.0)
Fix Version/s: (was: 1.0.7)
   Issue Type: New Feature  (was: Bug)
  Summary: Allow reduced consistency in sstableloader utility  
(was: Issues in sstableloader utility)

1) For 1.1 we've updated the loader to not become a gossip peer 
(CASSANDRA-3045), but for 1.0.x that's part of how it works.  In the meantime, 
do not use nodetool removetoken; just let it expire normally when it's done.

2) The loader is doing you a favor here; it's a lot cheaper to get the data onto 
all the machines in the first place than to put it on only some and repair later. 
 But I suppose it's reasonable to have an option to reduce the effective 
ConsistencyLevel here.

 Allow reduced consistency in sstableloader utility
 --

 Key: CASSANDRA-3673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3673
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Samarth Gahire
Priority: Minor
  Labels: cassandra, clustering, performance, sstableloader
   Original Estimate: 72h
  Remaining Estimate: 72h

 Below are some of the issues I have been facing while using the 
 sstableloader utility in Cassandra 0.8.2:
 1) We have configured sstableloader on a different machine. Since we have 
 loaded sstables from this machine, it has become a part of the cluster, and 
 except at loading time it is always unreachable in describe cluster.
 a) As it is unreachable, whenever I change the schema it says this node is 
 unreachable (but that's ok, as the schema change is reflected on the other nodes).
 b) The main problem is when I tried to remove the node from the cluster 
 using nodetool removetoken: the process gets stuck, saying RemovalStatus: 
 Removing token (62676456546693435176060154681903071729). Waiting for 
 replication confirmation from [cassandra-1/(10.10.01.10) (this is the IP of the 
 loader machine)]. As the loader is part of the cluster, Cassandra tries to 
 stream the data from the loader machine and cannot.
 So instead of making the loader machine a permanent part of the cluster, can we 
 make it a temporary part of the cluster?
 2) When any of the nodes is down or unreachable, we can still insert data into 
 the Cassandra cluster with a Thrift-based client like Pelops. But this is not 
 the case with sstableloader: it does not work (does not stream) when any of the 
 nodes in the cluster is down or unreachable.





[jira] [Commented] (CASSANDRA-3611) Make checksum on a compressed blocks optional

2011-12-27 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176133#comment-13176133
 ] 

Pavel Yaskevich commented on CASSANDRA-3611:


{code}
if (FBUtilities.threadLocalRandom().nextDouble() > metadata.parameters.crcChance)
{code}

So when you have 1.0 in your parameters, the checksum will never be checked 
(because nextDouble() is exclusive of 1.0d); on the other hand, with 0.0 the 
checksum will be checked every time. Shouldn't it work the other way around?
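For illustration, a minimal sketch of the probability gate Pavel is describing (class and method names are hypothetical; the real code lives in CompressedRandomAccessReader). With `nextDouble()` drawn from [0.0, 1.0), comparing with `<` gives the intuitive semantics: a chance of 1.0 always verifies, 0.0 never does.

```java
import java.util.concurrent.ThreadLocalRandom;

public class CrcGate {
    // crcChance = 1.0 should mean "always verify"; 0.0 should mean "never".
    // nextDouble() returns values in [0.0, 1.0), so '<' gives exactly that.
    static boolean shouldVerifyChecksum(double crcChance) {
        return ThreadLocalRandom.current().nextDouble() < crcChance;
    }

    public static void main(String[] args) {
        boolean always = true, never = false;
        for (int i = 0; i < 10_000; i++) {
            always &= shouldVerifyChecksum(1.0);  // stays true
            never |= shouldVerifyChecksum(0.0);   // stays false
        }
        System.out.println(always + " " + never); // prints "true false"
    }
}
```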

 Make checksum on a compressed blocks optional
 -

 Key: CASSANDRA-3611
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3611
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-crc-check-chance-v2.patch, 
 0001-crc-check-chance.patch


 Currently every uncompressed block is run through the checksum algorithm; there 
 is CPU overhead in doing so... We might want to make it configurable/optional 
 for some use cases which might not require checksumming all the time.





[jira] [Commented] (CASSANDRA-3674) add nodetool explicitgc

2011-12-27 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176168#comment-13176168
 ] 

Radim Kolar commented on CASSANDRA-3674:


If this command reports the size of live objects on the heap after GC, then it 
is somewhat useful, because after triggering a GC in jconsole no size is printed 
when the GC finishes, and you need to wait some time until the heap graphs are 
refreshed.

 add nodetool explicitgc
 ---

 Key: CASSANDRA-3674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3674
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Attachments: CASSANDRA-3674-trunk.txt


 So that you can easily ask people to run nodetool explicitgc and paste the 
 results.
 I'll file a separate JIRA suggesting that we ship with 
 -XX:+ExplicitGCInvokesConcurrent by default.





[jira] [Issue Comment Edited] (CASSANDRA-3611) Make checksum on a compressed blocks optional

2011-12-27 Thread Pavel Yaskevich (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176133#comment-13176133
 ] 

Pavel Yaskevich edited comment on CASSANDRA-3611 at 12/27/11 10:34 AM:
---

{code}
if (FBUtilities.threadLocalRandom().nextDouble() > metadata.parameters.crcChance)
{code}

So when you have 1.0 in your parameters, the checksum will never be checked 
(because nextDouble() is exclusive of 1.0d); on the other hand, with 0.0 the 
checksum will be checked every time. Shouldn't it work the other way around?

I also think that we should add a check that the chance is between 0.0 and 1.0 
in CompressionParameters.

  was (Author: xedin):
{code}
if (FBUtilities.threadLocalRandom().nextDouble() > metadata.parameters.crcChance)
{code}

So when you have 1.0 in your parameters, the checksum will never be checked 
(because nextDouble() is exclusive of 1.0d); on the other hand, with 0.0 the 
checksum will be checked every time. Shouldn't it work the other way around?
  
 Make checksum on a compressed blocks optional
 -

 Key: CASSANDRA-3611
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3611
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-crc-check-chance-v2.patch, 
 0001-crc-check-chance.patch


 Currently every uncompressed block is run through the checksum algorithm; there 
 is CPU overhead in doing so... We might want to make it configurable/optional 
 for some use cases which might not require checksumming all the time.





[jira] [Updated] (CASSANDRA-3610) Checksum improvement for CompressedRandomAccessReader

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3610:
-

Attachment: 0001-use-pure-java-CRC32-v3.patch

rebased

 Checksum improvement for CompressedRandomAccessReader
 -

 Key: CASSANDRA-3610
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3610
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
 Environment: JVM
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-use-pure-java-CRC32-v2.patch, 
 0001-use-pure-java-CRC32-v3.patch, 0001-use-pure-java-CRC32.patch, 
 TestCrc32Performance.java


 When compression is on, we currently see checksumming taking about 40% more CPU 
 than the snappy library itself.
 Looks like Hadoop solved this by implementing their own checksum; we can either 
 use it or implement something like it.
 http://images.slidesharecdn.com/1toddlipconyanpeichen-cloudera-hadoopandperformance-final-10132228-phpapp01-slide-15-768.jpg?1321043717
 In our test env it provided a 50% improvement over the native implementation, 
 which uses JNI to call the OS.
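To make the "pure Java CRC32" idea concrete: a sketch of a table-driven reflected CRC-32 (polynomial 0xEDB88320), the same basic technique as Hadoop's PureJavaCrc32 that this ticket references, though Hadoop uses a more elaborate multi-table variant. This is illustrative only, not the attached patch.

```java
import java.util.zip.CRC32;

public class PureJavaCrc32 {
    // Precomputed lookup table for the reflected CRC-32 polynomial.
    private static final int[] TABLE = new int[256];
    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++)
                c = (c & 1) != 0 ? 0xEDB88320 ^ (c >>> 1) : c >>> 1;
            TABLE[n] = c;
        }
    }

    // One table lookup per input byte; no JNI crossing.
    static long crc32(byte[] data) {
        int crc = 0xFFFFFFFF;
        for (byte b : data)
            crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "hello cassandra".getBytes();
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        // Must agree with the JDK implementation bit-for-bit.
        System.out.println(crc32(data) == reference.getValue()); // prints "true"
    }
}
```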





[jira] [Commented] (CASSANDRA-3674) add nodetool explicitgc

2011-12-27 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176249#comment-13176249
 ] 

Peter Schuller commented on CASSANDRA-3674:
---

It prints the heap usage, obtained as soon as possible after the explicit GC 
completes.

I did, however, just realize that I haven't tested whether the explicit GC 
invocation is blocking in the case of -XX:+ExplicitGCInvokesConcurrent.
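For context, a local sketch of what such a command presumably does over JMX (the actual patch goes through Cassandra's nodetool/JMX plumbing; class name here is made up):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class ExplicitGc {
    public static void main(String[] args) {
        // Trigger a GC via the platform MemoryMXBean (reachable over remote
        // JMX) and read heap usage immediately afterwards. Note: gc() is only
        // a request; with -XX:+ExplicitGCInvokesConcurrent the call may return
        // before the concurrent cycle finishes, so the reading is a
        // best-effort snapshot -- exactly the caveat raised above.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        System.out.printf("heap used after explicit GC: %d / %d bytes%n",
                          heap.getUsed(), heap.getMax());
    }
}
```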


 add nodetool explicitgc
 ---

 Key: CASSANDRA-3674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3674
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Attachments: CASSANDRA-3674-trunk.txt


 So that you can easily ask people to run nodetool explicitgc and paste the 
 results.
 I'll file a separate JIRA suggesting that we ship with 
 -XX:+ExplicitGCInvokesConcurrent by default.





[Cassandra Wiki] Update of LargeDataSetConsiderations by PeterSchuller

2011-12-27 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The LargeDataSetConsiderations page has been changed by PeterSchuller:
http://wiki.apache.org/cassandra/LargeDataSetConsiderations?action=diff&rev1=19&rev2=20

   * Cassandra will read through sstable index files on start-up, doing what is 
known as index sampling. This is used to keep a subset (currently and by 
default, 1 out of 100) of keys and their on-disk location in the index, in 
memory. See [[ArchitectureInternals]]. This means that the larger the index 
files are, the longer it takes to perform this sampling. Thus, for very large 
indexes (typically when you have a very large number of keys) the index 
sampling on start-up may be a significant issue.
   * A negative side-effect of a large row-cache is start-up time. The periodic 
saving of the row cache information only saves the keys that are cached; the 
data has to be pre-fetched on start-up. On a large data set, this is probably 
going to be seek-bound and the time it takes to warm up the row cache will be 
linear with respect to the row cache size (assuming sufficiently large amounts 
of data that the seek bound I/O is not subject to optimization by disks).
* Potential future improvement: 
[[https://issues.apache.org/jira/browse/CASSANDRA-1625|CASSANDRA-1625]].
+  * The total number of rows per node correlates directly with the size of 
bloom filters and sampled index entries. Expect the base memory requirement of 
a node to increase linearly with the number of keys (assuming the average row 
key size remains constant).
+   * You can decrease the memory use due to index sampling by changing the 
index sampling interval in cassandra.yaml
+   * You should soon be able to tweak the bloom filter sizes too once 
[[https://issues.apache.org/jira/browse/CASSANDRA-3497|CASSANDRA-3497]] is done
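A back-of-the-envelope sketch of how the sampled-entry count from the wiki text scales (the 1-in-100 default is from the text above; the per-entry byte cost and key count are made-up placeholders):

```java
public class IndexSampleEstimate {
    public static void main(String[] args) {
        long keysPerNode = 1_000_000_000L;   // hypothetical node
        int samplingInterval = 100;          // default: 1 key in 100 kept in memory
        long assumedBytesPerEntry = 64;      // placeholder: key + position + overhead

        long sampledEntries = keysPerNode / samplingInterval;
        long approxBytes = sampledEntries * assumedBytesPerEntry;
        // Memory grows linearly with keys; raising the sampling interval
        // in cassandra.yaml shrinks it proportionally.
        System.out.printf("~%d sampled entries, ~%d MB%n",
                          sampledEntries, approxBytes / (1024 * 1024));
    }
}
```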
  


[Cassandra Wiki] Update of LargeDataSetConsiderations by PeterSchuller

2011-12-27 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The LargeDataSetConsiderations page has been changed by PeterSchuller:
http://wiki.apache.org/cassandra/LargeDataSetConsiderations?action=diff&rev1=20&rev2=21

   * Cassandra will read through sstable index files on start-up, doing what is 
known as index sampling. This is used to keep a subset (currently and by 
default, 1 out of 100) of keys and their on-disk location in the index, in 
memory. See [[ArchitectureInternals]]. This means that the larger the index 
files are, the longer it takes to perform this sampling. Thus, for very large 
indexes (typically when you have a very large number of keys) the index 
sampling on start-up may be a significant issue.
   * A negative side-effect of a large row-cache is start-up time. The periodic 
saving of the row cache information only saves the keys that are cached; the 
data has to be pre-fetched on start-up. On a large data set, this is probably 
going to be seek-bound and the time it takes to warm up the row cache will be 
linear with respect to the row cache size (assuming sufficiently large amounts 
of data that the seek bound I/O is not subject to optimization by disks).
* Potential future improvement: 
[[https://issues.apache.org/jira/browse/CASSANDRA-1625|CASSANDRA-1625]].
-  * The total number of rows per node correlates directly with the size of 
bloom filters and sampled index entries. Expect the base memory requirement of 
a node to increase linearly with the number of keys (assuming the average row 
key size remains constant).
+  * The total number of rows per node correlates directly with the size of 
bloom filters and sampled index entries. Expect the base memory requirement of 
a node to increase linearly with the number of keys (assuming the average row 
key size remains constant). If you are not using caching at all (e.g. you are 
doing analysis type workloads), expect these two to be the two biggest 
consumers of memory.
* You can decrease the memory use due to index sampling by changing the 
index sampling interval in cassandra.yaml
* You should soon be able to tweak the bloom filter sizes too once 
[[https://issues.apache.org/jira/browse/CASSANDRA-3497|CASSANDRA-3497]] is done
  


[jira] [Updated] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way

2011-12-27 Thread Yuki Morishita (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-3497:
--

Attachment: 0001-give-default-val-to-fp_chance.patch

Radim,

Thanks for the report. The problem is that the new bloom_filter_fp_chance in the 
avro interface definition does not have a proper default.
I've attached a patch to fix it.

 BloomFilter FP ratio should be configurable or size-restricted some other way
 -

 Key: CASSANDRA-3497
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.0.7

 Attachments: 0001-give-default-val-to-fp_chance.patch, 3497-v3.txt, 
 3497-v4.txt, CASSANDRA-1.0-3497.txt


 When you have a live dc and a purely analytical dc, in many situations you can 
 have fewer nodes on the analytical side, but you end up getting restricted by 
 having the BloomFilters in memory, even though you have absolutely no use for 
 them.  It would be nice if you could reduce this memory requirement by tuning 
 the desired FP ratio, or even just disabling them altogether.
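To see why tuning the FP ratio saves memory, here is the standard Bloom filter sizing relation, bits per key = -ln(p) / (ln 2)^2, in a small sketch (illustrative only; Cassandra's actual filter sizing lives in its BloomCalculations code):

```java
public class BloomSizing {
    // Standard Bloom filter sizing: total bits m = -n * ln(p) / (ln 2)^2,
    // so the per-key cost depends only on the allowed false-positive chance p.
    static double bitsPerKey(double fpChance) {
        return -Math.log(fpChance) / (Math.log(2) * Math.log(2));
    }

    public static void main(String[] args) {
        // Relaxing p from 1% to 50% cuts the filter from ~9.6 to ~1.4 bits/key.
        for (double p : new double[] { 0.01, 0.1, 0.5 }) {
            System.out.printf("fp=%.2f -> %.1f bits/key%n", p, bitsPerKey(p));
        }
    }
}
```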





svn commit: r1224981 - /cassandra/branches/cassandra-1.0/src/avro/internode.genavro

2011-12-27 Thread jbellis
Author: jbellis
Date: Tue Dec 27 19:12:39 2011
New Revision: 1224981

URL: http://svn.apache.org/viewvc?rev=1224981&view=rev
Log:
make avro bloom_filter_fp_chance default to null
patch by yukim; reviewed by jbellis for CASSANDRA-3497

Modified:
cassandra/branches/cassandra-1.0/src/avro/internode.genavro

Modified: cassandra/branches/cassandra-1.0/src/avro/internode.genavro
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/avro/internode.genavro?rev=1224981&r1=1224980&r2=1224981&view=diff
==
--- cassandra/branches/cassandra-1.0/src/avro/internode.genavro (original)
+++ cassandra/branches/cassandra-1.0/src/avro/internode.genavro Tue Dec 27 
19:12:39 2011
@@ -71,7 +71,7 @@ protocol InterNode {
 union { null, string } compaction_strategy = null;
 union { null, map<string> } compaction_strategy_options = null;
 union { null, map<string> } compression_options = null;
-union { double, null } bloom_filter_fp_chance;
+union { null, double } bloom_filter_fp_chance = null;
 }
 
 @aliases([org.apache.cassandra.config.avro.KsDef])




[jira] [Resolved] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way

2011-12-27 Thread Jonathan Ellis (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-3497.
---

Resolution: Fixed

committed

 BloomFilter FP ratio should be configurable or size-restricted some other way
 -

 Key: CASSANDRA-3497
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.0.7

 Attachments: 0001-give-default-val-to-fp_chance.patch, 3497-v3.txt, 
 3497-v4.txt, CASSANDRA-1.0-3497.txt


 When you have a live dc and a purely analytical dc, in many situations you can 
 have fewer nodes on the analytical side, but you end up getting restricted by 
 having the BloomFilters in memory, even though you have absolutely no use for 
 them.  It would be nice if you could reduce this memory requirement by tuning 
 the desired FP ratio, or even just disabling them altogether.





[jira] [Commented] (CASSANDRA-1600) Merge get_indexed_slices with get_range_slices

2011-12-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176291#comment-13176291
 ] 

Jonathan Ellis commented on CASSANDRA-1600:
---

What do we gain from typedefing List<IndexExpression> to FilterClause?  (I 
note this was part of Stu's and my original attempts back in April, but I don't 
remember a good reason for it.)

{noformat}
+/*
+ * XXX: If the range requested is a token range, we'll have to start at the
+ * beginning (and stop at the end) of the indexed row unfortunately (which
+ * will be inefficient), because we have no way to intuit the smallest
+ * possible key having a given token. A fix would be to actually store the
+ * token along with the key in the indexed row.
+ */
{noformat}

This is fine since there's no reason to be searching by token unless you're 
doing an exhaustive scan, i.e. a m/r job.

{noformat}
+rows.addAll(RangeSliceVerbHandler.executeLocally(command));
{noformat}

Another place the original patches failed... we should avoid this because it 
means we're now allowing one range scan per thrift client instead of one per 
read-stage thread, and it bypasses the drop hopeless requests over-capacity 
protection built in there.  Look at SP.LocalReadRunnable for how to do this 
safely.  The simplest fix would be to just continue routing all range scans over 
MessagingService.

Nit: I'd remove this comment
{code}
+// Mostly just a typedef
{code}

since class definitions that hardcode a specific version of a generic type are an 
antipattern, but here it is necessary to mix in the CloseableIterator interface.

 Merge get_indexed_slices with get_range_slices
 --

 Key: CASSANDRA-1600
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1600
 Project: Cassandra
  Issue Type: New Feature
  Components: API
Reporter: Stu Hood
Assignee: Sylvain Lebresne
 Fix For: 1.1

 Attachments: 
 0001-Add-optional-FilterClause-to-KeyRange-and-support-do-v2.patch, 
 0001-Add-optional-FilterClause-to-KeyRange-and-support-doin.txt, 
 0001-Add-optional-FilterClause-to-KeyRange-v3.patch, 
 0002-allow-get_range_slices-to-apply-filter-to-a-sequenti-v2.patch, 
 0002-allow-get_range_slices-to-apply-filter-to-a-sequential.txt, 
 0002-thrift-generated-code-changes-v3.patch, 
 0003-Allow-get_range_slices-to-apply-filter-to-a-sequenti-v3.patch, 
 0004-Update-cql-to-not-use-deprecated-index-scan-v3.patch


 From a comment on 1157:
 {quote}
 IndexClause only has a start key for get_indexed_slices, but it would seem 
 that the reasoning behind using 'KeyRange' for get_range_slices applies there 
 as well, since if you know the range you care about in the primary index, you 
 don't want to continue scanning until you exhaust 'count' (or the cluster).
 Since it would appear that get_indexed_slices would benefit from a KeyRange, 
 why not smash get_(range|indexed)_slices together, and make IndexClause an 
 optional field on KeyRange?
 {quote}





[jira] [Updated] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3623:
-

Attachment: (was: CRC+MMapIO.xlsx)

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem
 to use the mmap, hence higher CPU on the nodes and higher latencies on reads.
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read()
 syscalls and compression buffer
 but I think a separate class for the Buffer will be better.
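The mmap-based read the ticket asks for can be sketched with plain java.nio: map the file segment once and serve reads from the mapped buffer instead of paying a read() syscall per access. This is only an illustration of the mechanism; the class and method names below are made up, not Cassandra's CompressedSegmentedFile API.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadSketch {
    // Map the requested segment once and copy out of the mapped buffer,
    // instead of issuing a read() syscall per access.
    static byte[] readSegmentMapped(Path path, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] out = new byte[length];
            buf.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("segment", ".db");
        Files.write(p, "hello-compressed-segment".getBytes());
        System.out.println(new String(readSegmentMapped(p, 6, 10))); // prints "compressed"
        Files.delete(p);
    }
}
```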





[jira] [Updated] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3623:
-

Attachment: (was: MMappedIO-Performance.docx)






[jira] [Updated] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3623:
-

Attachment: CRC+MMapIO.xlsx
MMappedIO-Performance.docx

Done,
1) fixed the data for 10K
2) rebased 3610

Thanks!

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file-v3.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v3.patch, CRC+MMapIO.xlsx, 
 MMappedIO-Performance.docx







svn commit: r1224998 - in /cassandra/trunk: ./ contrib/ doc/cql/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/avro/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/db/ s

2011-12-27 Thread jbellis
Author: jbellis
Date: Tue Dec 27 20:03:37 2011
New Revision: 1224998

URL: http://svn.apache.org/viewvc?rev=1224998&view=rev
Log:
merge from 1.0

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)
cassandra/trunk/doc/cql/CQL.textile

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/avro/internode.genavro
cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
cassandra/trunk/src/java/org/apache/cassandra/db/index/SecondaryIndex.java

cassandra/trunk/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
cassandra/trunk/src/java/org/apache/cassandra/db/index/keys/KeysIndex.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Dec 27 20:03:37 2011
@@ -4,7 +4,7 @@
 
/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1220925,1220927-1222440
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
-/cassandra/branches/cassandra-1.0:1167085-1222743
+/cassandra/branches/cassandra-1.0:1167085-1224997
 
/cassandra/branches/cassandra-1.0.0:1167104-1167229,1167232-1181093,1181741,1181816,1181820,1182951,1183243
 /cassandra/branches/cassandra-1.0.5:1208016
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1224998&r1=1224997&r2=1224998&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Dec 27 20:03:37 2011
@@ -45,11 +45,13 @@
  * Avoid creating empty and non cleaned writer during compaction 
(CASSANDRA-3616)
  * stop thrift service in shutdown hook so we can quiesce MessagingService
(CASSANDRA-3335)
+ * (CQL) compaction_strategy_options and compression_parameters for
+   CREATE COLUMNFAMILY statement (CASSANDRA-3374)
 Merged from 0.8:
 * avoid logging (harmless) exception when GC takes < 1ms (CASSANDRA-3656)
  * prevent new nodes from thinking down nodes are up forever (CASSANDRA-3626)
  * Flush non-cfs backed secondary indexes (CASSANDRA-3659)
-
+ * Secondary Indexes should report memory consumption (CASSANDRA-3155)
 
 1.0.6
  * (CQL) fix cqlsh support for replicate_on_write (CASSANDRA-3596)

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Dec 27 20:03:37 2011
@@ -4,7 +4,7 @@
 
/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1220925,1220927-1222440
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
-/cassandra/branches/cassandra-1.0/contrib:1167085-1222743
+/cassandra/branches/cassandra-1.0/contrib:1167085-1224997
 
/cassandra/branches/cassandra-1.0.0/contrib:1167104-1167229,1167232-1181093,1181741,1181816,1181820,1182951,1183243
 /cassandra/branches/cassandra-1.0.5/contrib:1208016
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Modified: cassandra/trunk/doc/cql/CQL.textile
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/doc/cql/CQL.textile?rev=1224998&r1=1224997&r2=1224998&view=diff
==
--- cassandra/trunk/doc/cql/CQL.textile (original)
+++ cassandra/trunk/doc/cql/CQL.textile Tue Dec 27 20:03:37 2011
@@ -488,9 +488,14 @@ bc(syntax). 
 <createColumnFamilyStatement> ::= "CREATE" "COLUMNFAMILY" <name>
                                     "(" <term> <storageType> "PRIMARY KEY"
                                         ( "," <term> <storageType> )* ")"
-                                  ( "WITH" <identifier> "=" <cfOptionVal>
-                                        ( "AND" <identifier> "=" <cfOptionVal> )* )?
+                                  ( "WITH" <optionName> "=" <cfOptionVal>
+                                        ( "AND" <optionName> "=" <cfOptionVal> )* )?
                                   ";"
+<optionName> ::= <identifier>
+               | <optionName> ":" <identifier>
+               | <optionName> ":" <integer>
+               ;
+
 cfOptionVal 

svn commit: r1225001 [2/2] - in /cassandra/trunk: lib/ lib/licenses/ src/java/org/apache/cassandra/db/ test/unit/org/apache/cassandra/db/

2011-12-27 Thread jbellis
Modified: 
cassandra/trunk/src/java/org/apache/cassandra/db/TreeMapBackedSortedColumns.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/TreeMapBackedSortedColumns.java?rev=1225001&r1=1225000&r2=1225001&view=diff
==
--- 
cassandra/trunk/src/java/org/apache/cassandra/db/TreeMapBackedSortedColumns.java
 (original)
+++ 
cassandra/trunk/src/java/org/apache/cassandra/db/TreeMapBackedSortedColumns.java
 Tue Dec 27 20:17:17 2011
@@ -24,11 +24,15 @@ import java.util.SortedMap;
 import java.util.SortedSet;
 import java.util.TreeMap;
 
+import com.google.common.base.Function;
+
 import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.utils.Allocator;
 
-public class TreeMapBackedSortedColumns extends TreeMap<ByteBuffer, IColumn> implements ISortedColumns
+public class TreeMapBackedSortedColumns extends AbstractThreadUnsafeSortedColumns implements ISortedColumns
 {
+    private final TreeMap<ByteBuffer, IColumn> map;
+
     public static final ISortedColumns.Factory factory = new Factory()
     {
         public ISortedColumns create(AbstractType<?> comparator, boolean insertReversed)
@@ -49,17 +53,17 @@ public class TreeMapBackedSortedColumns

     public AbstractType<?> getComparator()
     {
-        return (AbstractType)comparator();
+        return (AbstractType)map.comparator();
     }

     private TreeMapBackedSortedColumns(AbstractType<?> comparator)
     {
-        super(comparator);
+        this.map = new TreeMap<ByteBuffer, IColumn>(comparator);
     }

     private TreeMapBackedSortedColumns(SortedMap<ByteBuffer, IColumn> columns)
     {
-        super(columns);
+        this.map = new TreeMap<ByteBuffer, IColumn>(columns);
     }

     public ISortedColumns.Factory getFactory()
@@ -69,7 +73,7 @@ public class TreeMapBackedSortedColumns

     public ISortedColumns cloneMe()
     {
-        return new TreeMapBackedSortedColumns(this);
+        return new TreeMapBackedSortedColumns(map);
     }
 
 public boolean isInsertReversed()
@@ -88,7 +92,7 @@ public class TreeMapBackedSortedColumns
         // but TreeMap lacks putAbsent.  Rather than split it into a get, then put check, we do it as follows,
         // which saves the extra get in the no-conflict case [for both normal and super columns],
         // in exchange for a re-put in the SuperColumn case.
-        IColumn oldColumn = put(name, column);
+        IColumn oldColumn = map.put(name, column);
         if (oldColumn != null)
         {
             if (oldColumn instanceof SuperColumn)
@@ -98,13 +102,13 @@ public class TreeMapBackedSortedColumns
                 // add the new one to the old, then place old back in the Map, rather than copy the old contents
                 // into the new Map entry.
                 ((SuperColumn) oldColumn).putColumn((SuperColumn)column, allocator);
-                put(name,  oldColumn);
+                map.put(name,  oldColumn);
             }
             else
             {
                 // calculate reconciled col from old (existing) col and new col
                 IColumn reconciledColumn = column.reconcile(oldColumn, allocator);
-                put(name, reconciledColumn);
+                map.put(name, reconciledColumn);
             }
         }
     }
@@ -112,10 +116,10 @@ public class TreeMapBackedSortedColumns
     /**
      * We need to go through each column in the column container and resolve it before adding
      */
-    public void addAll(ISortedColumns cm, Allocator allocator)
+    protected void addAllColumns(ISortedColumns cm, Allocator allocator, Function<IColumn, IColumn> transformation)
    {
         for (IColumn column : cm.getSortedColumns())
-            addColumn(column, allocator);
+            addColumn(transformation.apply(column), allocator);
 }
 
     public boolean replace(IColumn oldColumn, IColumn newColumn)
@@ -127,15 +131,15 @@ public class TreeMapBackedSortedColumns
         // column or the column was not equal to oldColumn (to be coherent
         // with other implementation). We optimize for the common case where
         // oldColumn do is present though.
-        IColumn previous = put(oldColumn.name(), newColumn);
+        IColumn previous = map.put(oldColumn.name(), newColumn);
         if (previous == null)
         {
-            remove(oldColumn.name());
+            map.remove(oldColumn.name());
             return false;
         }
         if (!previous.equals(oldColumn))
         {
-            put(oldColumn.name(), previous);
+            map.put(oldColumn.name(), previous);
             return false;
         }
         return true;
@@ -143,37 +147,42 @@ public class TreeMapBackedSortedColumns

     public IColumn getColumn(ByteBuffer name)
     {
-        return get(name);
+        return map.get(name);
     }

     public void removeColumn(ByteBuffer name)
     {
-        remove(name);
+        map.remove(name);
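The core of this commit's refactor can be sketched in isolation: composition instead of inheritance, so the columns class delegates to a private TreeMap rather than extending it, and the backing structure can later be swapped without changing callers. The class and method names below are illustrative stand-ins, not the real Cassandra types.

```java
import java.util.TreeMap;

// Composition over inheritance: hold the TreeMap in a private field and
// delegate to it, so the backing structure can later be swapped (e.g. for a
// snapshot-capable tree) while callers keep the same interface.
public class DelegatingColumns {
    private final TreeMap<String, String> map = new TreeMap<>();

    public void putColumn(String name, String value) {
        map.put(name, value); // delegate instead of inheriting put()
    }

    public String getColumn(String name) {
        return map.get(name); // null when the column is absent
    }

    public static void main(String[] args) {
        DelegatingColumns cols = new DelegatingColumns();
        cols.putColumn("col1", "v1");
        System.out.println(cols.getColumn("col1")); // prints "v1"
    }
}
```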

[jira] [Commented] (CASSANDRA-2893) Add row-level isolation

2011-12-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176309#comment-13176309
 ] 

Jonathan Ellis commented on CASSANDRA-2893:
---

Committed with the Functions.identity change.  Leaving open for potential 
performance enhancements.

 Add row-level isolation
 ---

 Key: CASSANDRA-2893
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2893
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-Move-deletion-infos-into-ISortedColumns-v2.patch, 
 0001-Move-deletion-infos-into-ISortedColumns.patch, 
 0002-Make-memtable-use-CF.addAll-v2.patch, 
 0002-Make-memtable-use-CF.addAll.patch, 
 0003-Add-AtomicSortedColumn-and-snapTree-v2.patch, 
 0003-Add-AtomicSortedColumn-and-snapTree.patch, latency-plain.svg, 
 latency.svg, snaptree-0.1-SNAPSHOT.jar


 This could be done using the atomic ConcurrentMap operations from the 
 Memtable and something like http://code.google.com/p/pcollections/ to replace 
 the ConcurrentSkipListMap in ThreadSafeSortedColumns.  The trick is that 
 pcollections does not provide a SortedMap, so we probably need to write our 
 own.
 Googling [persistent sortedmap] I found 
 http://code.google.com/p/actord/source/browse/trunk/actord/src/main/scala/ff/collection
  (in scala) and http://clojure.org/data_structures#Data Structures-Maps.
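The snapshot semantics discussed here can be illustrated with a deliberately naive copy-on-write sorted map: a writer clones the tree and atomically publishes the new version, so readers holding a snapshot never see a half-applied update. This is a hypothetical sketch of the isolation idea only; SnapTree achieves it with a cheap lazy clone rather than the full copy used here.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

public class CowSortedMap {
    private volatile TreeMap<String, String> current = new TreeMap<>();

    // Writers copy, modify, then atomically publish the new version.
    public synchronized void put(String k, String v) {
        TreeMap<String, String> next = new TreeMap<>(current); // full copy (naive)
        next.put(k, v);
        current = next; // volatile write publishes the new version atomically
    }

    // Readers get an immutable view of whichever version is current right now.
    public SortedMap<String, String> snapshot() {
        return Collections.unmodifiableSortedMap(current);
    }

    public static void main(String[] args) {
        CowSortedMap m = new CowSortedMap();
        m.put("col1", "old");
        SortedMap<String, String> snap = m.snapshot();
        m.put("col2", "new"); // does not disturb the reader's snapshot
        System.out.println(snap.size() + " " + m.snapshot().size()); // prints "1 2"
    }
}
```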





[jira] [Created] (CASSANDRA-3676) Add snaptree dependency to maven central and update pom

2011-12-27 Thread T Jake Luciani (Created) (JIRA)
Add snaptree dependency to maven central and update pom
---

 Key: CASSANDRA-3676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3676
 Project: Cassandra
  Issue Type: Sub-task
Reporter: T Jake Luciani
Assignee: Stephen Connolly
 Fix For: 1.1


Snaptree dependency needs to be added to maven before we can release 1.1





[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way

2011-12-27 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176313#comment-13176313
 ] 

Radim Kolar commented on CASSANDRA-3497:


The FP ratio is not displayed in the output of the cli commands 'show schema;' and 'describe;'.


 BloomFilter FP ratio should be configurable or size-restricted some other way
 -

 Key: CASSANDRA-3497
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.0.7

 Attachments: 0001-give-default-val-to-fp_chance.patch, 3497-v3.txt, 
 3497-v4.txt, CASSANDRA-1.0-3497.txt


 When you have a live dc and a purely analytical dc, in many situations you can 
 have fewer nodes on the analytical side, but you end up getting restricted by 
 having to hold the BloomFilters in memory, even though you have absolutely no 
 use for them.  It would be nice if you could reduce this memory requirement by 
 tuning the desired FP ratio, or even just disabling them altogether.
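To make the memory trade-off concrete, the standard Bloom filter sizing formula puts the cost at about -ln(p) / (ln 2)^2 bits per key for a target false-positive probability p, so relaxing p from 1% to 10% roughly halves the filter. A sketch of the math, not Cassandra's actual sizing code:

```java
// Standard Bloom filter sizing: to hit a target false-positive probability p,
// a filter needs roughly -ln(p) / (ln 2)^2 bits per key.
public class BloomSizing {
    static double bitsPerElement(double p) {
        return -Math.log(p) / (Math.log(2) * Math.log(2));
    }

    public static void main(String[] args) {
        // Accepting a higher FP ratio shrinks the filter, which is fine for an
        // analytical DC that rarely benefits from the filters at all.
        System.out.printf("p=0.01 -> %.1f bits/key%n", bitsPerElement(0.01)); // ~9.6
        System.out.printf("p=0.10 -> %.1f bits/key%n", bitsPerElement(0.10)); // ~4.8
    }
}
```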





svn commit: r1225002 - in /cassandra/trunk: CHANGES.txt NEWS.txt

2011-12-27 Thread jbellis
Author: jbellis
Date: Tue Dec 27 20:26:48 2011
New Revision: 1225002

URL: http://svn.apache.org/viewvc?rev=1225002&view=rev
Log:
update CHANGES, NEWS

Modified:
cassandra/trunk/CHANGES.txt
cassandra/trunk/NEWS.txt

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1225002&r1=1225001&r2=1225002&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Dec 27 20:26:48 2011
@@ -1,4 +1,5 @@
 1.1-dev
+ * add row-level isolation via SnapTree (CASSANDRA-2893)
  * Optimize key count estimation when opening sstable on startup
(CASSANDRA-2988)
 * multi-dc replication optimization supporting CL > ONE (CASSANDRA-3577)

Modified: cassandra/trunk/NEWS.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/NEWS.txt?rev=1225002&r1=1225001&r2=1225002&view=diff
==
--- cassandra/trunk/NEWS.txt (original)
+++ cassandra/trunk/NEWS.txt Tue Dec 27 20:26:48 2011
@@ -35,6 +35,14 @@ Upgrading
   and row_cache_{size_in_mb, save_period} in conf/cassandra.yaml are
   used instead of per-ColumnFamily options.
 
+Features
+
+- Cassandra 1.1 adds row-level isolation.  Multi-column updates to
+  a single row have always been *atomic* (either all will be applied,
+  or none) thanks to the CommitLog, but until 1.1 they were not *isolated*
+  -- a reader may see mixed old and new values while the update happens.
+
+
 1.0.6
 =
 




[jira] [Commented] (CASSANDRA-3676) Add snaptree dependency to maven central and update pom

2011-12-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176314#comment-13176314
 ] 

Jonathan Ellis commented on CASSANDRA-3676:
---

https://github.com/nbronson/snaptree






[jira] [Created] (CASSANDRA-3677) NPE during HH delivery when gossip turned off on target

2011-12-27 Thread Radim Kolar (Created) (JIRA)
NPE during HH delivery when gossip turned off on target
---

 Key: CASSANDRA-3677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3677
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.6
Reporter: Radim Kolar
Priority: Trivial


Probably not an important bug.

ERROR [OptionalTasks:1] 2011-12-27 21:44:25,342 AbstractCassandraDaemon.java 
(line 138) Fatal exception in thread Thread[OptionalTasks:1,5,main]
java.lang.NullPointerException
at 
org.cliffc.high_scale_lib.NonBlockingHashMap.hash(NonBlockingHashMap.java:113)
at 
org.cliffc.high_scale_lib.NonBlockingHashMap.putIfMatch(NonBlockingHashMap.java:553)
at 
org.cliffc.high_scale_lib.NonBlockingHashMap.putIfMatch(NonBlockingHashMap.java:348)
at 
org.cliffc.high_scale_lib.NonBlockingHashMap.putIfAbsent(NonBlockingHashMap.java:319)
at 
org.cliffc.high_scale_lib.NonBlockingHashSet.add(NonBlockingHashSet.java:32)
at 
org.apache.cassandra.db.HintedHandOffManager.scheduleHintDelivery(HintedHandOffManager.java:371)
at 
org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:356)
at 
org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:84)
at 
org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:119)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at 
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:165)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:267)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
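The trace bottoms out in NonBlockingHashMap.hash(), which throws an NPE on a null key, suggesting the endpoint passed to the set's add() was null because the target's state was unknown with gossip off. A defensive guard of that general kind is sketched below; this is purely hypothetical code, not the actual HintedHandOffManager fix.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class HintScheduleGuard {
    private final Set<String> queuedDeliveries = ConcurrentHashMap.newKeySet();

    // Returns true when a delivery was newly scheduled; false when the target
    // is unknown (e.g. gossip disabled) or a delivery is already queued.
    public boolean scheduleHintDelivery(String endpoint) {
        if (endpoint == null)
            return false; // skip instead of letting the set throw an NPE
        return queuedDeliveries.add(endpoint);
    }

    public static void main(String[] args) {
        HintScheduleGuard guard = new HintScheduleGuard();
        System.out.println(guard.scheduleHintDelivery(null));       // prints "false"
        System.out.println(guard.scheduleHintDelivery("10.0.0.1")); // prints "true"
    }
}
```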






[jira] [Commented] (CASSANDRA-3658) Fix smallish problems find by FindBugs

2011-12-27 Thread Nick Bailey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176320#comment-13176320
 ] 

Nick Bailey commented on CASSANDRA-3658:


This breaks a bunch of JMX stuff. A fair number of JMX methods return Token 
objects, so they need to be serializable. I plan on doing CASSANDRA-2805 for 
1.1, but JMX will be broken in trunk until that's done, unless that specific 
patch is reverted.

 Fix smallish problems find by FindBugs
 --

 Key: CASSANDRA-3658
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3658
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: fingbugs
 Fix For: 1.1

 Attachments: 0001-Respect-Future-semantic.patch, 
 0002-Avoid-race-when-reloading-snitch-file.patch, 
 0003-use-static-inner-class-when-possible.patch, 0004-Remove-dead-code.patch, 
 0005-Protect-against-signed-byte-extension.patch, 
 0006-Add-hashCode-method-when-equals-is-overriden.patch, 
 0007-Inverse-argument-of-compare-instead-of-negating-to-a.patch, 
 0008-stop-pretending-Token-is-Serializable-LocalToken-is-.patch, 
 0009-remove-useless-assert-that-is-always-true.patch, 
 0010-Add-equals-and-hashCode-to-Expiring-column.patch


 I've just run (the newly released) FindBugs 2 out of curiosity. Attaching a 
 number of patches related to issues it raised. There is nothing major at all, 
 so all patches are against trunk.
 I've tried to keep each issue in its own patch with a self-describing title. 
 It's far from covering all FindBugs alerts, but it's a picky tool, so I've 
 tried to address only what felt at least vaguely useful. These are still 
 mostly nits (only patch 2 is probably an actual bug).





svn commit: r1225014 - in /cassandra/trunk/src/java/org/apache/cassandra/dht: LocalToken.java Token.java

2011-12-27 Thread jbellis
Author: jbellis
Date: Tue Dec 27 21:01:57 2011
New Revision: 1225014

URL: http://svn.apache.org/viewvc?rev=1225014&view=rev
Log:
make Token serializable again for JMX

Modified:
cassandra/trunk/src/java/org/apache/cassandra/dht/LocalToken.java
cassandra/trunk/src/java/org/apache/cassandra/dht/Token.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/dht/LocalToken.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/dht/LocalToken.java?rev=1225014&r1=1225013&r2=1225014&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/dht/LocalToken.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/dht/LocalToken.java Tue Dec 
27 21:01:57 2011
@@ -24,6 +24,8 @@ import org.apache.cassandra.db.marshal.A
 
 public class LocalToken extends Token<ByteBuffer>
 {
+static final long serialVersionUID = 8437543776403014875L;
+
 private final AbstractType comparator;
 
 public LocalToken(AbstractType comparator, ByteBuffer token)

Modified: cassandra/trunk/src/java/org/apache/cassandra/dht/Token.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/dht/Token.java?rev=1225014&r1=1225013&r2=1225014&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/dht/Token.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/dht/Token.java Tue Dec 27 
21:01:57 2011
@@ -30,7 +30,7 @@ import org.apache.cassandra.io.ISerializ
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.ByteBufferUtil;
 
-public abstract class Token<T> implements RingPosition<Token<T>>
+public abstract class Token<T> implements RingPosition<Token<T>>, Serializable
 {
 private static final long serialVersionUID = 1L;
 




[jira] [Commented] (CASSANDRA-3658) Fix smallish problems find by FindBugs

2011-12-27 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176324#comment-13176324
 ] 

Jonathan Ellis commented on CASSANDRA-3658:
---

reverted 0008 for now






[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2011-12-27 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176341#comment-13176341
 ] 

paul cannon commented on CASSANDRA-3507:


bq. Is it possible for the ASF contributors to vote on code that isn't in the 
official tree, like, say, a particular tag of the python CQL driver at Apache 
Extras? If we can distribute the drivers in the same official repository, most 
of these problems go away.

I read through all the rules I can find, and I see nothing prohibiting us from 
voting on and releasing specific source/binary artifacts of the various cql 
drivers alongside c*, as long as they follow the ASF licensing restrictions.

http://www.apache.org/dev/release.html#distribute-other-artifacts seems the 
most apropos.

So, I propose that we call a vote for a Cassandra project release of 
cassandra-dbapi2, alias python-cql, once I get the ASF licensing stuff sorted 
in it, and tag and post its 1.0.7 version. Then we can put the python-cql debs 
in the official debian repository, and everything is happy.

 Proposal: separate cqlsh from CQL drivers
 -

 Key: CASSANDRA-3507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging, Tools
Affects Versions: 1.0.3
 Environment: Debian-based systems
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql, cqlsh
 Fix For: 1.1


 Whereas:
 * It has been shown to be very desirable to decouple the release cycles of 
 Cassandra from the various client CQL drivers, and
 * It is also desirable to include a good interactive CQL client with releases 
 of Cassandra, and
 * It is not desirable for Cassandra releases to depend on 3rd-party software 
 which is neither bundled with Cassandra nor readily available for every 
 target platform, but
 * Any good interactive CQL client will require a CQL driver;
 Therefore, be it resolved that:
 * cqlsh will not use an official or supported CQL driver, but will include 
 its own private CQL driver, not intended for use by anything else, and
 * the Cassandra project will still recommend installing and using a proper 
 CQL driver for client software.
 To ease maintenance, the private CQL driver included with cqlsh may very well 
 be created by copying the python CQL driver from one directory into 
 another, but the user shouldn't rely on this. Maybe we even ought to take 
 some minor steps to discourage its use for other purposes.
 Thoughts?





[Cassandra Wiki] Update of NodeTool by JanneJalkanen

2011-12-27 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The NodeTool page has been changed by JanneJalkanen:
http://wiki.apache.org/cassandra/NodeTool?action=diff&rev1=21&rev2=22

Comment:
setcompactionthroughput documented

  == Scrub ==
  Cassandra v0.7.1 and v0.7.2 shipped with a bug that caused incorrect 
row-level bloom filters to be generated when compacting sstables generated with 
earlier versions.  This would manifest in IOExceptions during column name-based 
queries.  v0.7.3 provides nodetool scrub to rebuild sstables with correct 
bloom filters, with no data lost. (If your cluster was never on 0.7.0 or 
earlier, you don't have to worry about this.)  Note that nodetool scrub will 
snapshot your data files before rebuilding, just in case.
  
- == upgradesstables ==
+ == Upgradesstables ==
  
  While scrub does rebuild your sstables, it will also discard data it deems 
broken and create a snapshot, which you have to remove manually.  If you just 
wish to rebuild your sstables without all that jazz, then use nodetool 
upgradesstables.  This is useful e.g. when you are upgrading your server, or 
changing compression options.
  
  upgradesstables is available from Cassandra 1.0.4 onwards.
+ 
+ == Setcompactionthroughput ==
+ 
+ As of Cassandra 1.0, the amount of resources that compactions may use is 
controlled by a single value: the compaction throughput, expressed in 
megabytes per second.  You can (and probably should) specify this in your 
cassandra.yaml file, but in some cases it can be very beneficial to change 
it live using nodetool.
+ 
+ For example, in 
[[http://www.slideshare.net/edwardcapriolo/m6d-cassandrapresentation|this 
presentation]] Edward Capriolo explains how their company throttles compaction 
during the day so that I/O is mostly reserved for serving requests, whereas at 
night they allocate more capacity to running compactions.  This can be 
accomplished with a simple cron script, e.g.:
+ 
+ {{{
+ # Increase compaction throughput to 999 MB/s (i.e. nearly unlimited) between 00:00 and 06:00.
+ #
+ # turn into Mr. Batch at night
+ 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
+ # turn back into Dr. Realtime for the day
+ 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
+ }}}
+ 
+ Setting the compaction throughput to zero disables compaction entirely.  
This may be useful if you wish to avoid compaction I/O during extremely busy 
periods.  It is not a good idea to leave compaction disabled for a long 
period, since you will end up with a large number of very small sstables, 
which will start to slow down your reads.
  
  == Cfhistograms ==
  


[jira] [Updated] (CASSANDRA-3611) Make checksum on a compressed blocks optional

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3611:
-

Attachment: 0001-crc-check-chance-v3.patch

Done, Thanks!

 Make checksum on a compressed blocks optional
 -

 Key: CASSANDRA-3611
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3611
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
Priority: Minor
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-crc-check-chance-v2.patch, 
 0001-crc-check-chance-v3.patch, 0001-crc-check-chance.patch


 Currently every uncompressed block is run through the checksum algorithm, 
 which incurs CPU overhead... We might want to make this configurable/optional 
 for use cases that do not require checksumming all the time.
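The knob this patch series appears to introduce (a per-read "crc check chance") can be sketched as follows. This is an illustrative Python model only, not Cassandra's actual code: the function and parameter names are assumptions, and whether the checksum covers the compressed or uncompressed bytes is a detail glossed over here.

```python
import random
import zlib

def read_block(compressed, expected_crc, crc_check_chance=1.0, rng=random.random):
    # Decompress a block, verifying its CRC only a fraction of the time.
    # crc_check_chance=1.0 always verifies; 0.0 never does, trading
    # corruption detection for CPU.
    data = zlib.decompress(compressed)
    if rng() < crc_check_chance:
        if zlib.crc32(data) & 0xFFFFFFFF != expected_crc:
            raise IOError("checksum mismatch on block")
    return data
```

With a chance between 0 and 1, corruption is still caught eventually (on some fraction of reads) while the average CPU cost of checksumming drops proportionally.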





[jira] [Commented] (CASSANDRA-3583) Add rebuild index JMX command

2011-12-27 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176352#comment-13176352
 ] 

Vijay commented on CASSANDRA-3583:
--

Hi Jonathan,

I think you also want to clear the Built flag on the index(es) or the rebuild 
will be incomplete if you cancel or restart partway through.
In the current patch we don't need to, since the rebuilt indexes will be in 
new SSTables, and if someone stops partway through it won't be any worse than 
it was before... Otherwise we would have to clear the flag when we start and 
reset it at the end, and in that window clients might notice some additional 
missing indexes. Agree?
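The bookkeeping being debated (clear the "built" flag up front so an aborted rebuild is visibly incomplete) can be sketched as follows. `FakeIndex` and `rebuild_index` are hypothetical names for illustration, not Cassandra's API:

```python
class FakeIndex:
    """Stand-in for a secondary index; 'built' mirrors the persisted flag."""
    def __init__(self, fail=False):
        self.built = True
        self.fail = fail

    def build(self):
        if self.fail:
            raise RuntimeError("rebuild aborted")

def rebuild_index(index):
    # Clear the 'built' flag before rebuilding, so an abort or restart
    # partway through leaves the index marked incomplete rather than
    # silently stale.
    index.built = False
    index.build()          # may raise; the flag then stays False
    index.built = True
```

The trade-off Vijay raises is visible here: while `build()` runs, clients see `built == False` and may treat the index as missing.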

 Add rebuild index JMX command
 ---

 Key: CASSANDRA-3583
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3583
 Project: Cassandra
  Issue Type: New Feature
  Components: Core, Tools
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-3583.patch


 CASSANDRA-1740 allows aborting an index build, but there is no way to 
 re-attempt the build without restarting the server.
 We've also had requests to allow rebuilding an index that *has* been built, 
 so it would be nice to kill two birds with one stone here.





[jira] [Updated] (CASSANDRA-3631) While sleeping for RING_DELAY, bootstrapping nodes do not show as joining in the ring (or at all)

2011-12-27 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3631:
-

Attachment: 0001-add-initializing-status-in-nodetool-for-3631.patch

Hi Brandon, let me know if the attached patch is sufficient... It looks like 
the following:

Address         DC      Rack  Status  State   Load       Owns    Token
                                                                 143633478586163499463326301508681906517
10.123.42.165   us-east 1a    Down    Init    ?          ?       ?
10.42.134.229   us-east 1a    Up      Normal  1.74 GB    35.40%  33724529808132598296109669138912087817
10.93.19.6      us-east 1a    Down    Normal  1.78 GB    38.69%  99546828780538918038465713665698202555
10.93.74.164    us-east 1a    Up      Normal  120.37 MB  2.53%   103845090001698524309715695190561870103
10.123.59.26    us-east 1a    Up      Normal  1.15 GB    23.39%  143633478586163499463326301508681906517

 While sleeping for RING_DELAY, bootstrapping nodes do not show as joining in 
 the ring (or at all)
 -

 Key: CASSANDRA-3631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3631
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.0
Reporter: Brandon Williams
Assignee: Vijay
Priority: Minor
 Fix For: 1.0.7

 Attachments: 0001-add-initializing-status-in-nodetool-for-3631.patch


 As the title says, the nodes do not show in the ring until they are actually 
 in the token selection/streaming phase.  This appears due to CASSANDRA-957, 
 but now can be further exacerbated by longer sleep times for CASSANDRA-3629.





[jira] [Commented] (CASSANDRA-3112) Make repair fail when an unexpected error occurs

2011-12-27 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176420#comment-13176420
 ] 

Vijay commented on CASSANDRA-3112:
--

But do you know what is the reason for it making no progress? Because unless 
we know what can cause it, not sure what to fix?
It is usually in the streaming phase; I think adding an SoTimeout might fix 
it, but it is so random that I couldn't reproduce it in my tests, though I 
definitely see it in production.

How can we lose messages, aren't tcp supposed to avoid this?
Once you send the message, the other node might get restarted (without 
validating or starting anything), or the socket can get reset. Actually, I 
think when I posted this it was because of CASSANDRA-3577. There is nothing 
like hints or a retry for the messages sent for repairs.

I understand this isn't in the scope of this ticket, but I still think there 
should be a way to orchestrate repairs with somewhat more complicated logic, 
and I will try to do some parts of it in the other ticket.
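The SoTimeout idea amounts to bounding how long a blocked stream read may wait, so a dead peer surfaces as an error instead of a repair that hangs forever. A minimal sketch in Python terms (Cassandra's streaming is Java; the function name is made up):

```python
import socket

def read_with_timeout(sock, nbytes, timeout_secs=10.0):
    # Bound how long a blocked read may wait: a dead or silent peer then
    # raises socket.timeout instead of blocking the repair indefinitely.
    sock.settimeout(timeout_secs)
    return sock.recv(nbytes)
```

The caller can treat the timeout like any other stream failure and fail the repair session rather than hang.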




 Make repair fail when an unexpected error occurs
 

 Key: CASSANDRA-3112
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3112
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
  Labels: repair
 Fix For: 1.1

 Attachments: 0003-Report-streaming-errors-back-to-repair-v4.patch, 
 0004-Reports-validation-compaction-errors-back-to-repair-v4.patch


 CASSANDRA-2433 makes it so that nodetool repair will fail if a node 
 participating to repair dies before completing his part of the repair. This 
 handles most of the situation where repair was previously hanging, but repair 
 can still hang if an unexpected error occurs during either the merkle tree 
 creation (an on-disk corruption triggers an IOError say) or during streaming 
 (though I'm not sure what could make streaming failed outside of 'one of the 
 node died' (besides a bug)).





[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-12-27 Thread Matt Stump (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176455#comment-13176455
 ] 

Matt Stump commented on CASSANDRA-2474:
---

I wanted to bring this up because it hasn't been mentioned yet, and it's 
currently a topic of discussion on the hector-users list: for query results of 
composite columns, are you going to deserialize the column name or leave it as 
an opaque blob? In Hector's current implementation of composite columns, the 
type information for dynamic composites is encoded in the name, but that 
information is lacking for the static variety. My understanding is that the 
type information is only stored at the CFDef level as the type alias; it could 
possibly be cached to aid deserialization, but that seems like a bit of a hack.

 CQL support for compound columns
 

 Key: CASSANDRA-2474
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 1.1

 Attachments: 2474-transposed-1.PNG, 2474-transposed-raw.PNG, 
 2474-transposed-select-no-sparse.PNG, 2474-transposed-select.PNG, 
 raw_composite.txt, screenshot-1.jpg, screenshot-2.jpg


 For the most part, this boils down to supporting the specification of 
 compound column names (the CQL syntax is colon-delimited terms), and then 
 teaching the decoders (drivers) to create structures from the results.
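For reference, a compound column name can be modeled as a sequence of length-prefixed components. The sketch below mirrors the on-disk layout I understand Cassandra's CompositeType to use (2-byte big-endian length, component bytes, one end-of-component byte); treat it as an illustration, not a supported API:

```python
import struct

def encode_composite(components):
    # Each component: 2-byte big-endian length, raw bytes, then a single
    # end-of-component byte (0).
    out = b""
    for c in components:
        out += struct.pack(">H", len(c)) + c + b"\x00"
    return out

def decode_composite(name):
    # Inverse of encode_composite: walk the buffer, reading one
    # length-prefixed component at a time.
    parts, i = [], 0
    while i < len(name):
        (length,) = struct.unpack_from(">H", name, i)
        i += 2
        parts.append(name[i:i + length])
        i += length + 1  # skip the end-of-component byte
    return parts
```

A driver that knows the per-component comparators (from the CFDef type alias) could then deserialize each part; without that metadata the parts stay opaque blobs, which is exactly the problem raised above.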





[jira] [Created] (CASSANDRA-3678) New Pluggable Compaction to handle Capped Rows / Super Columns

2011-12-27 Thread Praveen Baratam (Created) (JIRA)
New Pluggable Compaction to handle Capped Rows / Super Columns
--

 Key: CASSANDRA-3678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3678
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Contrib, Core
 Environment: ALL
Reporter: Praveen Baratam


Now that pluggable compaction is released, it is feasible to implement a 
CompactionStrategy that handles capped (limited-in-size) rows or SuperColumns 
in a ColumnFamily. This feature has been requested many times on the mailing 
lists, by many people including me.

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Use-Case-scenario-Keeping-a-window-of-data-online-analytics-td4694907.html

The above thread was also quoted in Cassandra - Use Cases.

Reading and interpreting many conversations on this issue, I could infer that 
it has been discussed in two flavors:

1. Enforcing a maximum number of columns per row/SC
2. A sliding time window

The MEMTABLE/SSTABLE approach of Cassandra is often quoted as a limiting 
factor for a workable implementation. From my perspective the SSTABLE approach 
may require some trade-offs and clever engineering, but it is still doable.

This feature is not intended to offer a drop-in replacement for specialized 
tools like RRDtool, JRobin, etc., but to decrease the overhead of retrofitting 
such functionality into Cassandra, finding an approach that achieves the 
principal purpose of discarding obsolete data while stretching only as far as 
necessary.

This ticket is to discuss ideas and implementation details of such a 
compaction strategy.
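The two flavors can be sketched as per-row filters that a compaction strategy would apply when rewriting a row. This is an illustrative Python model of the idea only; the function names and the row representation (column name mapped to a value/timestamp pair) are made up for the sketch:

```python
def cap_row(columns, max_columns):
    # Flavor 1: keep only the newest max_columns entries of a row.
    # columns maps column name -> (value, timestamp).
    newest = sorted(columns.items(), key=lambda kv: kv[1][1], reverse=True)
    return dict(newest[:max_columns])

def window_row(columns, now, window):
    # Flavor 2: keep only columns written inside the sliding time window.
    return {name: (value, ts)
            for name, (value, ts) in columns.items()
            if ts >= now - window}
```

Because compaction already rewrites every surviving column of a row, dropping the excess ones at that point discards obsolete data without any extra I/O pass, which is the trade-off the SSTABLE approach makes attractive.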





[jira] [Updated] (CASSANDRA-3678) New Pluggable Compaction to handle Capped Rows / Super Columns

2011-12-27 Thread Praveen Baratam (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Praveen Baratam updated CASSANDRA-3678:
---

Description: 
Now that Pluggable Compaction is released, its feasible to implement a 
CompactionStrategy that handles Capped (Limited in size) Rows or SuperColumns 
in a ColumnFamily. This feature was requested many times on mailing lists by 
many people including me.

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Use-Case-scenario-Keeping-a-window-of-data-online-analytics-td4694907.html

The above thread was quoted in Cassandra - Use Cases too.

Reading and interpreting many conversations over this issue, I could infer that 
it was discussed in two flavors.

1. Enforcing Max Columns per Row/SC 
2. Sliding Time Window


Many a times MEMTABLE/SSTABLE approach of Cassandra is quoted as a limiting 
factor for an amicable implementation. In  my perspective the above mentioned 
SSTABLE approach could mean some trade-offs and clever engineering but its 
still doable.

This feature is not intended to offer a drop-in replacement for specialized 
tools like RRDTool, jRobin, etc. but to decrease the overhead of retro fitting 
such functionality into CASSANDRA and finding an approach that achieves the 
principal purpose of discarding obsolete data and stretching only as far as 
necessary.

  was:
Now that Pluggable Compaction is released, its feasible to implement a 
CompactionStrategy that handles Capped (Limited in size) Rows or SuperColumns 
in a ColumnFamily. This feature was requested many times on mailing lists by 
many people including me.

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Use-Case-scenario-Keeping-a-window-of-data-online-analytics-td4694907.html

The above thread was quoted in Cassandra - Use Cases too.

Reading and interpreting many conversations over this issue, I could infer that 
it was discussed in two flavors.

1. Enforcing Max Columns per Row/SC 
2. Sliding Time Window


Many a times MEMTABLE/SSTABLE approach of Cassandra is quoted as a limiting 
factor for an amicable implementation. In  my perspective the above mentioned 
SSTABLE approach could mean some trade-offs and clever engineering but its 
still doable.

This feature is not intended to offer a drop-in replacement for specialized 
tools like RRDTool, jRobin, etc. but to decrease the overhead of retro fitting 
such functionality into CASSANDRA and finding an approach that achieves the 
principal purpose of discarding obsolete data and stretching only as far as 
necessary.

This ticket is to discuss ideas and implementation details of such compaction 
strategy.


 New Pluggable Compaction to handle Capped Rows / Super Columns
 --

 Key: CASSANDRA-3678
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3678
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Contrib, Core
 Environment: ALL
Reporter: Praveen Baratam
  Labels: features
   Original Estimate: 672h
  Remaining Estimate: 672h

 Now that pluggable compaction is released, it is feasible to implement a 
 CompactionStrategy that handles capped (limited-in-size) rows or SuperColumns 
 in a ColumnFamily. This feature has been requested many times on the mailing 
 lists, by many people including me.
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Use-Case-scenario-Keeping-a-window-of-data-online-analytics-td4694907.html
 The above thread was also quoted in Cassandra - Use Cases.
 Reading and interpreting many conversations on this issue, I could infer that 
 it has been discussed in two flavors:
 1. Enforcing a maximum number of columns per row/SC
 2. A sliding time window
 The MEMTABLE/SSTABLE approach of Cassandra is often quoted as a limiting 
 factor for a workable implementation. From my perspective the SSTABLE 
 approach may require some trade-offs and clever engineering, but it is still 
 doable.
 This feature is not intended to offer a drop-in replacement for specialized 
 tools like RRDtool, JRobin, etc., but to decrease the overhead of 
 retrofitting such functionality into Cassandra, finding an approach that 
 achieves the principal purpose of discarding obsolete data while stretching 
 only as far as necessary.
