[jira] Commented: (CASSANDRA-2014) Can't delete whole row from Hadoop MapReduce

2011-02-17 Thread Patrik Modesto (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995728#comment-12995728
 ] 

Patrik Modesto commented on CASSANDRA-2014:
---

Hi, can the fix be merged please?

> Can't delete whole row from Hadoop MapReduce
> 
>
> Key: CASSANDRA-2014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.7.0
> Environment: Debian Linux 2.6.32 amd64
>Reporter: Patrik Modesto
> Fix For: 0.7.2
>
> Attachments: 2014-mr-delete-whole-row.patch
>
>
> ColumnFamilyRecordWriter.java doesn't support a Mutation whose Deletion has 
> neither a slice predicate nor a super_column, i.e. a whole-row delete. The 
> other thing I tried was a SlicePredicate with empty start and finish, and I got:
> {code}
> java.io.IOException: InvalidRequestException(why:Deletion does not yet 
> support SliceRange predicates.)
> at 
> org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:355)
> {code}
> I tried to patch ColumnFamilyRecordWriter.java like this:
> {code}
> --- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
> +++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
> @@ -166,10 +166,17 @@ implements org.apache.hadoop.mapred.RecordWriter
>  // deletion
>  Deletion deletion = new Deletion(amut.deletion.timestamp);
>  mutation.setDeletion(deletion);
> +
>  org.apache.cassandra.avro.SlicePredicate apred = 
> amut.deletion.predicate;
> -if (amut.deletion.super_column != null)
> +if (apred == null && amut.deletion.super_column == null)
> +{
> +// empty; delete whole row
> +}
> +else if (amut.deletion.super_column != null)
> +{
>  // super column
>  deletion.setSuper_column(copy(amut.deletion.super_column));
> +}
>  else if (apred.column_names != null)
>  {
>  // column names
> {code}
> but that didn't work either.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (CASSANDRA-2180) Cassandra node should log in system.log how and when it was signalled to stop and when it stopped and if it stopped without problems

2011-02-17 Thread Mateusz Korniak (JIRA)
Cassandra node should log in system.log how and when it was signalled to stop 
and when it stopped and if it stopped without problems


 Key: CASSANDRA-2180
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2180
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.1
 Environment: Linux
Reporter: Mateusz Korniak
Priority: Minor


During startup we see: 
 INFO [main] 2011-02-17 10:47:46,754 AbstractCassandraDaemon.java (line 77) 
Logging initialized
But when the node is stopped via kill -TERM node.pid, the last message we see 
in the logs is:
 INFO [Thread-3] 2011-02-17 10:49:22,290 CassandraDaemon.java (line 153) 
Listening for thrift clients...

It would be nice to also log when the Cassandra node was signalled to shut 
down and whether the whole shutdown went smoothly.
TIA
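
The kind of message being requested could be approximated with a JVM shutdown 
hook, which the JVM runs when the process exits normally or receives SIGTERM 
(for example from kill -TERM). The sketch below is only an illustration under 
assumptions: the class name ShutdownLogging and the message format are invented 
here, it prints to System.out rather than going through Cassandra's logging 
setup, and it is not taken from the Cassandra code base. It can only record 
that a shutdown was requested, not that it finished without problems.

```java
import java.time.Instant;

final class ShutdownLogging {
    // Builds the line to log; kept separate so it can be checked without exiting the JVM.
    // The format is an assumption made for this example.
    static String shutdownMessage(Instant when) {
        return "INFO [shutdown-hook] " + when + " Shutdown requested; stopping node";
    }

    // Registers a hook that the JVM runs on normal exit or on SIGTERM/SIGINT.
    // It does not run on SIGKILL (kill -9) or if the JVM itself crashes.
    static Thread installHook() {
        Thread hook = new Thread(
            () -> System.out.println(shutdownMessage(Instant.now())),
            "shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        installHook();
        System.out.println("Running; send SIGTERM (kill -TERM <pid>) to see the hook fire");
        Thread.sleep(60_000);
    }
}
```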






[jira] Updated: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2070:
---

Attachment: CASSANDRA-2070.patch

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2070.patch
>
>






[jira] Updated: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2070:
---

Remaining Estimate: 4h
 Original Estimate: 4h

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.8
>
> Attachments: CASSANDRA-2070.patch
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>






[jira] Created: (CASSANDRA-2181) sstable2json should return better error message if the usage is wrong

2011-02-17 Thread Shotaro Kamio (JIRA)
sstable2json should return better error message if the usage is wrong
-

 Key: CASSANDRA-2181
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2181
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.7.2
 Environment: linux
Reporter: Shotaro Kamio
Priority: Minor


These errors are not user friendly.
(Cassandra 0.7.2)

$ bin/sstable2json PATH_TO/Order-f-7-Data.db -k aaa -x 0
 WARN 21:55:34,383 Schema definitions were defined both locally and in 
cassandra.yaml. Definitions in cassandra.yaml were ignored.
{
"aaa": {Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
at 
org.apache.cassandra.tools.SSTableExport.serializeRow(SSTableExport.java:178)
at 
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:310)
at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:444)

$ bin/sstable2json PATH_TO/Order-f-7-Data.db -k aaa 
 WARN 21:55:49,603 Schema definitions were defined both locally and in 
cassandra.yaml. Definitions in cassandra.yaml were ignored.
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.cassandra.tools.SSTableExport.export(SSTableExport.java:284)
at org.apache.cassandra.tools.SSTableExport.main(SSTableExport.java:444)
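
One generic way to make failures like these friendlier is to check for missing 
values up front and report a single line instead of a stack trace. The sketch 
below is not sstable2json's code and does not diagnose the 
NullPointerException above; FriendlyCli, requireFound and run are hypothetical 
names used only for illustration.

```java
final class FriendlyCli {
    // Hypothetical guard: fail with a readable message instead of letting a null propagate.
    static <T> T requireFound(T value, String description) {
        if (value == null)
            throw new IllegalArgumentException(description + " not found; check the arguments");
        return value;
    }

    // Runs a tool body and maps argument errors to a one-line message and a non-zero exit code.
    static int run(Runnable body, StringBuilder err) {
        try {
            body.run();
            return 0;
        } catch (IllegalArgumentException e) {
            err.append("error: ").append(e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        StringBuilder err = new StringBuilder();
        // Simulate a lookup that came back empty.
        int code = run(() -> requireFound(null, "row for the requested key"), err);
        System.err.println(err + " (exit code " + code + ")");
    }
}
```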







[jira] Resolved: (CASSANDRA-2180) Cassandra node should log in system.log how and when it was signalled to stop and when it stopped and if it stopped without problems

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2180.
---

Resolution: Invalid

Cassandra is crash-only. See 
http://www.usenix.org/events/hotos03/tech/candea.html for background.

> Cassandra node should log in system.log how and when it was signalled to stop 
> and when it stopped and if it stopped without problems
> 
>
> Key: CASSANDRA-2180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2180
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
> Environment: Linux
>Reporter: Mateusz Korniak
>Priority: Minor
>





svn commit: r1071636 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:08:24 2011
New Revision: 1071636

URL: http://svn.apache.org/viewvc?rev=1071636&view=rev
Log:
Handle whole-row deletions in CFOutputFormat
patch by Patrik Modesto; reviewed by jbellis for CASSANDRA-2014

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1071636&r1=1071635&r2=1071636&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Feb 17 15:08:24 2011
@@ -3,8 +3,10 @@
  * lower-latency read repair (CASSANDRA-2069)
  * add hinted_handoff_throttle_delay_in_ms option (CASSANDRA-2161)
  * fixes for cache save/load (CASSANDRA-2172, -2174)
+ * Handle whole-row deletions in CFOutputFormat (CASSANDRA-2014)
  * Make memtable_flush_writers flush in parallel (CASSANDRA-2178)
 
+
 0.7.2
  * copy DecoratedKey.key when inserting into caches to avoid retaining
a reference to the underlying buffer (CASSANDRA-2102)

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java?rev=1071636&r1=1071635&r2=1071636&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
 Thu Feb 17 15:08:24 2011
@@ -143,33 +143,23 @@ implements org.apache.hadoop.mapred.Reco
 {
 Mutation mutation = new Mutation();
 org.apache.cassandra.avro.ColumnOrSuperColumn acosc = 
amut.column_or_supercolumn;
-if (acosc != null)
-{
-// creation
-ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
-mutation.setColumn_or_supercolumn(cosc);
-if (acosc.column != null)
-// standard column
-cosc.setColumn(avroToThrift(acosc.column));
-else
-{
-// super column
-ByteBuffer scolname = acosc.super_column.name;
-List scolcols = new 
ArrayList(acosc.super_column.columns.size());
-for (org.apache.cassandra.avro.Column acol : 
acosc.super_column.columns)
-scolcols.add(avroToThrift(acol));
-cosc.setSuper_column(new SuperColumn(scolname, scolcols));
-}
-}
-else
+if (acosc == null)
 {
 // deletion
+assert amut.deletion != null;
 Deletion deletion = new Deletion(amut.deletion.timestamp);
 mutation.setDeletion(deletion);
+
 org.apache.cassandra.avro.SlicePredicate apred = 
amut.deletion.predicate;
-if (amut.deletion.super_column != null)
+if (apred == null && amut.deletion.super_column == null)
+{
+// leave Deletion alone to delete entire row
+}
+else if (amut.deletion.super_column != null)
+{
 // super column
 
deletion.setSuper_column(ByteBufferUtil.getArray(amut.deletion.super_column));
+}
 else if (apred.column_names != null)
 {
 // column names
@@ -184,6 +174,24 @@ implements org.apache.hadoop.mapred.Reco
 deletion.setPredicate(new 
SlicePredicate().setSlice_range(avroToThrift(apred.slice_range)));
 }
 }
+else
+{
+// creation
+ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
+mutation.setColumn_or_supercolumn(cosc);
+if (acosc.column != null)
+// standard column
+cosc.setColumn(avroToThrift(acosc.column));
+else
+{
+// super column
+ByteBuffer scolname = acosc.super_column.name;
+List scolcols = new 
ArrayList(acosc.super_column.columns.size());
+for (org.apache.cassandra.avro.Column acol : 
acosc.super_column.columns)
+scolcols.add(avroToThrift(acol));
+cosc.setSuper_column(new SuperColumn(scolname, scolcols));
+}
+}
 return mutation;
 }
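
The order of the checks in the deletion branch above can be sketched in 
isolation. The types below (AvroDeletion, AvroPredicate, Scope) are stand-ins 
invented for this example, not the real org.apache.cassandra.avro or Thrift 
classes; only the order of the tests is meant to mirror the diff.

```java
import java.util.List;

final class DeletionScopeSketch {
    enum Scope { WHOLE_ROW, SUPER_COLUMN, COLUMN_NAMES, SLICE_RANGE }

    // Stand-in for the Avro predicate: either column names or a slice range is set.
    static final class AvroPredicate {
        final List<String> columnNames;
        final String sliceRange;
        AvroPredicate(List<String> columnNames, String sliceRange) {
            this.columnNames = columnNames;
            this.sliceRange = sliceRange;
        }
    }

    // Stand-in for the Avro deletion: both fields may be null.
    static final class AvroDeletion {
        final String superColumn;
        final AvroPredicate predicate;
        AvroDeletion(String superColumn, AvroPredicate predicate) {
            this.superColumn = superColumn;
            this.predicate = predicate;
        }
    }

    // Same order as the patch: whole row first, then super column, then the predicate kinds.
    static Scope scopeOf(AvroDeletion d) {
        if (d.predicate == null && d.superColumn == null)
            return Scope.WHOLE_ROW;      // nothing set: leave the Deletion alone
        else if (d.superColumn != null)
            return Scope.SUPER_COLUMN;
        else if (d.predicate.columnNames != null)
            return Scope.COLUMN_NAMES;
        else
            return Scope.SLICE_RANGE;
    }

    public static void main(String[] args) {
        System.out.println(scopeOf(new AvroDeletion(null, null)));
    }
}
```

Checking the whole-row case first matters because the later branches 
dereference the predicate, which is null for a whole-row delete.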
 




[jira] Resolved: (CASSANDRA-2014) Can't delete whole row from Hadoop MapReduce

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2014.
---

   Resolution: Fixed
Fix Version/s: (was: 0.7.2)
   0.7.3
 Reviewer: jbellis  (was: stuhood)
 Assignee: Patrik Modesto

committed, thanks!

> Can't delete whole row from Hadoop MapReduce
> 
>
> Key: CASSANDRA-2014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.7.0
> Environment: Debian Linux 2.6.32 amd64
>Reporter: Patrik Modesto
>Assignee: Patrik Modesto
> Fix For: 0.7.3
>
> Attachments: 2014-mr-delete-whole-row.patch
>
>





[jira] Reopened: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-2070:
---

  Assignee: (was: Pavel Yaskevich)

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Priority: Trivial
> Attachments: CASSANDRA-2070.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] Resolved: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2070.
---

   Resolution: Won't Fix
Fix Version/s: (was: 0.8)

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Priority: Trivial
> Attachments: CASSANDRA-2070.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] Commented: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995848#comment-12995848
 ] 

Jonathan Ellis commented on CASSANDRA-2070:
---

I didn't realize that the guava API doesn't give us access to the Entry age.  
Double-wrapping with another Entry is not worth it (this is a 
performance-critical section).  I opened 
http://code.google.com/p/guava-libraries/issues/detail?id=550 as an enhancement 
request; closing this as WontFix until/unless that gets addressed.
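
To illustrate what tracking entry age involves: when the underlying map does 
not expose it, each value has to be stored inside a wrapper that also carries 
its creation time, and that wrapper is the extra layer at issue. The sketch 
below uses only java.util.concurrent. AgedMap and its Entry class are names 
made up for this example; it is neither Cassandra's ExpiringMap nor Guava's 
MapMaker, and it omits expiration entirely.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AgedMap<K, V> {
    // Wrapper that remembers when the value was inserted.
    private static final class Entry<V> {
        final V value;
        final long createdAtMillis;
        Entry(V value, long createdAtMillis) {
            this.value = value;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final ConcurrentMap<K, Entry<V>> cache = new ConcurrentHashMap<>();

    // The caller passes the current time so the behaviour is easy to test.
    V put(K key, V value, long nowMillis) {
        Entry<V> old = cache.put(key, new Entry<>(value, nowMillis));
        return old == null ? null : old.value;
    }

    V get(K key) {
        Entry<V> e = cache.get(key);
        return e == null ? null : e.value;
    }

    // Age in milliseconds relative to nowMillis, or 0 if the key is absent.
    long getAge(K key, long nowMillis) {
        Entry<V> e = cache.get(key);
        return e == null ? 0 : nowMillis - e.createdAtMillis;
    }

    public static void main(String[] args) {
        AgedMap<String, String> map = new AgedMap<>();
        map.put("k", "v", System.currentTimeMillis());
        System.out.println(map.get("k"));
    }
}
```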

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Attachments: CASSANDRA-2070.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






svn commit: r1071645 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:30:08 2011
New Revision: 1071645

URL: http://svn.apache.org/viewvc?rev=1071645&view=rev
Log:
ExpiringMap cleanup
patch by Pavel Yaskevich

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java?rev=1071645&r1=1071644&r2=1071645&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/ExpiringMap.java
 Thu Feb 17 15:30:08 2011
@@ -117,35 +117,20 @@ public class ExpiringMap
 
 public V get(K key)
 {
-V result = null;
 CacheableObject co = cache.get(key);
-if (co != null)
-{
-result = co.getValue();
-}
-return result;
+return co == null ? null : co.getValue();
 }
 
 public V remove(K key)
 {
 CacheableObject co = cache.remove(key);
-V result = null;
-if (co != null)
-{
-result = co.getValue();
-}
-return result;
+return co == null ? null : co.getValue();
 }
 
 public long getAge(K key)
 {
-long age = 0;
 CacheableObject co = cache.get(key);
-if (co != null)
-{
-age = co.age;
-}
-return age;
+return co == null ? 0 : co.age;
 }
 
 public int size()




[jira] Commented: (CASSANDRA-2070) replace ExpiringMap with google collections MapMaker

2011-02-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995850#comment-12995850
 ] 

Pavel Yaskevich commented on CASSANDRA-2070:


Yeah, they have those two issues - no getAge and no evictionListener (fixed in 
version 8)...

> replace ExpiringMap with google collections MapMaker
> 
>
> Key: CASSANDRA-2070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2070
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Priority: Trivial
> Attachments: CASSANDRA-2070.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






svn commit: r1071646 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/hadoop/ src/java/org/apache/cassandra/utils/

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:33:51 2011
New Revision: 1071646

URL: http://svn.apache.org/viewvc?rev=1071646&view=rev
Log:
merge from 0.7

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordWriter.java
cassandra/trunk/src/java/org/apache/cassandra/utils/ExpiringMap.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:33:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7:1026516-1071411,1071482,1071485
+/cassandra/branches/cassandra-0.7:1026516-1071645
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1071646&r1=1071645&r2=1071646&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Thu Feb 17 15:33:51 2011
@@ -15,8 +15,10 @@
  * lower-latency read repair (CASSANDRA-2069)
  * add hinted_handoff_throttle_delay_in_ms option (CASSANDRA-2161)
  * fixes for cache save/load (CASSANDRA-2172, -2174)
+ * Handle whole-row deletions in CFOutputFormat (CASSANDRA-2014)
  * Make memtable_flush_writers flush in parallel (CASSANDRA-2178)
 
+
 0.7.2
  * copy DecoratedKey.key when inserting into caches to avoid retaining
a reference to the underlying buffer (CASSANDRA-2102)

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:33:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1071411,1071482,1071485
+/cassandra/branches/cassandra-0.7/contrib:1026516-1071645
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3/contrib:774578-796573

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:33:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1071411,1071482,1071485
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1071645
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:33:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1071411,1071482,1071485
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1071645
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interf

[jira] Commented: (CASSANDRA-2014) Can't delete whole row from Hadoop MapReduce

2011-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995852#comment-12995852
 ] 

Hudson commented on CASSANDRA-2014:
---

Integrated in Cassandra-0.7 #284 (See 
[https://hudson.apache.org/hudson/job/Cassandra-0.7/284/])
Handle whole-row deletions in CFOutputFormat
patch by Patrik Modesto; reviewed by jbellis for CASSANDRA-2014


> Can't delete whole row from Hadoop MapReduce
> 
>
> Key: CASSANDRA-2014
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2014
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 0.7.0
> Environment: Debian Linux 2.6.32 amd64
>Reporter: Patrik Modesto
>Assignee: Patrik Modesto
> Fix For: 0.7.3
>
> Attachments: 2014-mr-delete-whole-row.patch
>
>





svn commit: r1071649 - /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:40:42 2011
New Revision: 1071649

URL: http://svn.apache.org/viewvc?rev=1071649&view=rev
Log:
fix #1255 for counters

Modified:
cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java?rev=1071649&r1=1071648&r2=1071649&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java Thu 
Feb 17 15:40:42 2011
@@ -167,10 +167,12 @@ public class CounterMutation implements 
 // We need to transform all CounterUpdateColumn to CounterColumn and 
we need to deepCopy. Both are done 
 // below since CUC.asCounterColumn() does a deep copy.
 RowMutation rm = new RowMutation(rowMutation.getTable(), 
ByteBufferUtil.clone(rowMutation.key()));
+Table table = Table.open(rm.getTable());
 
 for (ColumnFamily cf_ : rowMutation.getColumnFamilies())
 {
 ColumnFamily cf = cf_.cloneMeShallow();
+ColumnFamilyStore cfs = table.getColumnFamilyStore(cf.id());
 for (IColumn column : cf_.getColumnsMap().values())
 {
 cf.addColumn(column.localCopy(null)); // TODO fix this




svn commit: r1071651 - /cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:41:16 2011
New Revision: 1071651

URL: http://svn.apache.org/viewvc?rev=1071651&view=rev
Log:
... really fix #1255 for counters

Modified:
cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java?rev=1071651&r1=1071650&r2=1071651&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/CounterMutation.java Thu 
Feb 17 15:41:16 2011
@@ -175,7 +175,7 @@ public class CounterMutation implements 
 ColumnFamilyStore cfs = table.getColumnFamilyStore(cf.id());
 for (IColumn column : cf_.getColumnsMap().values())
 {
-cf.addColumn(column.localCopy(null)); // TODO fix this
+cf.addColumn(column.localCopy(cfs));
 }
 rm.add(cf);
 }




svn commit: r1071659 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:54:27 2011
New Revision: 1071659

URL: http://svn.apache.org/viewvc?rev=1071659&view=rev
Log:
fix callback when repair request times out (see #2069)
patch by jbellis

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java?rev=1071659&r1=1071658&r2=1071659&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
 Thu Feb 17 15:54:27 2011
@@ -66,7 +66,7 @@ public class RepairCallback implement
 throw new AssertionError(ex);
 }
 
-return resolver.isDataPresent() ? resolver.resolve() : null;
+return resolver.getMessageCount() > 0 ? resolver.resolve() : null;
 }
 
 public void response(Message message)




svn commit: r1071661 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:54:59 2011
New Revision: 1071661

URL: http://svn.apache.org/viewvc?rev=1071661&view=rev
Log:
merge from 0.7

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/service/RepairCallback.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:54:59 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7:1026516-1071645
+/cassandra/branches/cassandra-0.7:1026516-1071660
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:54:59 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1071645
+/cassandra/branches/cassandra-0.7/contrib:1026516-1071660
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3/contrib:774578-796573

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:54:59 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1071645
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1071660
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:54:59 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1071645
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1071660
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/column_t.java:774578-792198

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Feb 17 15:54:59 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1071070
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1026516-1071645
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1026516-1071660
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3

[jira] Updated: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-1902:
--

Attachment: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 1902_v1.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> use getActiveKeys() to detect which SSTables being compacted are in the os 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/
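
The first step in the description, turning mincore() results into page offsets, could be sketched roughly as follows. This is a minimal illustration and not the attached patch: Java has no built-in mincore() binding, so the residency vector is simply passed in, and the class name PageResidency and the 4096-byte page size are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical helper; not part of Cassandra or of the attached patch.
final class PageResidency {
    // Assumed page size; a real implementation would query the OS.
    static final int PAGE_SIZE = 4096;

    /**
     * mincore() fills one byte per page and sets the least significant bit
     * when that page is resident. Returns the file offsets of resident pages.
     */
    static long[] pagesInPageCache(byte[] residency) {
        long[] offsets = new long[residency.length];
        int n = 0;
        for (int page = 0; page < residency.length; page++) {
            if ((residency[page] & 1) != 0) {
                offsets[n++] = (long) page * PAGE_SIZE;
            }
        }
        return Arrays.copyOf(offsets, n);
    }

    public static void main(String[] args) {
        byte[] vec = {1, 0, 0, 1}; // pages 0 and 3 are resident
        System.out.println(Arrays.toString(pagesInPageCache(vec))); // prints [0, 12288]
    }
}
```

Mapping these offsets back to row keys, as getActiveKeys() is meant to do, would additionally need the SSTable index and is not shown here.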

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1071663 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 15:56:10 2011
New Revision: 1071663

URL: http://svn.apache.org/viewvc?rev=1071663&view=rev
Log:
resolve is actually useless unless we have multiple replies
patch by jbellis

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java?rev=1071663&r1=1071662&r2=1071663&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/RepairCallback.java
 Thu Feb 17 15:56:10 2011
@@ -66,7 +66,7 @@ public class RepairCallback implement
 throw new AssertionError(ex);
 }
 
-return resolver.getMessageCount() > 0 ? resolver.resolve() : null;
+return resolver.getMessageCount() > 1 ? resolver.resolve() : null;
 }
 
 public void response(Message message)
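
The effect of the new threshold can be illustrated outside Cassandra. The sketch below is not the RepairCallback code: the RepairGuard class, the use of plain strings as replies, and the stand-in reconciliation rule are all assumptions; only the "more than one reply" condition is taken from the diff above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the guard; not Cassandra code.
final class RepairGuard {
    /** Returns a reconciled value only when there are at least two replies to compare. */
    static String resolveIfUseful(List<String> replies) {
        // With zero or one reply there is nothing to compare against.
        if (replies.size() <= 1) {
            return null;
        }
        // Stand-in for resolver.resolve(): pick the lexicographically greatest reply.
        String best = replies.get(0);
        for (String reply : replies) {
            if (reply.compareTo(best) > 0) {
                best = reply;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(resolveIfUseful(Arrays.asList("v1")));       // prints null
        System.out.println(resolveIfUseful(Arrays.asList("v1", "v2"))); // prints v2
    }
}
```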




[jira] Updated: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-1902:
--

Attachment: (was: 1902_v1.txt)

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> use getActiveKeys() to detect which SSTables being compacted are in the os 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995861#comment-12995861
 ] 

T Jake Luciani commented on CASSANDRA-1902:
---

Attached a new version with a configuration option to enable this and 
CASSANDRA-1470; it is disabled by default. To test, set 
enable_page_cache_migration: true in cassandra.yaml.

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> use getActiveKeys() to detect which SSTables being compacted are in the os 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Created: (CASSANDRA-2182) Cassandra doesn't startup on single core boxes.

2011-02-17 Thread T Jake Luciani (JIRA)
Cassandra doesn't startup on single core boxes.
---

 Key: CASSANDRA-2182
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2182
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 0.7.3


I happened to run cassandra in a VM and got the following error, caused by the 
single core:

ERROR 10:47:30,304 Exception encountered during startup.
java.lang.AssertionError: multi-threaded stages must have at least 2 threads
    at org.apache.cassandra.concurrent.StageManager.multiThreadedStage(StageManager.java:60)
    at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:53)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:303)
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:159)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
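
The assertion fires when a pool size derived from the core count comes out as 1. One way to avoid that, sketched here under assumptions, is to clamp the computed size to a minimum of 2. The StageThreads class and its method names are hypothetical, and this is not necessarily what the fix for this issue does.

```java
// Hypothetical sketch; not Cassandra's StageManager.
final class StageThreads {
    static final int MIN_MULTI_THREADED = 2;

    /** Derives a pool size from the core count, never going below the minimum. */
    static int poolSize(int availableProcessors) {
        return Math.max(MIN_MULTI_THREADED, availableProcessors);
    }

    /** Mirrors the check that fails in the trace above. */
    static void checkMultiThreaded(int threads) {
        if (threads < MIN_MULTI_THREADED) {
            throw new AssertionError("multi-threaded stages must have at least 2 threads");
        }
    }

    public static void main(String[] args) {
        int threads = poolSize(Runtime.getRuntime().availableProcessors());
        checkMultiThreaded(threads); // passes even on a single-core machine
        System.out.println("pool size: " + threads);
    }
}
```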





[jira] Updated: (CASSANDRA-2182) Cassandra doesn't startup on single core boxes.

2011-02-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-2182:
--

Attachment: 0001-CASSANDRA-2182-rr-pool-needs-at-least-two-threads.txt

> Cassandra doesn't startup on single core boxes.
> ---
>
> Key: CASSANDRA-2182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-2182-rr-pool-needs-at-least-two-threads.txt
>
>
> I happened to run cassandra in a VM and got the following error, caused by 
> the single core:
> ERROR 10:47:30,304 Exception encountered during startup.
> java.lang.AssertionError: multi-threaded stages must have at least 2 threads
>     at org.apache.cassandra.concurrent.StageManager.multiThreadedStage(StageManager.java:60)
>     at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:53)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:303)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:159)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
>     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)





[jira] Updated: (CASSANDRA-2182) Cassandra doesn't startup on single core boxes.

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2182:
--

Affects Version/s: (was: 0.7.1)
   0.7.3

+1 (this was from CASSANDRA-2069 which was committed after 0.7.2)

> Cassandra doesn't startup on single core boxes.
> ---
>
> Key: CASSANDRA-2182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.3
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-2182-rr-pool-needs-at-least-two-threads.txt
>
>
> I happened to run cassandra in a VM and got the following error, caused by 
> the single core:
> ERROR 10:47:30,304 Exception encountered during startup.
> java.lang.AssertionError: multi-threaded stages must have at least 2 threads
>     at org.apache.cassandra.concurrent.StageManager.multiThreadedStage(StageManager.java:60)
>     at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:53)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:303)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:159)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
>     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: (was: 0001-Use-uuid-instead-of-IP-for-counters.patch)

> Use UUID as node identifiers in counters instead of IP addresses 
> -
>
> Key: CASSANDRA-1938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 0.8
>
> Attachments: 0001-Use-uuid-instead-of-IP-for-counters.patch, 
> 0002-Merge-old-shard-locally.patch, 0003-Thrift-change-to-CfDef.patch, 
> 1938_discussion
>
>   Original Estimate: 56h
>  Remaining Estimate: 56h
>
> The use of IP addresses as node identifiers in the partition of a given
> counter is fragile. Changes of the node's IP addresses can result in data
> loss. This patch proposes to use UUIDs instead.
> NOTE: this breaks the on-disk file format (for counters)





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: 0002-Merge-old-shard-locally.patch
0001-Use-uuid-instead-of-IP-for-counters.patch

> Use UUID as node identifiers in counters instead of IP addresses 
> -
>
> Key: CASSANDRA-1938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 0.8
>
> Attachments: 0001-Use-uuid-instead-of-IP-for-counters.patch, 
> 0002-Merge-old-shard-locally.patch, 0003-Thrift-change-to-CfDef.patch, 
> 1938_discussion
>
>   Original Estimate: 56h
>  Remaining Estimate: 56h
>
> The use of IP addresses as node identifiers in the partition of a given
> counter is fragile. Changes of the node's IP addresses can result in data
> loss. This patch proposes to use UUIDs instead.
> NOTE: this breaks the on-disk file format (for counters)





[jira] Updated: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-02-17 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1938:


Attachment: (was: 0002-Merge-old-shard-locally.patch)

> Use UUID as node identifiers in counters instead of IP addresses 
> -
>
> Key: CASSANDRA-1938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
> Fix For: 0.8
>
> Attachments: 0001-Use-uuid-instead-of-IP-for-counters.patch, 
> 0002-Merge-old-shard-locally.patch, 0003-Thrift-change-to-CfDef.patch, 
> 1938_discussion
>
>   Original Estimate: 56h
>  Remaining Estimate: 56h
>
> The use of IP addresses as node identifiers in the partition of a given
> counter is fragile. Changes of the node's IP addresses can result in data
> loss. This patch proposes to use UUIDs instead.
> NOTE: this breaks the on-disk file format (for counters)





[jira] Resolved: (CASSANDRA-2182) Cassandra doesn't startup on single core boxes.

2011-02-17 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani resolved CASSANDRA-2182.
---

Resolution: Fixed

> Cassandra doesn't startup on single core boxes.
> ---
>
> Key: CASSANDRA-2182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.3
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-2182-rr-pool-needs-at-least-two-threads.txt
>
>
> I happened to run cassandra in a VM and got the following error, caused by 
> the single core:
> ERROR 10:47:30,304 Exception encountered during startup.
> java.lang.AssertionError: multi-threaded stages must have at least 2 threads
>     at org.apache.cassandra.concurrent.StageManager.multiThreadedStage(StageManager.java:60)
>     at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:53)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:303)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:159)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
>     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)





[jira] Created: (CASSANDRA-2183) memtable_flush_after_mins setting not working

2011-02-17 Thread Ching-cheng (JIRA)
memtable_flush_after_mins setting not working
-

 Key: CASSANDRA-2183
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2183
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1, 0.7.0
Reporter: Ching-cheng
Priority: Minor


We have observed that the memtable_flush_after_mins setting occasionally does 
not work. After some testing and code digging, we finally figured out what is 
going on.
The memtable_flush_after_mins setting won't work under certain conditions with 
the current implementation in Cassandra.

In org.apache.cassandra.db.Table, the scheduled flush task is set up by the 
following code during construction.

--
int minCheckMs = Integer.MAX_VALUE;

for (ColumnFamilyStore cfs : columnFamilyStores.values())
{
    minCheckMs = Math.min(minCheckMs, cfs.getMemtableFlushAfterMins() * 60 * 1000);
}

Runnable runnable = new Runnable()
{
    public void run()
    {
        for (ColumnFamilyStore cfs : columnFamilyStores.values())
        {
            cfs.forceFlushIfExpired();
        }
    }
};
flushTask = StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, minCheckMs, minCheckMs, TimeUnit.MILLISECONDS);
--

In our application, we first create a keyspace without any column family and 
only add the needed column families later, depending on requests.

However, when the keyspace is created without any column family, the above code 
actually schedules a fixed-delay flush check task with a delay of 
Integer.MAX_VALUE ms, since there is no column family yet.

Later, when you add a column family to this empty keyspace, the initCf() method 
in Table.java doesn't check whether the interval of the scheduled flush check 
task needs to be updated. To work around this, we need to restart Cassandra 
after a column family is added to the keyspace.

I would suggest adding logic to the initCf() method to recreate the scheduled 
flush check task if needed.
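
The arithmetic behind the bug can be reproduced in isolation. This sketch assumes a hypothetical FlushInterval class that takes the per-column-family memtable_flush_after_mins values as plain integers; it is not Cassandra code, only the same minimum computation as in the description above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical reproduction of the interval computation; not Cassandra code.
final class FlushInterval {
    /** Same loop as in the description: the minimum over all column families, in ms. */
    static int minCheckMs(List<Integer> flushAfterMins) {
        int minCheckMs = Integer.MAX_VALUE;
        for (int mins : flushAfterMins) {
            minCheckMs = Math.min(minCheckMs, mins * 60 * 1000);
        }
        return minCheckMs;
    }

    public static void main(String[] args) {
        // Empty keyspace: the loop body never runs, so the delay stays at MAX_VALUE.
        System.out.println(minCheckMs(Collections.<Integer>emptyList())); // prints 2147483647
        // Two column families with 60 and 5 minutes: the check runs every 5 minutes.
        System.out.println(minCheckMs(Arrays.asList(60, 5)));             // prints 300000
    }
}
```

A delay of Integer.MAX_VALUE ms is nearly 25 days, so in practice the check never runs for a keyspace that was empty when it was created.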






[jira] Updated: (CASSANDRA-2183) memtable_flush_after_mins setting not working

2011-02-17 Thread Ching-cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ching-cheng updated CASSANDRA-2183:
---

Description: 
We have observed that the memtable_flush_after_mins setting occasionally does 
not work. After some testing and code digging, we finally figured out what is 
going on.
The memtable_flush_after_mins setting won't work under certain conditions with 
the current implementation in Cassandra.

In org.apache.cassandra.db.Table, the scheduled flush task is set up by the 
following code during construction.

--
int minCheckMs = Integer.MAX_VALUE;

for (ColumnFamilyStore cfs : columnFamilyStores.values())
{
    minCheckMs = Math.min(minCheckMs, cfs.getMemtableFlushAfterMins() * 60 * 1000);
}

Runnable runnable = new Runnable()
{
    public void run()
    {
        for (ColumnFamilyStore cfs : columnFamilyStores.values())
        {
            cfs.forceFlushIfExpired();
        }
    }
};
flushTask = StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, minCheckMs, minCheckMs, TimeUnit.MILLISECONDS);
--

In our application, we first create a keyspace without any column family and 
only add the needed column families later, depending on requests.

However, when the keyspace is created without any column family, the above code 
actually schedules a fixed-delay flush check task with a delay of 
Integer.MAX_VALUE ms, since there is no column family yet.

Later, when you add a column family to this empty keyspace, the initCf() method 
in Table.java doesn't check whether the interval of the scheduled flush check 
task needs to be updated. To work around this, we need to restart Cassandra 
after a column family is added to the keyspace.

I would suggest adding logic to the initCf() method to recreate the scheduled 
flush check task if needed.


  was:
We have observed the behavior that memtable_flush_after_mins setting not 
working occasionally.   After some testing and code digging, we finally figured 
out what going on.
The memtable_flush_after_mins won't work on certain condition with current 
implementation in Cassandra.

In org.apache.cassandra.db.Table,  the scheduled flush task is setup by the 
following code during construction.

--
int minCheckMs = Integer.MAX_VALUE;
   
for (ColumnFamilyStore cfs : columnFamilyStores.values())  
{
minCheckMs = Math.min(minCheckMs, cfs.getMemtableFlushAfterMins() * 60 * 
1000);
}

Runnable runnable = new Runnable()
{
   public void run()
   {
   for (ColumnFamilyStore cfs : columnFamilyStores.values())
   {
   cfs.forceFlushIfExpired();
   }
   }
};
flushTask = StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, 
minCheckMs, minCheckMs, TimeUnit.MILLISECONDS);
--

Now for our application, we will create a keyspacewithout any columnfamily 
first.  And only add needed columnfamily later depends on request.

However, when keyspacegot created (without any columnfamily ), the above code 
will actually schedule a fixed delay flush check task with Integer.MAX_VALUE ms
since there is no columnfamily yet.

Later when you add columnfamily to this empty keyspace, the initCf() method in 
Table.java doesn't check whether the scheduled flush check task interval need
to be updated or not.   To fix this, we'd need to restart the Cassandra after 
columnfamily added into the keyspace. 

I would suggest that add additional logic in initCf() method to recreate a 
scheduled flush check task if needed.



> memtable_flush_after_mins setting not working
> -
>
> Key: CASSANDRA-2183
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2183
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0, 0.7.1
>Reporter: Ching-cheng
>Priority: Minor
>
> We have observed that the memtable_flush_after_mins setting occasionally does 
> not work. After some testing and code digging, we finally figured out what is 
> going on.
> The memtable_flush_after_mins setting won't work under certain conditions with 
> the current implementation in Cassandra.
> In org.apache.cassandra.db.Table, the scheduled flush task is set up by the 
> following code during construction.
> --
> int minCheckMs = Integer.MAX_VALUE;
>
> for (ColumnFamilyStore cf

[jira] Updated: (CASSANDRA-2183) memtable_flush_after_mins setting not working

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2183:
--

Attachment: 2183.txt

The original approach of checking barely more often than we think is necessary 
is overengineering the problem; even with 1000s of CFs, checking every 10s 
would have no effect whatsoever on performance.  Patch attached.
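
A fixed-interval check of the kind described here could look like the following. This is a sketch under assumptions, not the contents of 2183.txt: the FixedFlushCheck class, the Runnable standing in for the per-column-family forceFlushIfExpired() loop, and the parameterized period are all illustrative.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the attached patch.
final class FixedFlushCheck {
    // 10 seconds, as suggested in the comment above.
    static final long CHECK_PERIOD_MS = 10_000;

    /**
     * Schedules the check with a constant delay that does not depend on which
     * column families exist when the keyspace is created.
     */
    static ScheduledFuture<?> schedule(ScheduledExecutorService executor,
                                       Runnable check,
                                       long periodMs) {
        return executor.scheduleWithFixedDelay(check, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        // A short period is used here only so the demo finishes quickly;
        // a real deployment would pass CHECK_PERIOD_MS.
        schedule(executor, () -> System.out.println("checking for expired memtables"), 100);
        Thread.sleep(350);
        executor.shutdownNow();
    }
}
```

Because the period is constant, adding a column family to a previously empty keyspace needs no rescheduling.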

> memtable_flush_after_mins setting not working
> -
>
> Key: CASSANDRA-2183
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2183
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Ching-cheng
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2183.txt
>
>
> We have observed that the memtable_flush_after_mins setting occasionally does 
> not work. After some testing and code digging, we finally figured out what is 
> going on.
> The memtable_flush_after_mins setting won't work under certain conditions with 
> the current implementation in Cassandra.
> In org.apache.cassandra.db.Table, the scheduled flush task is set up by the 
> following code during construction.
> --
> int minCheckMs = Integer.MAX_VALUE;
>
> for (ColumnFamilyStore cfs : columnFamilyStores.values())
> {
>     minCheckMs = Math.min(minCheckMs, cfs.getMemtableFlushAfterMins() * 60 * 1000);
> }
> Runnable runnable = new Runnable()
> {
>     public void run()
>     {
>         for (ColumnFamilyStore cfs : columnFamilyStores.values())
>         {
>             cfs.forceFlushIfExpired();
>         }
>     }
> };
> flushTask = StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, minCheckMs, minCheckMs, TimeUnit.MILLISECONDS);
> --
> In our application, we first create a keyspace without any column family and 
> only add the needed column families later, depending on requests.
> However, when the keyspace is created without any column family, the above 
> code actually schedules a fixed-delay flush check task with a delay of 
> Integer.MAX_VALUE ms, since there is no column family yet.
> Later, when you add a column family to this empty keyspace, the initCf() 
> method in Table.java doesn't check whether the interval of the scheduled 
> flush check task needs to be updated. To work around this, we need to restart 
> Cassandra after a column family is added to the keyspace.
> I would suggest adding logic to the initCf() method to recreate the scheduled 
> flush check task if needed.





[jira] Updated: (CASSANDRA-2183) memtable_flush_after_mins setting not working

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2183:
--

 Reviewer: chenyy
Affects Version/s: (was: 0.7.1)
Fix Version/s: 0.7.3
 Assignee: Jonathan Ellis

> memtable_flush_after_mins setting not working
> -
>
> Key: CASSANDRA-2183
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2183
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Ching-cheng
>Assignee: Jonathan Ellis
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2183.txt
>
>
> We have observed that the memtable_flush_after_mins setting occasionally does 
> not work. After some testing and code digging, we finally figured out what is 
> going on.
> The memtable_flush_after_mins setting won't work under certain conditions with 
> the current implementation in Cassandra.
> In org.apache.cassandra.db.Table, the scheduled flush task is set up by the 
> following code during construction.
> --
> int minCheckMs = Integer.MAX_VALUE;
>
> for (ColumnFamilyStore cfs : columnFamilyStores.values())
> {
>     minCheckMs = Math.min(minCheckMs, cfs.getMemtableFlushAfterMins() * 60 * 1000);
> }
> Runnable runnable = new Runnable()
> {
>     public void run()
>     {
>         for (ColumnFamilyStore cfs : columnFamilyStores.values())
>         {
>             cfs.forceFlushIfExpired();
>         }
>     }
> };
> flushTask = StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, minCheckMs, minCheckMs, TimeUnit.MILLISECONDS);
> --
> In our application, we first create a keyspace without any column family and 
> only add the needed column families later, depending on requests.
> However, when the keyspace is created without any column family, the above 
> code actually schedules a fixed-delay flush check task with a delay of 
> Integer.MAX_VALUE ms, since there is no column family yet.
> Later, when you add a column family to this empty keyspace, the initCf() 
> method in Table.java doesn't check whether the interval of the scheduled 
> flush check task needs to be updated. To work around this, we need to restart 
> Cassandra after a column family is added to the keyspace.
> I would suggest adding logic to the initCf() method to recreate the scheduled 
> flush check task if needed.





[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995921#comment-12995921
 ] 

Jonathan Ellis commented on CASSANDRA-1902:
---

Can you give an overview of what the patch does?  Where and how you "track 
contiguous pages per key," why that is the right solution, what that has to do 
with MAX_BYTES_IN_PAGE_CACHE, and whatever else I don't know enough to ask for 
yet. :)

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Updated: (CASSANDRA-2047) Stress --keep-going should become --keep-trying

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2047:
---

  Reviewer: tjake
Remaining Estimate: 4h
 Original Estimate: 4h

> Stress --keep-going should become --keep-trying
> ---
>
> Key: CASSANDRA-2047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2047
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2047.patch
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> The --keep-going flag makes the stress tool drop messages that time out on 
> the floor.
> I think it's more realistic (esp for a stress tool) to keep trying till this 
> read/write succeeds.





[jira] Updated: (CASSANDRA-2047) Stress --keep-going should become --keep-trying

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2047:
---

Attachment: CASSANDRA-2047.patch

branch: cassandra-0.7.2 (latest commit 0026c6789dce4a1438329128eefe03c9fbc861da)

> Stress --keep-going should become --keep-trying
> ---
>
> Key: CASSANDRA-2047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2047
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2047.patch
>
>
> The --keep-going flag makes the stress tool drop messages that time out on 
> the floor.
> I think it's more realistic (esp for a stress tool) to keep trying till this 
> read/write succeeds.





[jira] Commented: (CASSANDRA-2182) Cassandra doesn't startup on single core boxes.

2011-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995936#comment-12995936
 ] 

Hudson commented on CASSANDRA-2182:
---

Integrated in Cassandra-0.7 #287 (See 
[https://hudson.apache.org/hudson/job/Cassandra-0.7/287/])
read repair stage requires a minimum of 2 threads
patch by tjake for CASSANDRA-2182


> Cassandra doesn't startup on single core boxes.
> ---
>
> Key: CASSANDRA-2182
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2182
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.3
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-2182-rr-pool-needs-at-least-two-threads.txt
>
>
> I happened to run cassandra in a VM and got the following error, caused by 
> the single core:
> ERROR 10:47:30,304 Exception encountered during startup.
> java.lang.AssertionError: multi-threaded stages must have at least 2 threads
>     at org.apache.cassandra.concurrent.StageManager.multiThreadedStage(StageManager.java:60)
>     at org.apache.cassandra.concurrent.StageManager.(StageManager.java:53)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:303)
>     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:159)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
>     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:316)
>     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
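
For illustration only: the commit message says the read repair stage requires a minimum of 2 threads, but the patch itself is not shown here. One way to satisfy the assertion on a single-core box is to clamp the pool size, as in this sketch. StageThreadsSketch and multiThreadedStageSize are hypothetical names, not Cassandra's.

```java
/** Hypothetical helper; the real fix lives in the attached patch, which is not reproduced here. */
class StageThreadsSketch
{
    /** Never size a multi-threaded stage below 2, even when only one core is reported. */
    static int multiThreadedStageSize(int availableProcessors)
    {
        return Math.max(2, availableProcessors);
    }
}
```

A caller would typically pass Runtime.getRuntime().availableProcessors(), which returns 1 inside a single-core VM.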





[jira] Updated: (CASSANDRA-1969) Use BB for row cache - To Improve GC performance.

2011-02-17 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-1969:
-

Attachment: 0004-Null-Check-and-duplicate-bb.txt

ahaa, hope git works!

> Use BB for row cache - To Improve GC performance.
> -
>
> Key: CASSANDRA-1969
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1969
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Linux and Mac
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
> Attachments: 0001-Config-1969.txt, 
> 0001-introduce-ICache-InstrumentingCache-IRowCacheProvider.txt, 
> 0002-Update_existing-1965.txt, 0002-implement-SerializingCache.txt, 
> 0002-implement-SerializingCacheV2.txt, 0003-New_Cache_Providers-1969.txt, 
> 0003-add-ICache.isCopying-method.txt, 0004-Null-Check-and-duplicate-bb.txt, 
> 0004-TestCase-1969.txt, BB_Cache-1945.png, JMX-Cache-1945.png, 
> Old_Cahce-1945.png, POC-0001-Config-1945.txt, 
> POC-0002-Update_existing-1945.txt, POC-0003-New_Cache_Providers-1945.txt
>
>
> Java's ByteBuffer.allocateDirect() allocates native memory outside the JVM 
> heap, which helps reduce GC pressure in the JVM when the cache is large.
> Some basic tests show around a 50% improvement over a normal Object cache.
> In addition, this patch gives users the option to choose between 
> ByteBuffer.allocateDirect() and storing everything in the heap.
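
A minimal sketch of the idea, assuming rows are already serialized to byte arrays. DirectBufferCacheSketch is a hypothetical class and not one of the attached cache providers; it only shows the on-heap versus off-heap choice using the standard java.nio.ByteBuffer API.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache that stores serialized rows either off-heap or on-heap. */
class DirectBufferCacheSketch
{
    private final Map<String, ByteBuffer> cache = new ConcurrentHashMap<>();
    private final boolean offHeap;

    DirectBufferCacheSketch(boolean offHeap)
    {
        this.offHeap = offHeap;
    }

    void put(String key, byte[] serializedRow)
    {
        // allocateDirect() reserves native memory that the garbage collector does not scan
        ByteBuffer buf = offHeap ? ByteBuffer.allocateDirect(serializedRow.length)
                                 : ByteBuffer.allocate(serializedRow.length);
        buf.put(serializedRow);
        buf.flip(); // switch from writing to reading
        cache.put(key, buf);
    }

    /** Returns a copy of the cached bytes, or null when the key is absent. */
    byte[] get(String key)
    {
        ByteBuffer buf = cache.get(key);
        if (buf == null)
            return null;
        // duplicate() gives this reader its own position and limit over the same memory
        ByteBuffer view = buf.duplicate();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    boolean isDirect(String key)
    {
        ByteBuffer buf = cache.get(key);
        return buf != null && buf.isDirect();
    }
}
```

The trade-off is that every read copies bytes back onto the heap, so the gain comes from keeping long-lived cached data out of the collector's view rather than from cheaper reads.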





[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995947#comment-12995947
 ] 

T Jake Luciani commented on CASSANDRA-1902:
---

bq. Can you give an overview of what the patch does?

At a high level, this is to lessen the effect of sstable compaction and cleanup 
on reads. It does this by letting compaction figure out which rows in an SSTable 
are in the OS page cache and making sure that the compacted rows remain in the 
page cache. Remember that post CASSANDRA-1470 we are not putting anything in the 
page cache during compaction. 

bq. Where and how you "track contiguous pages per key," why that is the right 
solution

This is the right solution because the worst case is the same as the current 
code today. It can really only help because it's just giving the OS hints; it's 
up to the OS to do with that info what it thinks is best.

The important piece is in CLibrary.getCachedPages(File file, int 
minContiguousPages)

This takes a file and mmaps it in 2G chunks, then uses the posix mincore() call 
to get a vector of which pages in the range are actually cached (for a totally 
unread file this is []). We use the starting offset + (pagecache_size * each 
mapped page) to return a vector of positions on disk. We use 
minContiguousPages to filter out the noise of cache fragments.
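
To make the filtering step concrete, here is a rough sketch in plain Java. It assumes the mincore() result has already been turned into a boolean per page; the native call itself is left out. CachedPagesSketch and cachedPositions are hypothetical names, and returning one position per page of every qualifying run is an assumption about what the "vector of positions" contains.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical post-processing of a mincore()-style residency vector. */
class CachedPagesSketch
{
    /**
     * @param resident           one flag per page in the mapped chunk, true if the page is cached
     * @param chunkOffset        file offset at which the mapped chunk starts
     * @param pageSize           size of one page in bytes
     * @param minContiguousPages runs of cached pages shorter than this are treated as noise
     * @return file offsets of the pages that belong to sufficiently long cached runs
     */
    static List<Long> cachedPositions(boolean[] resident, long chunkOffset, int pageSize, int minContiguousPages)
    {
        List<Long> positions = new ArrayList<>();
        int runStart = -1;
        // iterate one step past the end so a run touching the last page is closed too
        for (int i = 0; i <= resident.length; i++)
        {
            boolean cached = i < resident.length && resident[i];
            if (cached && runStart < 0)
            {
                runStart = i;
            }
            else if (!cached && runStart >= 0)
            {
                if (i - runStart >= minContiguousPages)
                    for (int p = runStart; p < i; p++)
                        positions.add(chunkOffset + (long) p * pageSize);
                runStart = -1;
            }
        }
        return positions;
    }
}
```

For example, with 4096-byte pages, a residency vector of cached, uncached, three cached, uncached, two cached, and a minimum run of 3, only the middle run survives and yields the offsets 8192, 12288 and 16384.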


Jump to SSTableScanner: here we use the file positions from getCachedPages to 
figure out if a given row is considered "active". If it is, we set the 
isInPageCache flag on the SSTableIdentityIterator.
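
The "active" check could look something like the sketch below. It assumes the cached positions are sorted page start offsets and that a row counts as active when any cached page overlaps its byte range; ActiveRowSketch and its method are hypothetical, not the patch's code.

```java
/** Hypothetical overlap test between a row's byte range and a sorted list of cached pages. */
class ActiveRowSketch
{
    /**
     * @param sortedPagePositions ascending file offsets of cached pages
     * @param pageSize            size of one page in bytes
     * @param rowStart            file offset of the first byte of the row (inclusive)
     * @param rowEnd              file offset just past the last byte of the row (exclusive)
     */
    static boolean isInPageCache(long[] sortedPagePositions, int pageSize, long rowStart, long rowEnd)
    {
        // binary search for the first cached page that ends after rowStart
        int lo = 0, hi = sortedPagePositions.length;
        while (lo < hi)
        {
            int mid = (lo + hi) >>> 1;
            if (sortedPagePositions[mid] + pageSize <= rowStart)
                lo = mid + 1;
            else
                hi = mid;
        }
        // that page overlaps the row only if it also starts before the row ends
        return lo < sortedPagePositions.length && sortedPagePositions[lo] < rowEnd;
    }
}
```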


Jump to CompactionManager: if any part of a row has been flagged as active, then 
when we write the new SSTable we make sure this row's data is not forced out of 
the page cache (the default action from CASSANDRA-1470).

The two variables we probably should expose here are: 

1. BRAF.MAX_BYTES_IN_PAGE_CACHE - this says how many bytes should I let the 
page cache buffer before I force a flush of the OS cache for this files working 
(this is currently set to 128mb which, based on my testing, is a nice default)

2. SSTableScanner's call to getCachedPages uses a minContiguousPages setting of 
32.  Again this is a nice default I've found.


By increasing (1) you pollute your page cache more but slightly increase your 
write performance.
By increasing (2) you migrate fewer and fewer rows during compaction.





> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995960#comment-12995960
 ] 

Jonathan Ellis commented on CASSANDRA-1902:
---

Instead of re-memmapping things, can we use the one we already have in 
SSTableReader.dfile (assuming mmap mode is turned on)?

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995963#comment-12995963
 ] 

T Jake Luciani commented on CASSANDRA-1902:
---

I tried that first but it wouldn't handle standard reading (which still works 
now).  

The cost isn't high at all; calling mincore is fast and doesn't actually read 
anything.



> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Created: (CASSANDRA-2184) Returning split length of 0 confuses Pig

2011-02-17 Thread Jonathan Ellis (JIRA)
Returning split length of 0 confuses Pig


 Key: CASSANDRA-2184
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2184
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 0.6
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.3


Matt Kennedy reports on the user list,

bq. There is a new feature in Pig 0.8 that will try to reduce the number of 
splits used to speed up the whole job.  Since the ColumnFamilyInputFormat lists 
the input size as zero, this feature eliminates all of the splits except for 
one. 
bq. The workaround is to disable this feature for jobs that use 
CassandraStorage by setting -Dpig.splitCombination=false in the pig_cassandra 
script.

bq. However, we wanted to keep splitCombination on because it is a useful 
optimization for a lot of our use cases, so I went digging for the least 
intrusive way to keep the split combiner on, but also prevent it from combining 
splits that read from Cassandra.  My solution, which you are welcome to 
critique, is to change line 65 of 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/hadoop/ColumnFamilySplit.java
 such that it returns Long.MAX_VALUE instead of zero.

I looked into actually returning the number of keys in the split, but the Hadoop 
javadoc says "Get the size of the split, so that the input splits can be sorted 
by size". Since our splits should be very close in size, it doesn't sound worth 
doing an extra round trip to the host servers to get super accurate numbers.  
Returning MAX_VALUE seems good enough.
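
To see why the reported length matters, here is a toy model of a split combiner. It is a deliberately simplified greedy grouping written for this note and is not Pig's actual algorithm; SplitCombinerSketch, combinedSplitCount and the 128 MB threshold used in the description below are all assumptions.

```java
/** Hypothetical greedy combiner: packs consecutive splits while the group stays under a size cap. */
class SplitCombinerSketch
{
    /** Returns how many groups remain after combining splits with the given reported lengths. */
    static int combinedSplitCount(long[] splitLengths, long maxCombinedSize)
    {
        int groups = 0;
        long current = 0;
        boolean open = false;
        for (long len : splitLengths)
        {
            // written as a subtraction so that Long.MAX_VALUE lengths cannot overflow
            if (open && len <= maxCombinedSize - current)
            {
                current += len;
            }
            else
            {
                groups++;
                current = len;
                open = true;
            }
        }
        return groups;
    }
}
```

In this model, four splits that each report a length of 0 collapse into a single group under a 128 MB cap, while four splits that each report Long.MAX_VALUE stay as four groups, which matches the behavior described in the report.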





[jira] Updated: (CASSANDRA-2137) stress.java should have a way to specify the replication strategy

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2137:
---

Remaining Estimate: 0.5h
 Original Estimate: 0.5h

> stress.java should have a way to specify the replication strategy
> -
>
> Key: CASSANDRA-2137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2137
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2137.patch
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> It would be nice to have stress.java be a one stop shop when you need to test 
> something other than SimpleStrategy.





[jira] Updated: (CASSANDRA-2137) stress.java should have a way to specify the replication strategy

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2137:
---

Attachment: CASSANDRA-2137.patch

branch cassandra-0.7.2 (latest commit 0026c6789dce4a1438329128eefe03c9fbc861da)

> stress.java should have a way to specify the replication strategy
> -
>
> Key: CASSANDRA-2137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2137
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2137.patch
>
>
> It would be nice to have stress.java be a one stop shop when you need to test 
> something other than SimpleStrategy.





[jira] Updated: (CASSANDRA-2124) JDBC driver for CQL

2011-02-17 Thread Vivek Mishra (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Mishra updated CASSANDRA-2124:


Attachment: cassandra_generic_decoder.patch

Draft patch with logic for column decoding as discussed.

This is not the final patch, as I still need to fix code style issues and add 
test cases for it.

> JDBC driver for CQL
> ---
>
> Key: CASSANDRA-2124
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2124
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API
>Reporter: Eric Evans
>Assignee: Vivek Mishra
>Priority: Minor
>  Labels: cql
> Attachments: Cassandra-2124_v1.0, cassandra-0.7.1-2124_v2.0, 
> cassandra-0.7.1-2124_v2.1, cassandra_generic_decoder.patch
>
>
> A simple connection class and corresponding pool was created for CQL as a 
> part of CASSANDRA-1710, but a JDBC driver (either in addition to, or as a 
> replacement for) would also be interesting.





[jira] Commented: (CASSANDRA-2137) stress.java should have a way to specify the replication strategy

2011-02-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995995#comment-12995995
 ] 

Brandon Williams commented on CASSANDRA-2137:
-

We also need to be able to pass strategy properties, or NTS isn't very useful.

> stress.java should have a way to specify the replication strategy
> -
>
> Key: CASSANDRA-2137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2137
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2137.patch
>
>   Original Estimate: 0.5h
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It would be nice to have stress.java be a one stop shop when you need to test 
> something other than SimpleStrategy.





[jira] Updated: (CASSANDRA-2175) make key cache preheating use less memory

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2175:
--

Attachment: 2175-0.6.txt

adds -Dcompaction_preheat_key_cache (default false) for 0.6.  Will merge to 0.7 
as a yaml property.
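
For reference, a -D flag that defaults to false can be read with the standard Boolean.getBoolean() call, as sketched below. How the attached 2175-0.6.txt actually reads the property is not shown in this message, so PreheatFlagSketch is only an assumed illustration.

```java
/** Hypothetical reader for the -Dcompaction_preheat_key_cache flag. */
class PreheatFlagSketch
{
    /** True only when the JVM was started with -Dcompaction_preheat_key_cache=true. */
    static boolean preheatKeyCache()
    {
        // Boolean.getBoolean returns false when the property is unset or not "true"
        return Boolean.getBoolean("compaction_preheat_key_cache");
    }
}
```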

> make key cache preheating use less memory
> -
>
> Key: CASSANDRA-2175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2175
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.6.9, 0.7.0 rc 3
>Reporter: Jonathan Ellis
>Priority: Minor
> Fix For: 0.6.12, 0.7.3
>
> Attachments: 2175-0.6.txt
>
>
> CASSANDRA-1878 pre-heats the key cache post-compaction so latency doesn't 
> suffer while warming the cache back up.  This can double the memory used 
> temporarily; for a large key cache, this can have a substantial impact.
> For now a boolean on/off is probably the best we can do.  With 
> http://code.google.com/p/concurrentlinkedhashmap/issues/detail?id=21 though, 
> we could say "preheat the hottest X keys."





[jira] Commented: (CASSANDRA-2175) make key cache preheating use less memory

2011-02-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996020#comment-12996020
 ] 

Brandon Williams commented on CASSANDRA-2175:
-

+1

> make key cache preheating use less memory
> -
>
> Key: CASSANDRA-2175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2175
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.6.9, 0.7.0 rc 3
>Reporter: Jonathan Ellis
>Priority: Minor
> Fix For: 0.6.12, 0.7.3
>
> Attachments: 2175-0.6.txt
>
>
> CASSANDRA-1878 pre-heats the key cache post-compaction so latency doesn't 
> suffer while warming the cache back up.  This can double the memory used 
> temporarily; for a large key cache, this can have a substantial impact.
> For now a boolean on/off is probably the best we can do.  With 
> http://code.google.com/p/concurrentlinkedhashmap/issues/detail?id=21 though, 
> we could say "preheat the hottest X keys."





[jira] Updated: (CASSANDRA-2137) stress.java should have a way to specify the replication strategy

2011-02-17 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-2137:
---

Attachment: CASSANDRA-2137-v2.patch

Options are:
-R (--replication-strategy)
-O (--strategy-properties) following format:
:,:

This is in the --help too.
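
The placeholders in the format line above did not survive in this archive. Assuming it describes comma-separated name:value pairs, parsing could look like the sketch below. StrategyPropertiesSketch is a hypothetical class, not the code in CASSANDRA-2137-v2.patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for an assumed "name:value,name:value" option string. */
class StrategyPropertiesSketch
{
    static Map<String, String> parse(String arg)
    {
        Map<String, String> props = new LinkedHashMap<>(); // keeps the order given on the command line
        if (arg == null || arg.trim().isEmpty())
            return props;
        for (String pair : arg.split(","))
        {
            String[] kv = pair.split(":", 2);
            if (kv.length != 2 || kv[0].trim().isEmpty())
                throw new IllegalArgumentException("expected name:value but got '" + pair + "'");
            props.put(kv[0].trim(), kv[1].trim());
        }
        return props;
    }
}
```

An argument such as DC1:2,DC2:1 (illustrative values only) would then map DC1 to 2 and DC2 to 1.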

> stress.java should have a way to specify the replication strategy
> -
>
> Key: CASSANDRA-2137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2137
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.2
>
> Attachments: CASSANDRA-2137-v2.patch, CASSANDRA-2137.patch
>
>   Original Estimate: 0.5h
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It would be nice to have stress.java be a one stop shop when you need to test 
> something other than SimpleStrategy.





[jira] Updated: (CASSANDRA-1278) Make bulk loading into Cassandra less crappy, more pluggable

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1278:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Make bulk loading into Cassandra less crappy, more pluggable
> 
>
> Key: CASSANDRA-1278
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1278
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jeremy Hanna
>Assignee: Matthew F. Dennis
> Fix For: 0.7.3
>
> Attachments: 1278-cassandra-0.7.1.txt, 1278-cassandra-0.7.txt
>
>   Original Estimate: 40h
>  Time Spent: 40h 40m
>  Remaining Estimate: 0h
>
> Currently bulk loading into Cassandra is a black art.  People are either 
> directed to just do it responsibly with thrift or a higher level client, or 
> they have to explore the contrib/bmt example - 
> http://wiki.apache.org/cassandra/BinaryMemtable  That contrib module requires 
> delving into the code to find out how it works and then applying it to the 
> given problem.  Using either method, the user also needs to keep in mind that 
> overloading the cluster is possible - which will hopefully be addressed in 
> CASSANDRA-685
> This improvement would be to create a contrib module or set of documents 
> dealing with bulk loading.  Perhaps it could include code in the Core to make 
> it more pluggable for external clients of different types.
> It is just that this is something that many that are new to Cassandra need to 
> do - bulk load their data into Cassandra.





[jira] Updated: (CASSANDRA-1472) Add bitmap secondary indexes

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1472:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add bitmap secondary indexes
> 
>
> Key: CASSANDRA-1472
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
> Fix For: 0.7.3
>
> Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 
> 0001-CASSANDRA-1472-rebased-to-0.7-branch.txt, 
> 0019-Rename-bugfixes-and-fileclose.txt, 1472-v3.tgz, 1472-v4.tgz, 
> 1472-v5.tgz, anatomy.png, v4-bench-c32.txt
>
>
> Bitmap indexes are a very efficient structure for dealing with immutable 
> data. We can take advantage of the fact that SSTables are immutable by 
> attaching them directly to SSTables as a new component (supported by 
> CASSANDRA-1471).





[jira] Updated: (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1337:


Fix Version/s: (was: 0.7.2)
   0.7.3

> parallelize fetching rows for low-cardinality indexes
> -
>
> Key: CASSANDRA-1337
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, we read the indexed rows from the first node (in partitioner 
> order); if that does not have enough matching rows, we read the rows from the 
> next, and so forth.
> We should use the statistics from CASSANDRA-1155 to query multiple nodes in 
> parallel, such that we have a high chance of getting enough rows w/o having 
> to do another round of queries (but, if our estimate is incorrect, we do need 
> to loop and do more rounds until we have enough data or we have fetched from 
> each node).
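
As a rough sketch of the sizing decision described above: given an estimate of matching rows per node, pick how many nodes to query in one round. ParallelIndexScanSketch and its parameters are hypothetical, and how the estimate would be derived from the CASSANDRA-1155 statistics is not modeled.

```java
/** Hypothetical helper that sizes one round of parallel index queries. */
class ParallelIndexScanSketch
{
    /**
     * @param rowsNeeded           rows still missing from the result
     * @param estimatedRowsPerNode expected matching rows on each node (assumed to be given)
     * @param nodesRemaining       nodes that have not been queried yet
     * @return number of nodes to query concurrently in the next round
     */
    static int nodesToQuery(int rowsNeeded, double estimatedRowsPerNode, int nodesRemaining)
    {
        if (rowsNeeded <= 0 || nodesRemaining <= 0)
            return 0;
        if (estimatedRowsPerNode <= 0)
            return nodesRemaining; // no usable estimate: ask everyone that is left
        int wanted = (int) Math.ceil(rowsNeeded / estimatedRowsPerNode);
        return Math.max(1, Math.min(wanted, nodesRemaining));
    }
}
```

A caller would loop: query that many nodes, subtract the rows actually returned and the nodes used, and call the helper again until enough rows have arrived or no nodes remain.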





[jira] Updated: (CASSANDRA-1143) Nodetool gives cryptic errors when given a nonexistent keyspace arg

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1143:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Nodetool gives cryptic errors when given a nonexistent keyspace arg
> ---
>
> Key: CASSANDRA-1143
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1143
> Project: Cassandra
>  Issue Type: Wish
>  Components: Tools
> Environment: Sun Java 1.6u20, Cassandra 0.6.2, CentOS 5.5.
>Reporter: Ian Soboroff
>Assignee: Joaquin Casares
>Priority: Trivial
> Fix For: 0.7.3
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I typoed the keyspace arg to 'nodetool repair', and got the following 
> exception:
> /usr/local/src/cassandra/bin/nodetool --host node4 repair DocDb
> Exception in thread "main" java.lang.RuntimeException: No replica strategy configured for DocDb
>     at org.apache.cassandra.service.StorageService.getReplicationStrategy(StorageService.java:246)
>     at org.apache.cassandra.service.StorageService.constructRangeToEndPointMap(StorageService.java:466)
>     at org.apache.cassandra.service.StorageService.getRangeToAddressMap(StorageService.java:452)
>     at org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:145)
>     at org.apache.cassandra.service.StorageService.forceTableRepair(StorageService.java:1075)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>     at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
>     at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>     at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
>     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
>     at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> It would be better to report that the keyspace doesn't exist, rather than the 
> keyspace doesn't have a replication strategy.
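
The requested behavior can be sketched as an up-front check on the keyspace argument. This is only an illustration: `KeyspaceArgCheck` and its `validate` method are hypothetical names, not part of nodetool or StorageService, and the set of known keyspaces is assumed to be available to the caller.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not Cassandra code: reject an unknown keyspace name
// before any repair work (and any replication-strategy lookup) is attempted.
class KeyspaceArgCheck
{
    static String validate(String keyspace, Set<String> knownKeyspaces)
    {
        if (!knownKeyspaces.contains(keyspace))
        {
            // Sort the names so the message is stable and easy to scan.
            throw new IllegalArgumentException(
                "Keyspace '" + keyspace + "' does not exist; known keyspaces: "
                + new TreeSet<String>(knownKeyspaces));
        }
        return keyspace;
    }
}
```

With a check like this, a mistyped name such as DocDb fails with a message about the missing keyspace instead of a stack trace about replication strategies.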

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (CASSANDRA-1379) Uncached row reads may block cached reads

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1379:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Uncached row reads may block cached reads
> -
>
> Key: CASSANDRA-1379
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1379
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: David King
>Assignee: Javier Canillas
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: CASSANDRA-1379.patch
>
>
> The cap on the number of concurrent reads appears to cap the *total* number 
> of concurrent reads instead of just capping the reads that are bound for 
> disk. That is, given N concurrent readers if all of them are busy waiting on 
> disk, even reads that can be served from the row cache will block waiting for 
> them.
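
A minimal sketch of the behavior the reporter expects, assuming a hypothetical `CacheFirstReader` class in which a plain map stands in for the on-disk data. It is not the attached patch; it only illustrates capping disk-bound reads while letting row-cache hits bypass the cap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical sketch, not Cassandra code: only reads that must go to disk
// take a permit, so row-cache hits never queue behind disk-bound readers.
class CacheFirstReader
{
    private final Map<String, String> rowCache = new ConcurrentHashMap<String, String>();
    private final Map<String, String> disk; // stand-in for the on-disk data
    final Semaphore diskPermits;            // caps concurrent disk reads only

    CacheFirstReader(Map<String, String> disk, int maxConcurrentDiskReads)
    {
        this.disk = disk;
        this.diskPermits = new Semaphore(maxConcurrentDiskReads);
    }

    String read(String key)
    {
        String cached = rowCache.get(key);
        if (cached != null)
            return cached; // cache hit: no permit needed

        diskPermits.acquireUninterruptibly();
        try
        {
            String row = disk.get(key);
            if (row != null)
                rowCache.put(key, row);
            return row;
        }
        finally
        {
            diskPermits.release();
        }
    }
}
```

The design choice here is simply where the permit is acquired: after the cache lookup rather than before it.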





[jira] Updated: (CASSANDRA-1537) Add option (on CF) to remove expired column on minor compactions

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1537:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add option (on CF) to remove expired column on minor compactions
> 
>
> Key: CASSANDRA-1537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1537
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 0.7.3
>
>
> In some use cases, you can safely remove the tombstones of an expired column.
> In theory, this is true whenever you know that you will never update a 
> column using a TTL strictly less than that of the old column.
> This will be the case, for instance, if you always use the same TTL on all the 
> columns of a CF (say you use the CF for a long-term persistent cache).
> I propose adding an option (per CF) that says 'always remove tombstones of 
> expired columns for that CF'.





[jira] Updated: (CASSANDRA-1498) Add a simple MapReduce system test

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1498:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add a simple MapReduce system test
> --
>
> Key: CASSANDRA-1498
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1498
> Project: Cassandra
>  Issue Type: Test
>  Components: Hadoop, Tools
>Reporter: Jeremy Hanna
>Assignee: Jeremy Hanna
> Fix For: 0.7.3
>
>
> We don't have a good way to regression test MapReduce functionality.  This 
> ticket entails making a simple system test to do that (in python).





[jira] Updated: (CASSANDRA-1740) Nodetool commands to query and stop compaction, repair, and cleanup

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1740:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Nodetool commands to query and stop compaction, repair, and cleanup
> ---
>
> Key: CASSANDRA-1740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1740
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Chip Salzenberg
>Assignee: Jon Hermes
>Priority: Minor
> Fix For: 0.7.3
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The only way to stop compaction, repair, or cleanup in progress is to stop 
> and restart the entire Cassandra server.  Please provide nodetool commands to 
> query whether such things are running, and stop them if they are.





[jira] Updated: (CASSANDRA-1882) rate limit all background I/O

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1882:


Fix Version/s: (was: 0.7.2)
   0.7.3

> rate limit all background I/O
> -
>
> Key: CASSANDRA-1882
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1882
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Peter Schuller
>Assignee: Peter Schuller
>Priority: Minor
> Fix For: 0.7.3
>
>
> There is a clear need to support rate limiting of all background I/O (e.g., 
> compaction, repair). In some cases background I/O is naturally rate limited 
> as a result of being CPU bottlenecked, but in all cases where the CPU is not 
> the bottleneck, background streaming I/O is almost guaranteed (barring a very 
> very smart RAID controller or I/O subsystem that happens to cater extremely 
> well to the use case) to be detrimental to the latency and throughput of 
> regular live traffic (reads).
> Ways in which live traffic is negatively affected by background I/O include:
> * Indirectly by page cache eviction (see e.g. CASSANDRA-1470).
> * Reads are directly detrimental when not otherwise limited for the usual 
> reasons; large continuing read requests that keep coming are battling with 
> latency sensitive live traffic (mostly seek bound). Mixing seek-bound latency 
> critical with bulk streaming is a classic no-no for I/O scheduling.
> * Writes are directly detrimental in a similar fashion.
> * But in particular, writes are more difficult still: Caching effects tend to 
> augment the effects because lacking any kind of fsync() or direct I/O, the 
> operating system and/or RAID controller tends to defer writes when possible. 
> This often leads to a very sudden throttling of the application when caches 
> are filled, at which point there is potentially a huge backlog of data to 
> write.
> ** This may evict a lot of data from page cache since dirty buffers cannot be 
> evicted prior to being flushed out (though CASSANDRA-1470 and related will 
> hopefully help here).
> ** In particular, one major reason why battery-backed RAID controllers are 
> great is that they have the capability to "eat" storms of writes very quickly 
> and schedule them pretty efficiently with respect to a concurrent continuous 
> stream of reads. But this ability is defeated if we just throw data at it 
> until entirely full. Instead a rate-limited approach means that data can be 
> thrown at said RAID controller at a reasonable pace and it can be allowed to 
> do its job of limiting the impact of those writes on reads.
> I propose a mechanism whereby all such background reads are rate limited in 
> terms of MB/sec throughput. There would be:
> * A configuration option to state the target rate (probably a global, until 
> there is support for per-cf sstable placement)
> * A configuration option to state the sampling granularity. The granularity 
> would have to be small enough for rate limiting to be effective (i.e., the 
> amount of I/O generated in between each sample must be reasonably small) 
> while large enough to not be expensive (neither in terms of gettimeofday() 
> type overhead, nor in terms of causing smaller writes so that would-be 
> streaming operations become seek bound). There would likely be a recommended 
> value on the order of say 5 MB, with a recommendation to multiply that with 
> the number of disks in the underlying device (5 MB assumes classic mechanical 
> disks).
> Because of coarse granularity (= infrequent synchronization), there should 
> not be a significant overhead associated with maintaining a shared global 
> rate limiter for the Cassandra instance.
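
The proposal above (a target rate plus a sampling granularity) can be sketched roughly as follows. `BackgroundIoThrottle` and its method names are hypothetical, and the clock is passed in explicitly so the arithmetic is easy to follow; a real implementation would read the clock itself and sleep for the returned delay.

```java
// Hypothetical sketch, not Cassandra code: a shared throttle that only does
// clock arithmetic once per "granularity" bytes of background I/O.
class BackgroundIoThrottle
{
    private final long bytesPerSecond;   // target rate
    private final long granularityBytes; // sampling granularity, e.g. 5 MB
    private long bytesSinceSample = 0;
    private long sampleStartNanos;

    BackgroundIoThrottle(long bytesPerSecond, long granularityBytes, long nowNanos)
    {
        this.bytesPerSecond = bytesPerSecond;
        this.granularityBytes = granularityBytes;
        this.sampleStartNanos = nowNanos;
    }

    // Records `bytes` of I/O and returns how many nanoseconds the caller
    // should pause to stay at or below the target rate (0 if none).
    synchronized long record(long bytes, long nowNanos)
    {
        bytesSinceSample += bytes;
        if (bytesSinceSample < granularityBytes)
            return 0; // below the granularity: just count

        // Time this much I/O is allowed to take at the target rate.
        // (Fine for sample sizes in the MB range; very large values could overflow.)
        long targetNanos = bytesSinceSample * 1_000_000_000L / bytesPerSecond;
        long elapsedNanos = nowNanos - sampleStartNanos;
        long delayNanos = Math.max(0, targetNanos - elapsedNanos);

        // Start the next sample after the pause the caller is about to take.
        bytesSinceSample = 0;
        sampleStartNanos = nowNanos + delayNanos;
        return delayNanos;
    }
}
```

For instance, at a 10 MB/s target with 5 MB granularity, 5 MB written in the first 100 ms yields a 400 ms pause, because 5 MB is allowed to take 500 ms in total.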





[jira] Updated: (CASSANDRA-1673) Add a way to force remove tombstones before GCGraceSeconds

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1673:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add a way to force remove tombstones before GCGraceSeconds
> --
>
> Key: CASSANDRA-1673
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1673
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: T Jake Luciani
>Assignee: Joaquin Casares
>Priority: Minor
> Fix For: 0.7.3
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> In some circumstances it might be useful to be able to force delete 
> tombstones before GCGraceSeconds has elapsed.
> For example, if you know your cluster is consistent and you want to free up space.





[jira] Updated: (CASSANDRA-1585) Schema rename with compaction race

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1585:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Schema rename with compaction race
> --
>
> Key: CASSANDRA-1585
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1585
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Stu Hood
>Assignee: Gary Dusbabek
>Priority: Critical
> Fix For: 0.7.3
>
>
> We observed what appeared to be a race between an ongoing compaction and a 
> system_drop_cf call. The destination SSTable of the compaction was not 
> removed by the drop, so it remained in the data directory. Recreating a CF of 
> the same name caused the SSTable to become active again (or to at least show 
> in the gossiped load).





[jira] Updated: (CASSANDRA-1559) make {SuperColumn,ColumnFamily}.addColumn() correct in the face of concurrent removals

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1559:


Fix Version/s: (was: 0.7.2)
   0.7.3

> make {SuperColumn,ColumnFamily}.addColumn() correct in the face of concurrent 
> removals
> --
>
> Key: CASSANDRA-1559
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1559
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7 beta 1
>Reporter: Peter Schuller
>Priority: Trivial
> Fix For: 0.7.3
>
> Attachments: trunk-1559.txt
>
>
> As reported on the mailing list, the particular lock-less spinning used 
> during column addition is not correct for the case of concurrent removals. As 
> also agreed, concurrent removals do not actually happen.
> Attaching a patch which, as jbellis suggests as an alternative to documenting 
> the restriction on removals not being supported concurrently, makes 
> addColumn() handle that case.





[jira] Updated: (CASSANDRA-1877) sstable2json fails to skip corrupt rows

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1877:


Fix Version/s: (was: 0.7.2)
   0.7.3

> sstable2json fails to skip corrupt rows
> ---
>
> Key: CASSANDRA-1877
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1877
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Nick Bailey
>Assignee: Nick Bailey
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 0001-Attempt-to-skip-corruption-in-sstable-export.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> sstable2json only catches IOException when attempting to skip corrupt rows. 
> It should probably just catch Throwable.
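
A rough sketch of the suggested change, assuming a hypothetical `SkipCorruptRows` class in which each row read is modeled as a `Callable`. It is not the attached patch. Note that catching Throwable also swallows errors such as OutOfMemoryError, which may or may not be acceptable for an export tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch, not sstable2json code: keep exporting when a row
// fails for any reason, not just with an IOException.
class SkipCorruptRows
{
    static List<String> export(List<Callable<String>> rowReaders, List<Integer> skipped)
    {
        List<String> exported = new ArrayList<String>();
        for (int i = 0; i < rowReaders.size(); i++)
        {
            try
            {
                exported.add(rowReaders.get(i).call());
            }
            catch (Throwable t)
            {
                // Corruption can surface as runtime exceptions or errors,
                // so record the row index and move on to the next row.
                skipped.add(i);
            }
        }
        return exported;
    }
}
```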





[jira] Updated: (CASSANDRA-1761) Indexes: Auto-generating the CFname may collide with user-generated names

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1761:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Indexes: Auto-generating the CFname may collide with user-generated names
> -
>
> Key: CASSANDRA-1761
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1761
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7 beta 3
>Reporter: Jon Hermes
>Assignee: Jon Hermes
>Priority: Minor
> Fix For: 0.7.3
>
>   Original Estimate: 16h
>  Remaining Estimate: 16h
>
> {noformat}column_families:
>   - name: CF
>     comparator: BytesType
>     column_metadata: 
>       - name: foo
>         index_name: 626172
>         index_type: KEYS
>       - name: bar
>         index_type: KEYS{noformat}
> Auto-generated versus user-supplied names collide in the YAML above. The code:
> {code}cfname = parentCf + "." + (info.getIndexName() == null ? 
> FBUtilities.bytesToHex(info.name) : info.getIndexName()){code}
> From the first ColumnDefinition, we create cfname = "CF.626172" (from the 
> fail clause of the ternary, the user-supplied name).
> From the second ColumnDefinition, we create cfname = "CF.626172" (from the 
> pass clause of the ternary, where we generate the name: "bar" is 626172 in hex).
> A collision like this is possible, but it is fairly unlikely that someone 
> will do this.
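
The collision can be reproduced in isolation. In this sketch `IndexNameCollision` is a hypothetical class, and its `bytesToHex` is a local stand-in for FBUtilities.bytesToHex rather than the real method; the ternary mirrors the one quoted above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical reproduction, not Cassandra code.
class IndexNameCollision
{
    // Local stand-in for FBUtilities.bytesToHex: two lowercase hex digits per byte.
    static String bytesToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes)
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    // Same shape as the ternary in the ticket: generate a name only when the
    // user did not supply one.
    static String indexCfName(String parentCf, byte[] columnName, String indexName)
    {
        return parentCf + "." + (indexName == null ? bytesToHex(columnName) : indexName);
    }

    public static void main(String[] args)
    {
        byte[] foo = "foo".getBytes(StandardCharsets.UTF_8);
        byte[] bar = "bar".getBytes(StandardCharsets.UTF_8);
        System.out.println(indexCfName("CF", foo, "626172")); // user-supplied name
        System.out.println(indexCfName("CF", bar, null));     // generated name
    }
}
```

Both calls print CF.626172, since the bytes of "bar" are 0x62, 0x61, 0x72.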





[jira] Updated: (CASSANDRA-1828) Create a pig storefunc

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1828:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Create a pig storefunc
> --
>
> Key: CASSANDRA-1828
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1828
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Contrib, Hadoop
>Affects Versions: 0.7 beta 1
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 0001-add-storage-ability-to-pig-CassandraStorage.txt, 
> 0002-Fix-build-bin-script.txt
>
>   Original Estimate: 32h
>  Remaining Estimate: 32h
>
> Now that we have a ColumnFamilyOutputFormat, we can write data back to 
> cassandra in mapreduce jobs, however we can only do this in java.  It would 
> be nice if pig could also output to cassandra.





[jira] Updated: (CASSANDRA-1902) Migrate cached pages during compaction

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1902:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Migrate cached pages during compaction 
> ---
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>  Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/





[jira] Updated: (CASSANDRA-1956) Convert row cache to row+filter cache

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1956:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Convert row cache to row+filter cache
> -
>
> Key: CASSANDRA-1956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1956
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Stu Hood
>Assignee: Daniel Doubleday
> Fix For: 0.7.3
>
> Attachments: 0001-row-cache-filter.patch
>
>
> Changing the row cache to a row+filter cache would make it much more useful. 
> We currently have to warn against using the row cache with wide rows, where 
> the read pattern is typically a peek at the head, but this use case would be 
> perfectly supported by a cache that stored only columns matching the filter.
> Possible implementations:
> * (copout) Cache a single filter per row, and leave the cache key as is
> * Cache a list of filters per row, leaving the cache key as is: this is 
> likely to have some gotchas for weird usage patterns, and it requires the 
> list overhead
> * Change the cache key to "rowkey+filterid": basically ideal, but you need a 
> secondary index to lookup cache entries by rowkey so that you can keep them 
> in sync with the memtable
> * others?
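
The third option ("rowkey+filterid" plus a secondary index by rowkey) might look roughly like this. `RowFilterCache` is a hypothetical class, strings stand in for row keys, filter ids and cached columns, and whole-row invalidation is assumed as the way to stay in sync with the memtable; the ticket does not prescribe any of these details.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Cassandra code: entries are keyed by
// (rowKey, filterId), with a secondary index from rowKey to its filter ids.
class RowFilterCache
{
    private final Map<Map.Entry<String, String>, String> entries = new HashMap<>();
    private final Map<String, Set<String>> filterIdsByRow = new HashMap<>();

    synchronized void put(String rowKey, String filterId, String columns)
    {
        entries.put(new SimpleImmutableEntry<>(rowKey, filterId), columns);
        filterIdsByRow.computeIfAbsent(rowKey, k -> new HashSet<>()).add(filterId);
    }

    synchronized String get(String rowKey, String filterId)
    {
        return entries.get(new SimpleImmutableEntry<>(rowKey, filterId));
    }

    // Drops every cached filter result for a row, e.g. when the row is
    // written to; returns the number of entries removed.
    synchronized int invalidateRow(String rowKey)
    {
        Set<String> filterIds = filterIdsByRow.remove(rowKey);
        if (filterIds == null)
            return 0;
        for (String filterId : filterIds)
            entries.remove(new SimpleImmutableEntry<>(rowKey, filterId));
        return filterIds.size();
    }
}
```

Without the secondary index, invalidating a row would require scanning every cache entry, which is the cost the bullet above alludes to.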





[jira] Updated: (CASSANDRA-1955) memtable flushing can block writes due to queue size limitations even though overall write throughput is below capacity

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1955:


Fix Version/s: (was: 0.7.2)
   0.7.3

> memtable flushing can block writes due to queue size limitations even though 
> overall write throughput is below capacity
> ---
>
> Key: CASSANDRA-1955
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1955
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Peter Schuller
>Assignee: Peter Schuller
> Fix For: 0.7.3
>
>
> It seems that someone ran into this (see cassandra-user thread "Question re: 
> the use of multiple ColumnFamilies").
> If my interpretation is correct, the queue size is set to the concurrency in 
> the case of the flushSorter, and set to memtable_flush_writers in the case of 
> flushWriter in ColumnFamilyStore.
> While the choice of concurrency for the two executors makes perfect sense, 
> the queue sizing does not. As a user, I would expect, and did expect, that 
> for a given memtable independently tuned (w.r.t. flushing thresholds etc), 
> writes to the CF would not block until there is at least one other memtable 
> *for that CF* waiting to be flushed.
> With the current behavior, if I am not misinterpreting, whether or not writes 
> will inappropriately block is very much dependent on not just the overall 
> write throughput, but also the incidental timing of memtable flushes across 
> multiple column families.
> The simplest way to mitigate (but not fix) this is probably to set the queue 
> size to be equal to the number of column families if that is higher than the 
> number of CPU cores. But that is only a mitigation because nothing prevents 
> e.g. a large number of memtable flushes for a small column family under 
> temporary write load from blocking a large (possibly more important) 
> memtable flush for another CF. Such a shared-but-larger queue would also not 
> prevent heap usage spikes resulting from a single CF with very large 
> memtable thresholds being rapidly written to, with a queue sized for lots of 
> CFs that are in practice not used. In other words, this mitigation technique 
> would effectively negate the backpressure mechanism in some cases and likely 
> lead to more people having OOM issues when saturating a CF with writes.
> A more involved change is to make each CF have its own queue through which 
> flushes go prior to being submitted to flushSorter, which would guarantee 
> that at least one memtable can always be in pending flush state for a given 
> CF. The global queue could effectively have size 1 hard-coded since the queue 
> is no longer really used as if it were a queue.
> The flushWriter is unaffected since it is a separate concern that is supposed 
> to be I/O bound. The current behavior would not be perfect if there is a huge 
> discrepancy between memtable flush thresholds of different memtables, but it 
> does not seem high priority to make a change here in practice.
> So, I propose either:
> (a) changing the flushSorter queue size to be max(num cores, num cfs)
> (b) creating a per-cf queue
> I'll volunteer to work on it as a nice bite sized change, assuming there is 
> agreement on what needs to be done. Given the concerns with (a), I think (b) 
> is the right solution unless it turns out to cause major complexity. Worth 
> noting is that these are not performance sensitive given the low frequency of 
> memtable flushes, so an extra queueing step should not be an issue.
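
Option (b) can be sketched as one pending-flush slot per CF. `PerCfFlushSlots` and its methods are hypothetical names, and the sketch deliberately leaves out the hand-off to flushSorter; it only shows that a full slot for one CF does not affect another.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Semaphore;

// Hypothetical sketch, not Cassandra code: each CF gets its own slot for one
// pending memtable flush, independent of what other CFs are doing.
class PerCfFlushSlots
{
    private final ConcurrentMap<String, Semaphore> slots = new ConcurrentHashMap<String, Semaphore>();

    private Semaphore slotFor(String cf)
    {
        Semaphore slot = slots.get(cf);
        if (slot == null)
        {
            Semaphore created = new Semaphore(1);
            slot = slots.putIfAbsent(cf, created);
            if (slot == null)
                slot = created; // we won the race to create the slot
        }
        return slot;
    }

    // Returns true if this CF had a free pending-flush slot. A writer would
    // block (or retry) only when its own CF already has a flush pending.
    boolean tryBeginFlush(String cf)
    {
        return slotFor(cf).tryAcquire();
    }

    // Must be called exactly once for each successful tryBeginFlush.
    void flushDone(String cf)
    {
        slotFor(cf).release();
    }
}
```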





[jira] Updated: (CASSANDRA-1996) Add to the hadoop integration contrib stuff - more complicated set of text

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1996:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add to the hadoop integration contrib stuff - more complicated set of text
> --
>
> Key: CASSANDRA-1996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1996
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib, Hadoop
>Reporter: Jeremy Hanna
>Assignee: Jeremy Hanna
>Priority: Trivial
> Fix For: 0.7.3
>
>
> It would be nice to have a more complicated set of text to do the word count 
> over - perhaps a more interesting example too.  Based on CASSANDRA-1993, it 
> would be nice for sanity checking purposes as well.  This could be for the 
> word count example and perhaps have an easier way to test the pig and hadoop 
> output streaming examples as well.  I've noticed that I forget and people in 
> the channel don't get through the README for the Pig example - maybe there 
> are just too many steps and we could do better with the defaults.  Anyway, 
> other improvements could go in this ticket as well.





[jira] Updated: (CASSANDRA-2003) get_range_slices test

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2003:


Fix Version/s: (was: 0.7.2)
   0.7.3

> get_range_slices test
> -
>
> Key: CASSANDRA-2003
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2003
> Project: Cassandra
>  Issue Type: Test
>  Components: Core
> Environment: RandomPartitioner
>Reporter: Kelvin Kakugawa
>Assignee: Kelvin Kakugawa
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 0002-Assert-that-we-don-t-double-count-any-keys.txt, 
> CASSANDRA-2003-0.7-0001.patch, CASSANDRA-2003-0001.patch
>
>
> Test get_range_slices (on an RP cluster) to walk:
> * all keys on each node
> * all keys across cluster





[jira] Updated: (CASSANDRA-2008) CLI help incorrect in places

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2008:


Fix Version/s: (was: 0.7.2)
   0.7.3

> CLI help incorrect in places
> 
>
> Key: CASSANDRA-2008
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2008
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Aaron Morton
>Assignee: Aaron Morton
>Priority: Trivial
> Fix For: 0.7.3
>
>
> Found some errors in the CLI help, such as these for create column family.
> - memtable_operations: Flush memtables after this many operations
> - memtable_throughput: ... or after this many bytes have been written
> - memtable_flush_after: ... or after this many seconds
> These should be millions of operations, MB written, and minutes rather than 
> seconds.  I have confirmed that's how the values are used. I will check all the help. 





[jira] Updated: (CASSANDRA-2019) add interface to DatabaseDescriptor to help setting seeds and tokens at boot time (for EC2 feature)

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2019:


Fix Version/s: (was: 0.7.2)
   0.7.3

> add interface to DatabaseDescriptor to help setting seeds and tokens at boot 
> time (for EC2 feature)
> ---
>
> Key: CASSANDRA-2019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2019
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Yang Yang
> Fix For: 0.7.3
>
> Attachments: DBdscrpt.diff
>
>
> 1)
> In the Amazon EC2 environment, machines die off more frequently and new 
> instances are brought up more often, so even if we list all the nodes in 
> the ring as seeds, all of these seeds may gradually have died off after some 
> time. At that point, new nodes joining won't be able to find the existing 
> ring, and will establish a new, separate ring of the same name, which is 
> wrong. 
> We have written some custom code that uses external systems to figure out 
> the EC2 autoscaler group, and at boot time feeds this info to the new node. 
> But this requires the new node to be able to modify the seeds at boot time. 
> Right now DatabaseDescriptor only has a getSeeds() method; we would like 
> to add a setSeeds().
> 2) Similarly, for the token-1 trick, we wrote code to do this automatically. 
> The StorageService.initServer() code reads the initialToken at boot time, and 
> we need to be able to modify it at boot time, so we would like to add 
> DatabaseDescriptor.setInitialToken().
> Patch attached. 
> Theoretically we could also do both just by modifying the config file, but 
> that requires running a separate process before the Cassandra daemon starts, 
> which is not as clean.
> Thanks a lot 
> Yang





[jira] Updated: (CASSANDRA-2062) Better control of iterator consumption

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2062:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Better control of iterator consumption
> --
>
> Key: CASSANDRA-2062
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2062
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 0001-Improved-iterator-for-merging-sorted-iterators.txt, 
> 0002-Port-all-collating-consumers-to-MergeIterator.txt, 
> 0003-A-ManyToOne-merge-iterator-implementation.txt, 
> 0004-Port-most-ReducingIterator-consumers-to-ManyToOne.txt
>
>
> The core reason for this ticket is to gain control over the consumption of 
> the lazy nested iterators in the read path.
> {quote}We survive now because we write the size of the row at the front of 
> the row (via some serious acrobatics at write time), which gives us hasNext() 
> for rows for free. But it became apparent while working on the block-based 
> format that hasNext() will not be cheap unless the current item has been 
> consumed. "Consumption" of the row is easy, and blocks will be framed so that 
> they can be very easily skipped, but you don't want to have to seek to the 
> end of the row to answer hasNext, and then seek back to the beginning to 
> consume the row, which is what CollatingIterator would have forced us to 
> do.{quote}
> While we're at it, we can also improve efficiency: for {{M}} iterators 
> containing {{N}} total items, commons.collections.CollatingIterator performs 
> an {{O(M*N)}} merge, and calls hasNext multiple times per returned value. We 
> can do better.
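
One standard way to "do better" is a heap-based merge, which costs O(N log M) comparisons for M sorted iterators holding N items in total. The sketch below is not the attached MergeIterator patches; `HeapMerge` is a hypothetical class that only illustrates the technique. It calls hasNext() on a source once up front and once after each item taken from it.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical sketch, not Cassandra code: lazily merges sorted iterators
// using a min-heap of their current heads.
class HeapMerge
{
    // The current head of one source, plus the source it came from.
    private static final class Head<T>
    {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source)
        {
            this.value = value;
            this.source = source;
        }
    }

    static <T extends Comparable<T>> Iterator<T> merge(List<Iterator<T>> sources)
    {
        final PriorityQueue<Head<T>> heap =
            new PriorityQueue<Head<T>>(Math.max(1, sources.size()), (a, b) -> a.value.compareTo(b.value));
        for (Iterator<T> source : sources)
            if (source.hasNext())
                heap.add(new Head<T>(source.next(), source));

        return new Iterator<T>()
        {
            public boolean hasNext()
            {
                return !heap.isEmpty(); // never touches the sources
            }

            public T next()
            {
                Head<T> smallest = heap.poll();
                if (smallest == null)
                    throw new NoSuchElementException();
                // Refill from the source we just consumed: O(log M).
                if (smallest.source.hasNext())
                    heap.add(new Head<T>(smallest.source.next(), smallest.source));
                return smallest.value;
            }

            public void remove()
            {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

Note that this sketch advances a source as soon as its head is returned, which may not match the consumption control the ticket is after for the block-based format; it addresses only the efficiency point.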





[jira] Updated: (CASSANDRA-2075) Eliminate excess comparator creation

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2075:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Eliminate excess comparator creation
> 
>
> Key: CASSANDRA-2075
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2075
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Stu Hood
>Priority: Minor
>  Labels: abstract_types, gc
> Fix For: 0.7.3
>
>
> Despite the singleton status of each AbstractType, we end up creating at 
> least one new comparator per query. By making more of the "wrapper" 
> comparators that exist in the codebase members of AbstractType, we could cut 
> down on the "new Comparator" spam.
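
One way to read the suggestion above is to build each wrapper comparator once, as a field of the singleton type, instead of once per query. The sketch below assumes that reading; CachedComparatorTypeSketch and BytesTypeSketch are hypothetical stand-ins and not Cassandra's AbstractType or BytesType.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;

/** Hypothetical stand-in for a singleton type; not Cassandra's AbstractType. */
abstract class CachedComparatorTypeSketch implements Comparator<ByteBuffer>
{
    /** Created once per type instance rather than once per query. */
    public final Comparator<ByteBuffer> reverseComparator = new Comparator<ByteBuffer>()
    {
        public int compare(ByteBuffer left, ByteBuffer right)
        {
            // Delegate to the enclosing type with the arguments swapped
            return CachedComparatorTypeSketch.this.compare(right, left);
        }
    };
}

/** Hypothetical concrete type that orders buffers by their raw bytes. */
final class BytesTypeSketch extends CachedComparatorTypeSketch
{
    static final BytesTypeSketch instance = new BytesTypeSketch();

    private BytesTypeSketch() {}

    public int compare(ByteBuffer left, ByteBuffer right)
    {
        return left.compareTo(right);
    }
}
```

Callers then share BytesTypeSketch.instance.reverseComparator, so a reversed query allocates nothing.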





[jira] Updated: (CASSANDRA-1978) get_range_slices: allow key and token to be interoperable

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1978:


Fix Version/s: (was: 0.7.2)
   0.7.3

> get_range_slices: allow key and token to be interoperable
> -
>
> Key: CASSANDRA-1978
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1978
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Kelvin Kakugawa
>Assignee: Kelvin Kakugawa
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-1978-allow-key-token-to-be-interoperable-i.patch
>
>
> problem: get_range_slices requires two keys or two tokens, so we can't walk a 
> randomly partitioned cluster by token.
> solution: allow keys and tokens to be mixed.  however, if one side is a 
> token, promote the bounds to a dht.Range, instead of a dht.Bounds.





[jira] Updated: (CASSANDRA-2100) Restart required to change cache_save_period

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2100:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Restart required to change cache_save_period
> 
>
> Key: CASSANDRA-2100
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2100
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Nick Bailey
>Assignee: Jon Hermes
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2100.txt
>
>
> The cache_save_period is set in the schema for each column family.  However 
> this value is only checked when a node starts up so changing this value isn't 
> really dynamic.
> We should actually change this when the schema changes instead of having to 
> restart.





[jira] Updated: (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2045:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Simplify HH to decrease read load when nodes come back
> --
>
> Key: CASSANDRA-2045
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2045
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chris Goffinet
> Fix For: 0.7.3
>
>
> Currently when HH is enabled, hints are stored, and when a node comes back, 
> we begin sending that node data. We do a lookup on the local node for the row 
> to send. To help reduce read load (if a node is offline for a long period of 
> time) we should store the data we want to forward to the node locally instead. 
> We wouldn't have to do any lookups, just take the byte[] and send it to the 
> destination.





[jira] Updated: (CASSANDRA-2007) Move demo Keyspace1 definition from cassandra.yaml to an input file for cassandra-cli

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2007:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Move demo Keyspace1 definition from cassandra.yaml to an input file for 
> cassandra-cli
> 
>
> Key: CASSANDRA-2007
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2007
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Aaron Morton
>Assignee: Aaron Morton
>Priority: Trivial
> Fix For: 0.7.3
>
> Attachments: 2007-1.patch
>
>
> The suggested way to make schema changes is through cassandra-cli but we do 
> not have an example of how to do it. Additionally, to get the demo keyspace 
> created users have to use a different process. 





[jira] Updated: (CASSANDRA-2047) Stress --keep-going should become --keep-trying

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2047:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Stress --keep-going should become --keep-trying
> ---
>
> Key: CASSANDRA-2047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2047
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.3
>
> Attachments: CASSANDRA-2047.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The --keep-going flag makes the stress tool drop messages that time out on 
> the floor.
> I think it's more realistic (esp for a stress tool) to keep trying till this 
> read/write succeeds.
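
The "keep trying" behaviour described above can be sketched as a retry loop around a single operation. This is only an illustration and not the code in CASSANDRA-2047.patch: KeepTryingSketch, keepTrying() and the maxAttempts bound are hypothetical (the description asks for retrying until success; the bound is added here so the example always terminates).

```java
import java.util.concurrent.Callable;

/** Hypothetical illustration; not the code from CASSANDRA-2047.patch. */
final class KeepTryingSketch
{
    /** Repeats the operation until it succeeds or maxAttempts is used up. */
    static <T> T keepTrying(Callable<T> operation, int maxAttempts) throws Exception
    {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be at least 1");

        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return operation.call();
            }
            catch (Exception e)
            {
                // Remember the failure (for example a timeout) and try again
                last = e;
            }
        }
        // Every attempt failed; surface the most recent failure
        throw last;
    }
}
```

Dropping the message instead, which is what --keep-going does according to the description, would correspond to swallowing the exception after the first attempt.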





[jira] Updated: (CASSANDRA-2020) stress.java performance falls off heavily towards the end

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2020:


Fix Version/s: (was: 0.7.2)
   0.7.3

> stress.java performance falls off heavily towards the end
> -
>
> Key: CASSANDRA-2020
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2020
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 0.7.3
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> This is due to threads completing towards the end, such that there aren't 
> enough to fully stress the cluster.  The main problem here is that 
> stress.java is a straight port of stress.py, where each thread runs through 
> some range until it's done, and the threads finish at different times 
> (probably offset by jvm warmup time.)  Instead, a producer/consumer model 
> would work better.
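
A producer/consumer arrangement of the kind suggested above might look like the following sketch: one producer feeds keys into a bounded queue and every worker pulls from it, so all workers stay busy until the queue drains. All names here (ProducerConsumerSketch, run, POISON) are hypothetical, and the "operation" is just a counter increment where a real tool would issue a read or write.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; not the stress.java implementation. */
final class ProducerConsumerSketch
{
    /** Sentinel telling a consumer that no more work will arrive. */
    private static final int POISON = -1;

    /** Runs totalOperations units of work across consumerCount threads and returns how many completed. */
    static int run(int totalOperations, int consumerCount) throws InterruptedException
    {
        final BlockingQueue<Integer> queue = new LinkedBlockingQueue<Integer>(1024);
        final AtomicInteger completed = new AtomicInteger();

        Thread[] consumers = new Thread[consumerCount];
        for (int i = 0; i < consumerCount; i++)
        {
            consumers[i] = new Thread(new Runnable()
            {
                public void run()
                {
                    try
                    {
                        // A real tool would issue a read or write per key here
                        while (queue.take() != POISON)
                            completed.incrementAndGet();
                    }
                    catch (InterruptedException e)
                    {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            consumers[i].start();
        }

        // Producer: hand out keys one by one, then one sentinel per consumer
        for (int key = 0; key < totalOperations; key++)
            queue.put(key);
        for (int i = 0; i < consumerCount; i++)
            queue.put(POISON);

        for (Thread consumer : consumers)
            consumer.join();
        return completed.get();
    }
}
```

Because no thread owns a fixed key range, a thread that warms up late simply processes fewer items instead of finishing late.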





[jira] Updated: (CASSANDRA-2061) Missing logging for some exceptions

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2061:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Missing logging for some exceptions
> ---
>
> Key: CASSANDRA-2061
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2061
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Stu Hood
>Assignee: Jonathan Ellis
> Fix For: 0.7.3
>
> Attachments: 2061-0.7.txt, 2061.txt
>
>
> {quote}Since you are using ScheduledThreadPoolExecutor.schedule(), the 
> exception was swallowed by the FutureTask.
> You will have to call get() on the ScheduledFuture, and you will 
> get an ExecutionException if any exception occurred in run().{quote}
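
The behaviour quoted above can be reproduced with the standard java.util.concurrent classes. In this sketch the class and method names (ScheduledFailureSketch, runAndCollectFailure) are hypothetical; it is not the change made in the attached patches, only a demonstration of where the exception ends up.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of recovering an exception from a scheduled task. */
final class ScheduledFailureSketch
{
    /** Returns the exception thrown by the task, or null if it completed normally. */
    static Throwable runAndCollectFailure(Runnable task) throws InterruptedException
    {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        try
        {
            ScheduledFuture<?> future = executor.schedule(task, 10, TimeUnit.MILLISECONDS);
            // Without this get(), a failure in run() stays hidden inside the future
            future.get();
            return null;
        }
        catch (ExecutionException e)
        {
            // The original exception is carried as the cause
            return e.getCause();
        }
        finally
        {
            executor.shutdown();
        }
    }
}
```

Logging the returned cause (or wrapping tasks so they log their own failures) is what makes such errors visible.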





[jira] Updated: (CASSANDRA-2107) MessageDigests are created in several places, centralize the creation and error handling

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2107:


Fix Version/s: (was: 0.7.2)
   0.7.3

> MessageDigests are created in several places, centralize the creation and 
> error handling
> 
>
> Key: CASSANDRA-2107
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2107
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Matthew F. Dennis
>Assignee: Matthew F. Dennis
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2107-cassandra-0.7.txt
>
>
> MessageDigest.getInstance("SomeAlg") throws NoSuchAlgorithmException (a 
> checked exception).  This is annoying as it forces everyone who uses 
> standard algs like MD5 to surround their code in try/catch.  We should 
> concentrate the creation in one method that doesn't raise a checked exception 
> (i.e. catches NoSuchAlgorithmException and raises a RuntimeException) just to 
> clean the code up a little.
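
A helper of the kind described above could be as small as the following sketch. The names DigestSketch and newMessageDigest are hypothetical and are not taken from 2107-cassandra-0.7.txt; only MessageDigest.getInstance and NoSuchAlgorithmException are real JDK API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; not the code from the attached patch. */
final class DigestSketch
{
    /** Creates a digest and converts the checked exception into an unchecked one. */
    static MessageDigest newMessageDigest(String algorithm)
    {
        try
        {
            return MessageDigest.getInstance(algorithm);
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new RuntimeException("the requested digest algorithm (" + algorithm + ") is not available", e);
        }
    }
}
```

Call sites then shrink to a single line such as DigestSketch.newMessageDigest("MD5"), with no try/catch around it.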





[jira] Updated: (CASSANDRA-2108) stress.py uses the same fmt string in several places, it should be centralized

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2108:


Fix Version/s: (was: 0.7.2)
   0.7.3

> stress.py uses the same fmt string in several places, it should be centralized
> --
>
> Key: CASSANDRA-2108
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2108
> Project: Cassandra
>  Issue Type: Improvement
>Affects Versions: 0.7.0
>Reporter: Matthew F. Dennis
>Assignee: Matthew F. Dennis
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2108-cassandra-0.7.txt
>
>
> The variable "fmt" is duplicated in several places in stress.py.  While this 
> isn't really a problem, it's slightly annoying as I often temporarily hack up 
> stress.py to generate different types of keys, and having it spread throughout 
> the file is somewhat annoying.





[jira] Updated: (CASSANDRA-2088) Temp files for failed compactions/streaming not cleaned up

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2088:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Temp files for failed compactions/streaming not cleaned up
> --
>
> Key: CASSANDRA-2088
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2088
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Stu Hood
> Fix For: 0.7.3
>
>
> From separate reports, compaction and repair are currently missing 
> opportunities to clean up tmp files after failures.





[jira] Updated: (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2034:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Make Read Repair unnecessary when Hinted Handoff is enabled
> ---
>
> Key: CASSANDRA-2034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Jonathan Ellis
> Fix For: 0.7.3
>
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling 
> HH means RR/AES will have less work to do, but you can't disable RR entirely 
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the 
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
> check the responseHandler write acks and write local hints for any missing 
> targets.
> This would make disabling RR when HH is enabled a much more reasonable 
> option, which has a huge impact on read throughput.





[jira] Updated: (CASSANDRA-2129) removetoken after removetoken rf error fails to work

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2129:


Fix Version/s: (was: 0.7.2)
   0.7.3

> removetoken after removetoken rf error fails to work
> 
>
> Key: CASSANDRA-2129
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2129
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Mike Bulman
>Assignee: Nick Bailey
> Fix For: 0.7.3
>
>
> 2 node cluster, a keyspace existed with rf=2.  Tried removetoken and got:
> mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
> bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
> Exception in thread "main" java.lang.IllegalStateException: replication 
> factor (2) exceeds number of endpoints (1)
> Deleted the keyspace, and tried again:
> mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
> bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
> Exception in thread "main" java.lang.UnsupportedOperationException: This node 
> is already processing a removal. Wait for it to complete.





[jira] Updated: (CASSANDRA-2137) stress.java should have a way to specify the replication strategy

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2137:


Fix Version/s: (was: 0.7.2)
   0.7.3

> stress.java should have a way to specify the replication strategy
> -
>
> Key: CASSANDRA-2137
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2137
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: Brandon Williams
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.3
>
> Attachments: CASSANDRA-2137-v2.patch, CASSANDRA-2137.patch
>
>   Original Estimate: 0.5h
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> It would be nice to have stress.java be a one stop shop when you need to test 
> something other than SimpleStrategy.





[jira] Updated: (CASSANDRA-2145) Simplify ColumnSortedMap

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2145:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Simplify ColumnSortedMap
> 
>
> Key: CASSANDRA-2145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2145
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
> Fix For: 0.7.3
>
> Attachments: 
> 0001-CASSANDRA-2145-Extract-serialization-from-ColumnSorted.txt
>
>
> We can simplify ColumnSortedMap substantially by hijacking the shell of 
> another sorted map implementation, rather than having scads of methods 
> implemented as "UnsupportedOperation"s.
> Also, CASSANDRA-674 needs a way to feed a supercolumn an arbitrary sorted 
> iterator, rather than necessarily deserializing.





[jira] Updated: (CASSANDRA-2160) Add "join" command to the nodetool

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2160:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add "join" command to the nodetool
> --
>
> Key: CASSANDRA-2160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2160
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.0
>Reporter: Pavel Yaskevich
>Assignee: Pavel Yaskevich
>Priority: Minor
> Fix For: 0.7.3
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Nodetool should be able to make current node join/re-join a ring.





[jira] Updated: (CASSANDRA-2158) memtable_throughput_in_mb can not support sizes over 2.2 gigs because of an integer overflow.

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2158:


Fix Version/s: (was: 0.7.2)
   0.7.3

> memtable_throughput_in_mb can not support sizes over 2.2 gigs because of an 
> integer overflow.
> -
>
> Key: CASSANDRA-2158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2158
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Eddie
>Assignee: Jonathan Ellis
> Fix For: 0.7.3
>
> Attachments: 2158.txt
>
>
> If memtable_throughput_in_mb is set past 2.2 gigs, no errors are thrown.  
> However, as soon as data starts being written it is almost immediately being 
> flushed.  Several hundred SSTables are created in minutes.  I am almost 
> positive that the problem is that when memtable_throughput_in_mb is being 
> converted into bytes the result is stored in an integer, which is overflowing.
> From memtable.java:
> private final int THRESHOLD;
> private final int THRESHOLD_COUNT;
> ...
> this.THRESHOLD = cfs.getMemtableThroughputInMB() * 1024 * 1024;
> this.THRESHOLD_COUNT = (int) (cfs.getMemtableOperationsInMillions() * 1024 * 
> 1024);
> NOTE:
> I think currentThroughput also needs to be changed from an int to a 
> long.  I'm not sure if it is as simple as this or if this is also used in 
> other places.
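
The suspected overflow is easy to demonstrate in isolation. The sketch below is not Memtable.java; ThresholdOverflowSketch and its two methods are hypothetical and only mirror the arithmetic quoted above. In int arithmetic the largest megabyte value that still fits is 2047, since 2048 * 1024 * 1024 is exactly 2^31.

```java
/** Hypothetical illustration of the MB-to-bytes conversion; not Memtable.java. */
final class ThresholdOverflowSketch
{
    /** Mirrors the quoted code: the multiplication is done in 32-bit int arithmetic. */
    static int thresholdAsInt(int throughputInMB)
    {
        return throughputInMB * 1024 * 1024;
    }

    /** Widening one operand to long makes the whole product 64-bit. */
    static long thresholdAsLong(int throughputInMB)
    {
        return throughputInMB * 1024L * 1024L;
    }
}
```

With a wrapped, negative threshold any non-empty memtable already exceeds it, which would match the immediate flushing described in the report.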





[jira] Updated: (CASSANDRA-2162) cassandra-cli --keyspace option doesn't work properly when used with authentication

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2162:


Fix Version/s: (was: 0.7.2)
   0.7.3

> cassandra-cli --keyspace option doesn't work properly when used with 
> authentication
> ---
>
> Key: CASSANDRA-2162
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2162
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Affects Versions: 0.7.0
>Reporter: Jim Ancona
>Priority: Minor
>  Labels: authentication, cli
> Fix For: 0.7.3
>
> Attachments: 
> 0001-Swap-order-of-authentication-and-keyspace-processing.patch
>
>
> The logic to select the keyspace is applied before authentication credentials 
> are processed in cassandra-cli. This leads to a "Keyspace FOO not found" 
> message at login for a keyspace that exists.





[jira] Updated: (CASSANDRA-2166) Add getNaturalEndpoints method for token/ip

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2166:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Add getNaturalEndpoints method for token/ip
> ---
>
> Key: CASSANDRA-2166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2166
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.7.1
>Reporter: Nick Bailey
>Assignee: Ankit Shah
>Priority: Minor
> Fix For: 0.7.3
>
>
> getNaturalEndpoints currently takes a key and gives you the replicas. It would 
> also be nice to have a method that takes a token/ip and gives you the 
> replicas.





[jira] Updated: (CASSANDRA-2154) Update StreamingTransferTest to include multiple ColumnFamilies

2011-02-17 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2154:


Fix Version/s: (was: 0.7.2)
   0.7.3

> Update StreamingTransferTest to include multiple ColumnFamilies
> ---
>
> Key: CASSANDRA-2154
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2154
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Jonathan Ellis
>Assignee: Pavel Yaskevich
> Fix For: 0.7.3
>
> Attachments: CASSANDRA-2154-StreamingTransferTest.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The goal is to make sure we catch any future regressions like CASSANDRA-1992.





svn commit: r1071771 - /cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java

2011-02-17 Thread brandonwilliams
Author: brandonwilliams
Date: Thu Feb 17 20:30:54 2011
New Revision: 1071771

URL: http://svn.apache.org/viewvc?rev=1071771&view=rev
Log:
Add ability to configure replication strategy and strategy opts to
stress.
Patch by Pavel Yaskevich, reviewed by brandonwilliams for CASSANDRA-2137

Modified:

cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java

Modified: 
cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java?rev=1071771&r1=1071770&r2=1071771&view=diff
==
--- 
cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/contrib/stress/src/org/apache/cassandra/contrib/stress/Session.java
 Thu Feb 17 20:30:54 2011
@@ -19,9 +19,7 @@ package org.apache.cassandra.contrib.str
 
 import java.io.*;
 import java.nio.ByteBuffer;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.List;
+import java.util.*;
 import java.util.concurrent.atomic.AtomicIntegerArray;
 import java.util.concurrent.atomic.AtomicLongArray;
 
@@ -29,6 +27,7 @@ import org.apache.commons.cli.*;
 
 import org.apache.cassandra.db.ColumnFamilyType;
 import org.apache.cassandra.thrift.*;
+import org.apache.commons.lang.StringUtils;
 import org.apache.thrift.protocol.TBinaryProtocol;
 import org.apache.thrift.transport.TFramedTransport;
 import org.apache.thrift.transport.TSocket;
@@ -68,6 +67,8 @@ public class Session
 availableOptions.addOption("l",  "replication-factor",   true,   
"Replication Factor to use when creating needed column families, default:1");
 availableOptions.addOption("e",  "consistency-level",true,   
"Consistency Level to use (ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL, ANY), 
default:ONE");
 availableOptions.addOption("x",  "create-index", true,   "Type 
of index to create on needed column families (KEYS)");
+availableOptions.addOption("R",  "replication-strategy", true,   
"Replication strategy to use (only on insert if keyspace does not exist), 
default:org.apache.cassandra.locator.SimpleStrategy");
+availableOptions.addOption("O",  "strategy-properties",  true,   
"Replication strategy properties in the following format 
:,:,...");
 }
 
 private int numKeys  = 1000 * 1000;
@@ -93,6 +94,8 @@ public class Session
 private Stress.Operation operation = Stress.Operation.INSERT;
 private ColumnFamilyType columnFamilyType = ColumnFamilyType.Standard;
 private ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;
+private String replicationStrategy = 
"org.apache.cassandra.locator.SimpleStrategy";
+private Map<String, String> replicationStrategyOptions = new 
HashMap<String, String>();
 
 // required by Gaussian distribution.
 protected int   mean;
@@ -202,6 +205,24 @@ public class Session
 
 if (cmd.hasOption("x"))
 indexType = 
IndexType.valueOf(cmd.getOptionValue("x").toUpperCase());
+
+if (cmd.hasOption("R"))
+replicationStrategy = cmd.getOptionValue("R");
+
+if (cmd.hasOption("O"))
+{
+String[] pairs = StringUtils.split(cmd.getOptionValue("O"), 
',');
+
+for (String pair : pairs)
+{
+String[] keyAndValue = StringUtils.split(pair, ':');
+
+if (keyAndValue.length != 2)
+throw new RuntimeException("Invalid 
--strategy-properties value.");
+
+replicationStrategyOptions.put(keyAndValue[0], 
keyAndValue[1]);
+}
+}
 }
 catch (ParseException e)
 {
@@ -337,8 +358,14 @@ public class Session
 CfDef superCfDef = new CfDef("Keyspace1", 
"Super1").setColumn_metadata(Arrays.asList(superSubColumn)).setColumn_type("Super");
 
 keyspace.setName("Keyspace1");
-
keyspace.setStrategy_class("org.apache.cassandra.locator.SimpleStrategy");
+keyspace.setStrategy_class(replicationStrategy);
 keyspace.setReplication_factor(replicationFactor);
+
+if (!replicationStrategyOptions.isEmpty())
+{
+keyspace.setStrategy_options(replicationStrategyOptions);
+}
+
 keyspace.setCf_defs(new ArrayList(Arrays.asList(standardCfDef, 
superCfDef)));
 
 Cassandra.Client client = getClient(false);
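
The option parsing added in the commit above can be exercised on its own with a sketch like this one. It is an approximation, not the committed code: StrategyOptionsSketch and parse() are hypothetical, and plain String.split is used in place of commons-lang StringUtils.split so the example has no dependencies (the two differ for inputs with adjacent separators).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-alone version of the key:value parsing; not Session.java. */
final class StrategyOptionsSketch
{
    /** Parses "key:value,key:value" into a map, preserving the order given. */
    static Map<String, String> parse(String raw)
    {
        Map<String, String> options = new LinkedHashMap<String, String>();
        for (String pair : raw.split(","))
        {
            String[] keyAndValue = pair.split(":");
            if (keyAndValue.length != 2)
                throw new IllegalArgumentException("Invalid --strategy-properties value: " + pair);
            options.put(keyAndValue[0], keyAndValue[1]);
        }
        return options;
    }
}
```

For example, parse("DC1:2,DC2:1") yields a map with DC1 mapped to "2" and DC2 mapped to "1"; the data center names are made up for the example.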




[jira] Commented: (CASSANDRA-2162) cassandra-cli --keyspace option doesn't work properly when used with authentication

2011-02-17 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996052#comment-12996052
 ] 

Pavel Yaskevich commented on CASSANDRA-2162:


LGTM

> cassandra-cli --keyspace option doesn't work properly when used with 
> authentication
> ---
>
> Key: CASSANDRA-2162
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2162
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Affects Versions: 0.7.0
>Reporter: Jim Ancona
>Priority: Minor
>  Labels: authentication, cli
> Fix For: 0.7.3
>
> Attachments: 
> 0001-Swap-order-of-authentication-and-keyspace-processing.patch
>
>
> The logic to select the keyspace is applied before authentication credentials 
> are processed in cassandra-cli. This leads to a "Keyspace FOO not found" 
> message at login for a keyspace that exists.





[jira] Commented: (CASSANDRA-2047) Stress --keep-going should become --keep-trying

2011-02-17 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996059#comment-12996059
 ] 

Brandon Williams commented on CASSANDRA-2047:
-

I think we still need keep-going for reads when testing read repair. 
CASSANDRA-2069 would be much harder to verify without it.

> Stress --keep-going should become --keep-trying
> ---
>
> Key: CASSANDRA-2047
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2047
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 0.7.1
>Reporter: T Jake Luciani
>Assignee: Pavel Yaskevich
>Priority: Trivial
> Fix For: 0.7.3
>
> Attachments: CASSANDRA-2047.patch
>
>   Original Estimate: 4h
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The --keep-going flag makes the stress tool drop messages that time out on 
> the floor.
> I think it's more realistic (esp for a stress tool) to keep trying till this 
> read/write succeeds.





[jira] Updated: (CASSANDRA-2100) Restart required to change cache_save_period

2011-02-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2100:
--

Reviewer: brandon.williams

> Restart required to change cache_save_period
> 
>
> Key: CASSANDRA-2100
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2100
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Nick Bailey
>Assignee: Jon Hermes
>Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 2100.txt
>
>
> The cache_save_period is set in the schema for each column family.  However 
> this value is only checked when a node starts up so changing this value isn't 
> really dynamic.
> We should actually change this when the schema changes instead of having to 
> restart.





svn commit: r1071777 - in /cassandra/branches/cassandra-0.6: CHANGES.txt src/java/org/apache/cassandra/db/CompactionManager.java

2011-02-17 Thread jbellis
Author: jbellis
Date: Thu Feb 17 20:46:51 2011
New Revision: 1071777

URL: http://svn.apache.org/viewvc?rev=1071777&view=rev
Log:
make key cache preheating default to false
patch by jbellis; reviewed by brandonwilliams for CASSANDRA-2175

Modified:
cassandra/branches/cassandra-0.6/CHANGES.txt

cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/CompactionManager.java

Modified: cassandra/branches/cassandra-0.6/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/CHANGES.txt?rev=1071777&r1=1071776&r2=1071777&view=diff
==
--- cassandra/branches/cassandra-0.6/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.6/CHANGES.txt Thu Feb 17 20:46:51 2011
@@ -4,6 +4,8 @@
  * update commitlog replay to catch bogus RowMutation lengths caused
by unclean shutdown (CASSANDRA-2128)
  * add -Dhinted_handoff_throttle option (CASSANDRA-2161)
+ * make key cache preheating default to false; enable with
+   -Dcompaction_preheat_key_cache=true (CASSANDRA-2175)
 
 
 0.6.11

Modified: cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/CompactionManager.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/CompactionManager.java?rev=1071777&r1=1071776&r2=1071777&view=diff
==
--- cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/CompactionManager.java (original)
+++ cassandra/branches/cassandra-0.6/src/java/org/apache/cassandra/db/CompactionManager.java Thu Feb 17 20:46:51 2011
@@ -282,6 +282,7 @@ public class CompactionManager implement
 executor.beginCompaction(cfs, ci);
 
 Map cachedKeys = new HashMap();
+boolean preheatKeyCache = Boolean.getBoolean("compaction_preheat_key_cache");
 
 try
 {
@@ -309,12 +310,15 @@ public class CompactionManager implement
 logger.warn("Large row " + row.key.key + " in " + cfs.getColumnFamilyName() + " " + rowsize + " bytes");
 cfs.addToCompactedRowStats(rowsize);
 
-for (SSTableReader sstable : sstables)
+if (preheatKeyCache)
 {
-if (sstable.getCachedPosition(row.key) != null)
+for (SSTableReader sstable : sstables)
 {
-cachedKeys.put(row.key, new SSTable.PositionSize(prevpos, rowsize));
-break;
+if (sstable.getCachedPosition(row.key) != null)
+{
+cachedKeys.put(row.key, new SSTable.PositionSize(prevpos, rowsize));
+break;
+}
 }
 }
 }
@@ -326,7 +330,7 @@ public class CompactionManager implement
 
 SSTableReader ssTable = writer.closeAndOpenReader();
 cfs.replaceCompactedSSTables(sstables, Arrays.asList(ssTable));
-for (Entry entry : cachedKeys.entrySet())
+for (Entry entry : cachedKeys.entrySet()) // empty if preheat is off
 ssTable.cacheKey(entry.getKey(), entry.getValue());
 submitMinorIfNeeded(cfs);
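
As a rough model of the change above (written in Python rather than Java, and not the actual Cassandra code): Java's `Boolean.getBoolean(name)` returns true only when the named system property exists and equals "true" ignoring case, so preheating stays off unless `-Dcompaction_preheat_key_cache=true` is passed. The names `get_boolean` and `keys_to_preheat`, and the dict standing in for JVM system properties, are hypothetical and used only for illustration.

```python
def get_boolean(props, name):
    # Models Java's Boolean.getBoolean: true only if the property is set
    # and equals "true" (case-insensitive); an unset property means False.
    value = props.get(name)
    return value is not None and value.lower() == 'true'


def keys_to_preheat(props, compacted_keys, previously_cached):
    # Hypothetical stand-in for the loop in the diff: a compacted key is
    # remembered only when preheating is enabled and the key was already
    # cached for one of the source sstables.
    if not get_boolean(props, 'compaction_preheat_key_cache'):
        return []  # stays empty, so nothing is cached after compaction
    return [k for k in compacted_keys if k in previously_cached]
```

With an empty property dict the function returns an empty list, which mirrors the "empty if preheat is off" comment in the patch.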
 




svn commit: r1071778 - in /cassandra/branches/cassandra-0.7: CHANGES.txt contrib/py_stress/stress.py

2011-02-17 Thread brandonwilliams
Author: brandonwilliams
Date: Thu Feb 17 20:52:02 2011
New Revision: 1071778

URL: http://svn.apache.org/viewvc?rev=1071778&view=rev
Log:
Make py_stress key format global.
Patch by Matthew Dennis, reviewed by brandonwilliams for CASSANDRA-2108.

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt
cassandra/branches/cassandra-0.7/contrib/py_stress/stress.py

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1071778&r1=1071777&r2=1071778&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Feb 17 20:52:02 2011
@@ -5,6 +5,7 @@
  * fixes for cache save/load (CASSANDRA-2172, -2174)
  * Handle whole-row deletions in CFOutputFormat (CASSANDRA-2014)
  * Make memtable_flush_writers flush in parallel (CASSANDRA-2178)
+ * refactor stress.py to have only one copy of the format string used for creating row keys (CASSANDRA-2108)
 
 
 0.7.2

Modified: cassandra/branches/cassandra-0.7/contrib/py_stress/stress.py
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/py_stress/stress.py?rev=1071778&r1=1071777&r2=1071778&view=diff
==
--- cassandra/branches/cassandra-0.7/contrib/py_stress/stress.py (original)
+++ cassandra/branches/cassandra-0.7/contrib/py_stress/stress.py Thu Feb 17 20:52:02 2011
@@ -127,6 +127,9 @@ if options.nodefile != None:
 with open(options.nodefile) as f:
 nodes = [n.strip() for n in f.readlines() if len(n.strip()) > 0]
 
+#format string for keys
+fmt = '%0' + str(len(str(total_keys))) + 'd'
+
 # a generator that generates all keys according to a bell curve centered
 # around the middle of the keys generated (0..total_keys).  Remember that
 # about 68% of keys will be within stdev away from the mean and 
@@ -148,7 +151,6 @@ def generate_values():
 return values
 
 def key_generator_gauss():
-fmt = '%0' + str(len(str(total_keys))) + 'd'
 while True:
 guess = gauss(mean, stdev)
 if 0 <= guess < total_keys:
@@ -157,7 +159,6 @@ def key_generator_gauss():
 # a generator that will generate all keys w/ equal probability.  this is the
 # worst case for caching.
 def key_generator_random():
-fmt = '%0' + str(len(str(total_keys))) + 'd'
 return fmt % randint(0, total_keys - 1)
 
 key_generator = key_generator_gauss
@@ -220,7 +221,6 @@ class Inserter(Operation):
 def run(self):
 values = generate_values()
 columns = [Column('C' + str(j), 'unset', time.time() * 100) for j in xrange(columns_per_key)]
-fmt = '%0' + str(len(str(total_keys))) + 'd'
 if 'super' == options.cftype:
 supers = [SuperColumn('S' + str(j), columns) for j in xrange(supers_per_key)]
 for i in self.range:
@@ -295,7 +295,6 @@ class RangeSlicer(Operation):
 end = self.range[-1]
 current = begin
 last = current + options.rangecount
-fmt = '%0' + str(len(str(total_keys))) + 'd'
 p = SlicePredicate(slice_range=SliceRange('', '', False, columns_per_key))
 if 'super' == options.cftype:
 while current < end:
@@ -348,7 +347,6 @@ class RangeSlicer(Operation):
 # from the thread's appointed range
 class IndexedRangeSlicer(Operation):
 def run(self):
-fmt = '%0' + str(len(str(total_keys))) + 'd'
 p = SlicePredicate(slice_range=SliceRange('', '', False, columns_per_key))
 values = generate_values()
 parent = ColumnParent('Standard1')
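
For illustration, a minimal sketch of what the now-global format string does. The value `total_keys = 1000000` is an assumption (stress.py derives it from its command-line options), and `make_key` is a hypothetical helper that does not exist in stress.py.

```python
# Assumed value for illustration; stress.py takes this from its options.
total_keys = 1000000

# Same expression as the module-level fmt added by the patch: zero-pad every
# key to the number of decimal digits in total_keys.
fmt = '%0' + str(len(str(total_keys))) + 'd'


def make_key(i):
    # Hypothetical helper, not part of stress.py: format one row key.
    return fmt % i
```

Because every key is padded to the same width, all operations (insert, read, range slice) produce identical key strings for the same index, which is why a single shared copy of `fmt` is enough.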




[jira] Commented: (CASSANDRA-1311) Support (asynchronous) triggers

2011-02-17 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996063#comment-12996063
 ] 

Stu Hood commented on CASSANDRA-1311:
-

I haven't really seen a good example of where entity groups are not just a pain 
in the user's rear end: essentially, we are breaking down and asking users to 
perform their own partitioning so that we can give them strong consistency 
between two dimensions that are normally eventually consistent.

Not a strong feeling, just something that's been nagging at me.

> Support (asynchronous) triggers
> ---
>
> Key: CASSANDRA-1311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1311
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Contrib
>Reporter: Maxim Grinev
> Fix For: 0.8
>
> Attachments: HOWTO-PatchAndRunTriggerExample-update1.txt, 
> HOWTO-PatchAndRunTriggerExample.txt, ImplementationDetails-update1.pdf, 
> ImplementationDetails.pdf, trunk-967053.txt, trunk-984391-update1.txt, 
> trunk-984391-update2.txt
>
>
> Asynchronous triggers are a basic mechanism for implementing various use cases 
> of asynchronous execution of application code on the database side, for example 
> to support indexes and materialized views, online analytics, and push-based 
> data propagation.
> Please find the motivation, triggers description and list of applications:
> http://maxgrinev.com/2010/07/23/extending-cassandra-with-asynchronous-triggers/
> An example of using triggers for indexing:
> http://maxgrinev.com/2010/07/23/managing-indexes-in-cassandra-using-async-triggers/
> Implementation details are attached.




