date:20110823

[jira] [Commented] (CASSANDRA-2995) Making Storage Engine Pluggable

2011-08-23 Thread Terje Marthinussen (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089352#comment-13089352
 ] 

Terje Marthinussen commented on CASSANDRA-2995:
---

I have to agree with Stu here. 

Making the storage layer more pluggable and inspiring people to make 
alternative storage engines must be a good thing and the data model, or the 
functionality provided by it, is not magical.

Also, I increasingly feel that we have too many pieces of code that, to a 
larger or lesser degree, touch the internals of the code manipulating sstables 
today (compactions, streaming, cache updates, etc.). This increases the risk of 
sstable corruption, of which there have been far too many cases since the 0.7 
branch was made.

With SSDs and memory (for caching) dropping in price faster than most of us 
can track, I will, looking forward, gladly accept some performance penalty from 
a slight increase in random ops in return for better isolation and a reduced 
risk of data corruption.

> Making Storage Engine Pluggable
> ---
>
> Key: CASSANDRA-2995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Muga Nishizawa
>
> Will you design and implement Cassandra's storage engine API like MyCassandra?
> MyCassandra provides an extensible architecture for plugging other storage 
> engines to Cassandra like MySQL.  
> https://github.com/sunsuk7tp/MyCassandra/
>   
> It could be advantageous for Cassandra to make the storage engine pluggable.  
> This could allow Cassandra to 
> - deal with potential use cases where maybe the current sstables are not the 
> best fit
> - allow several types of internal storage formats (at the same time) 
> optimized for different data types
> - allow easier experiments and research on new storage formats (encourage 
> research institutions to do strange things with Cassandra)
> - there could also be potential advantages from better isolation of the data 
> engine in terms of less risk for data corruptions if other parts of Cassandra 
> change

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2645) Improve getBootstrapToken efficiency

2011-08-23 Thread Marcus Eriksson (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-2645:
---

Attachment: patch-2645.txt

> Improve getBootstrapToken efficiency
> 
>
> Key: CASSANDRA-2645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2645
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Priority: Trivial
>  Labels: lhf
> Attachments: patch-2645.txt
>
>
> The destination for a getBootstrapToken request unnecessarily sorts the key 
> samples. Since each set of samples is individually sorted, the problem boils 
> down to "median of sorted lists".
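One way to read "median of sorted lists" is a lazy k-way merge that stops at the middle element instead of re-sorting the combined samples. The sketch below is in Python rather than Cassandra's Java, and the function name and the choice of the lower median are assumptions made for illustration, not the contents of the attached patch.

```python
import heapq
from itertools import islice


def median_of_sorted_lists(sorted_lists):
    """Return the lower median of the union of already-sorted lists.

    Hypothetical helper: it merges lazily and stops at the middle
    element, so the combined samples are never sorted again.
    """
    total = sum(len(samples) for samples in sorted_lists)
    if total == 0:
        raise ValueError("no samples to take a median of")
    middle = (total - 1) // 2
    # heapq.merge yields items in order without materialising the full merge.
    merged = heapq.merge(*sorted_lists)
    return next(islice(merged, middle, None))
```

With k lists and n samples in total, this does about n/2 heap steps of cost O(log k) each, which is only a modest gain over a full sort when n is small.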


[jira] [Commented] (CASSANDRA-2645) Improve getBootstrapToken efficiency

2011-08-23 Thread Marcus Eriksson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089404#comment-13089404
 ] 

Marcus Eriksson commented on CASSANDRA-2645:


The speedup is actually minimal, so I doubt it is worth it.

> Improve getBootstrapToken efficiency
> 
>
> Key: CASSANDRA-2645
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2645
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Priority: Trivial
>  Labels: lhf
> Attachments: patch-2645.txt
>
>
> The destination for a getBootstrapToken request unnecessarily sorts the key 
> samples. Since each set of samples is individually sorted, the problem boils 
> down to "median of sorted lists".


[jira] [Commented] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-08-23 Thread Kelley Reynolds (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089420#comment-13089420
 ] 

Kelley Reynolds commented on CASSANDRA-2500:


It's not quite as easy as a one-liner but I have manually modified the 
generated thrift bindings to put them in a unique namespace so they will not 
conflict with fauna. Good news: Compatibility! Bad news: You get to do that for 
every release since it's no longer automatic. (could almost certainly be made 
automatic but I wouldn't call that an improvement from an automated standalone 
thrift binding release)

> Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
> ---
>
> Key: CASSANDRA-2500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
> Project: Cassandra
>  Issue Type: Task
>  Components: API
>Reporter: Jon Hermes
>Assignee: Kelley Reynolds
>  Labels: cql
> Fix For: 0.8.5
>
> Attachments: 2500.txt, genthriftrb.txt, rbcql-0.0.0.tgz
>
>
> Create a ruby driver for CQL.
> Lacking something standard (such as py-dbapi), going with something common 
> instead -- RoR ActiveRecord Connection Adapter 
> (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).


[jira] [Issue Comment Edited] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-08-23 Thread Kelley Reynolds (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089420#comment-13089420
 ] 

Kelley Reynolds edited comment on CASSANDRA-2500 at 8/23/11 12:33 PM:
--

It's not quite as easy as a one-liner but I have manually modified the 
generated thrift bindings to put them in a unique namespace so they will not 
conflict with fauna. Good news: Compatibility! Bad news: You get to do that for 
every release since it's no longer automatic. (could almost certainly be made 
automatic but I wouldn't call that an improvement from an automated standalone 
thrift binding release which also solves other problems at the same time)

  was (Author: kreynolds):
It's not quite as easy as a one-liner but I have manually modified the 
generated thrift bindings to put them in a unique namespace so they will not 
conflict with fauna. Good news: Compatibility! Bad news: You get to do that for 
every release since it's no longer automatic. (could almost certainly be made 
automatic but I wouldn't call that an improvement from an automated standalone 
thrift binding release)
  
> Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
> ---
>
> Key: CASSANDRA-2500
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
> Project: Cassandra
>  Issue Type: Task
>  Components: API
>Reporter: Jon Hermes
>Assignee: Kelley Reynolds
>  Labels: cql
> Fix For: 0.8.5
>
> Attachments: 2500.txt, genthriftrb.txt, rbcql-0.0.0.tgz
>
>
> Create a ruby driver for CQL.
> Lacking something standard (such as py-dbapi), going with something common 
> instead -- RoR ActiveRecord Connection Adapter 
> (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).


[jira] [Commented] (CASSANDRA-3068) Fix count()

2011-08-23 Thread T Jake Luciani (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089437#comment-13089437
 ] 

T Jake Luciani commented on CASSANDRA-3068:
---

I agree that offering transposed rows in CQL is a better way to support count(). 
It achieves the same functionality while keeping the SQL spec intact.

> Fix count()
> ---
>
> Key: CASSANDRA-3068
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3068
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: API, Core
>Reporter: Jonathan Ellis
> Fix For: 1.0
>
>
> count() has been broken since it was introduced in CASSANDRA-1704.


[jira] [Commented] (CASSANDRA-3024) sstable and message varint encoding

2011-08-23 Thread Vladimir Loncar (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089447#comment-13089447
 ] 

Vladimir Loncar commented on CASSANDRA-3024:


Unfortunately, due to time constraints, I am unable to continue working on this 
ticket. If anyone wishes to take over, feel free to do so. If not, I will try 
to find time in October to complete this.

> sstable and message varint encoding
> ---
>
> Key: CASSANDRA-3024
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3024
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Priority: Minor
> Fix For: 1.0
>
>
> We could save some sstable space by encoding longs and ints as vlong and 
> vint, respectively.  (Probably most "short" lengths would be better as vint 
> as well.)
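To illustrate where the space saving would come from, here is a minimal Python sketch of one common variable-length integer scheme (7 payload bits per byte plus a continuation bit, with zigzag mapping for signed values). The ticket does not say which encoding would be used, so the format and all function names here are assumptions for illustration only.

```python
def zigzag_encode(value):
    """Map a signed 64-bit integer to an unsigned one so small magnitudes stay small."""
    return (value << 1) ^ (value >> 63)


def zigzag_decode(value):
    """Inverse of zigzag_encode."""
    return (value >> 1) ^ -(value & 1)


def encode_varint(value):
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("encode_varint expects a non-negative integer")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the start of data; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated varint")
```

Under this scheme a small value such as a short length fits in one byte instead of a fixed four or eight, while values near the top of the 64-bit range take up to ten bytes, so the saving depends on the distribution of values actually stored.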


[Cassandra Wiki] Update of "FrontPage" by Ryuan Murphy

2011-08-23 Thread Apache Wiki

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "FrontPage" page has been changed by Ryuan Murphy:
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=70&rev2=71

   * [[FrontPage_JP|Japanese 日本語]]
   * [[FrontPage_PT-BR|BrazilianPortuguese Português do Brasil]]
  
+ === On the web ===
+ Here are useful links
+ 
+  * [[UsefulLinks]] - Tutorials for !JavaServer Faces, !MyFaces project 
activity and other useful links
+  * PHP Facesaver Proxy [[http://www.facesaver.net/|Facebook Proxy]]
+  * PHP Facesaver Proxy [[http://facoxy.com/|Facebook Proxy]]
+  * PHP Facesaver Proxy [[http://vroxy.com/|Facebook Proxy]]
+  * PHP Facesaver Proxy [[http://subick.com/|Facebook Proxy]]
+  * PHP Facesaver Proxy [[http://midhat.com/|Facebook Proxy]]
+

[Cassandra Wiki] Update of "FrontPage" by MikeDean

2011-08-23 Thread Apache Wiki

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "FrontPage" page has been changed by MikeDean:
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=72&rev2=73

  
  == Related Information ==
   * [[http://incubator.apache.org/thrift|Thrift]], used by Cassandra for 
client access
+  * [[http://seoexpertglobal.com|SEO]], used by Cassandra
   * RelatedProjects: Projects using or extending Cassandra
  
  == Google SoC 2010 Page ==

[Cassandra Wiki] Trivial Update of "FrontPage" by daisyjrt

2011-08-23 Thread Apache Wiki

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "FrontPage" page has been changed by daisyjrt:
http://wiki.apache.org/cassandra/FrontPage?action=diff&rev1=74&rev2=75

   * Commits: commits@cassandra.apache.org 
[[mailto:commits-subscr...@cassandra.apache.org|(subscribe)]]
  
  == Related Information ==
-  * [[http://incubator.apache.org/thrift|Thrift]], used by Cassandra for 
client access
+  * [[http://www.bohemjewel.com|Thrift]], used by Cassandra for client access
   * RelatedProjects: Projects using or extending Cassandra
  
  == Google SoC 2010 Page ==

[jira] [Created] (CASSANDRA-3070) counter repair

2011-08-23 Thread ivan (JIRA)

counter repair
--

 Key: CASSANDRA-3070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.4
Reporter: ivan


Hi!

We have some counters out of sync but repair doesn't sync values.
We tried nodetool repair.
We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes 
when a bad row is read, but the counters were not repaired by the mutation.

Output from two nodes was uploaded. (Some new debug messages were added.)



[jira] [Updated] (CASSANDRA-3070) counter repair

2011-08-23 Thread ivan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ivan updated CASSANDRA-3070:


Attachment: counter_local_quroum_maybeschedulerepairs.txt

> counter repair
> --
>
> Key: CASSANDRA-3070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.4
>Reporter: ivan
> Attachments: counter_local_quroum_maybeschedulerepairs.txt
>
>
> Hi!
> We have some counters out of sync but repair doesn't sync values.
> We tried nodetool repair.
> We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes 
> when a bad row is read, but the counters were not repaired by the mutation.
> Output from two nodes was uploaded. (Some new debug messages were added.)


[jira] [Issue Comment Edited] (CASSANDRA-2995) Making Storage Engine Pluggable

2011-08-23 Thread Shunsuke Nakamura (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089557#comment-13089557
 ] 

Shunsuke Nakamura edited comment on CASSANDRA-2995 at 8/23/11 4:06 PM:
---

I am a developer of MyCassandra, which provides storage engine pluggability. 
It supports MySQL, Redis, Kyoto Cabinet, MongoDB and others in addition to 
the original storage engine of Cassandra.

A storage engine is a performance bottleneck in most applications. 
Pluggability offers a straightforward way to improve performance quickly and 
to adapt to various applications, even though caching techniques have started 
to work well these days.

- Data model

MyCassandra supports the same schemaless multi-dimensional map as Cassandra. 
It maps the data model to tables in RDB and key-value pairs in key-value stores 
using object serialization and labeling keys and columns. 
The mapping does not make a sparse RDB table because a record is mapped to a 
key-value pair.


- Open problems
Data model mapping requires more elaboration.

-- Secondary indices
  They require a raw column, though a row is usually object-serialized. It is 
also difficult to add a secondary index later in today's MyCassandra because 
such a schema change requires traversing all rows.

-- Row capacity
  The size of a column in RDB is fixed when a schema is defined. It is static.

-- Lookup efficiency in a row
  MyCassandra serializes a row to store it into a table or a KVS. A lookup in a 
row requires deserialization.
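The lookup cost described above can be sketched as follows. This is not MyCassandra's code: the class name, the use of JSON as the serialization format, and the in-memory dict standing in for a key-value store are all assumptions chosen to keep the example self-contained.

```python
import json


class SerializedRowStore:
    """Hypothetical store that keeps each row as one serialized value."""

    def __init__(self):
        # Stand-in for a key-value store: row key -> serialized row.
        self._kv = {}

    def put(self, row_key, column, value):
        # Updating one column means reading and rewriting the whole row.
        row = json.loads(self._kv.get(row_key, "{}"))
        row[column] = value
        self._kv[row_key] = json.dumps(row)

    def get(self, row_key, column):
        # Reading one column still deserializes every column in the row.
        row = json.loads(self._kv[row_key])
        return row[column]
```

Because the store only sees an opaque value per row key, the cost of reading a single column grows with the size of the whole row, which is the trade-off the comment points out.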

A research paper about MyCassandra will become available soon. 
It presents the design and performance results of MyCassandra.

http://www.slideshare.net/sunsuk7tp/mycassandra-8499189


  was (Author: sunsuk7tp):
I am a developer of MyCassandra, that provides storage engine pluggability. 
It supports MySQL, Redis, Kyoto Cabinet, MongoDB and the others in addition to 
the original storage engine of Cassandra.

A storage engine is a performance bottleneck in most applications. The 
pluggability provides rapid improvement of performance and adaptability to 
various applications in a very straightforward way though caching techniques 
started working well these days.


* Data model

MyCassandra supports the same schemaless multi-dimensional map as Cassandra. 
It maps the data model to tables in RDB and key-value pairs in key-value stores 
using object serialization and labeling keys and columns. 
The mapping does not make a sparse RDB table because a record is mapped to a 
key-value pair.


* Open problems
Data model mapping requires more elaboration.

- Secondary indices
  They require a raw column though usually a row is object-serialized. It is 
also difficult to add a secondary index later for today's MyCassandra because 
it requires traversal of all rows a schema change.

- Row capacity
  The size of a column in RDB is fixed when a schema is defined. It is static.

- Lookup efficiency in a row
  MyCassandra serializes a row to store it into a table or a KVS. A lookup in a 
row requires deserialization.

A research paper about MyCassandra become available soon. 
It has design performance results of MyCassandra.

http://www.slideshare.net/sunsuk7tp/mycassandra-8499189

  
> Making Storage Engine Pluggable
> ---
>
> Key: CASSANDRA-2995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Muga Nishizawa
>
> Will you design and implement Cassandra's storage engine API like MyCassandra?
> MyCassandra provides an extensible architecture for plugging other storage 
> engines to Cassandra like MySQL.  
> https://github.com/sunsuk7tp/MyCassandra/
>   
> It could be advantageous for Cassandra to make the storage engine pluggable.  
> This could allow Cassandra to 
> - deal with potential use cases where maybe the current sstables are not the 
> best fit
> - allow several types of internal storage formats (at the same time) 
> optimized for different data types
> - allow easier experiments and research on new storage formats (encourage 
> research institutions to do strange things with Cassandra)
> - there could also be potential advantages from better isolation of the data 
> engine in terms of less risk for data corruptions if other parts of Cassandra 
> change


[jira] [Commented] (CASSANDRA-2995) Making Storage Engine Pluggable

2011-08-23 Thread Shunsuke Nakamura (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089557#comment-13089557
 ] 

Shunsuke Nakamura commented on CASSANDRA-2995:
--

I am a developer of MyCassandra, which provides storage engine pluggability. 
It supports MySQL, Redis, Kyoto Cabinet, MongoDB and others in addition to 
the original storage engine of Cassandra.

A storage engine is a performance bottleneck in most applications. 
Pluggability offers a straightforward way to improve performance quickly and 
to adapt to various applications, even though caching techniques have started 
to work well these days.


* Data model

MyCassandra supports the same schemaless multi-dimensional map as Cassandra. 
It maps the data model to tables in RDB and key-value pairs in key-value stores 
using object serialization and labeling keys and columns. 
The mapping does not make a sparse RDB table because a record is mapped to a 
key-value pair.


* Open problems
Data model mapping requires more elaboration.

- Secondary indices
  They require a raw column, though a row is usually object-serialized. It is 
also difficult to add a secondary index later in today's MyCassandra because 
such a schema change requires traversing all rows.

- Row capacity
  The size of a column in RDB is fixed when a schema is defined. It is static.

- Lookup efficiency in a row
  MyCassandra serializes a row to store it into a table or a KVS. A lookup in a 
row requires deserialization.

A research paper about MyCassandra will become available soon. 
It presents the design and performance results of MyCassandra.

http://www.slideshare.net/sunsuk7tp/mycassandra-8499189


> Making Storage Engine Pluggable
> ---
>
> Key: CASSANDRA-2995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Muga Nishizawa
>
> Will you design and implement Cassandra's storage engine API like MyCassandra?
> MyCassandra provides an extensible architecture for plugging other storage 
> engines to Cassandra like MySQL.  
> https://github.com/sunsuk7tp/MyCassandra/
>   
> It could be advantageous for Cassandra to make the storage engine pluggable.  
> This could allow Cassandra to 
> - deal with potential use cases where maybe the current sstables are not the 
> best fit
> - allow several types of internal storage formats (at the same time) 
> optimized for different data types
> - allow easier experiments and research on new storage formats (encourage 
> research institutions to do strange things with Cassandra)
> - there could also be potential advantages from better isolation of the data 
> engine in terms of less risk for data corruptions if other parts of Cassandra 
> change


[jira] [Issue Comment Edited] (CASSANDRA-2995) Making Storage Engine Pluggable

2011-08-23 Thread Shunsuke Nakamura (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089557#comment-13089557
 ] 

Shunsuke Nakamura edited comment on CASSANDRA-2995 at 8/23/11 4:17 PM:
---

I am a developer of MyCassandra, which provides storage engine pluggability. 
It supports MySQL, Redis, Kyoto Cabinet, MongoDB and others in addition to 
the original storage engine of Cassandra.

A storage engine is a performance bottleneck in most applications. 
Pluggability offers a straightforward way to improve performance quickly and 
to adapt to various applications, even though caching techniques have started 
to work well these days.

* Data model

MyCassandra supports the same schemaless multi-dimensional map as Cassandra. 
It maps the data model to tables in RDB and key-value pairs in key-value stores 
using object serialization and labeling keys and columns. 
The mapping does not make a sparse RDB table because a record is mapped to a 
key-value pair.


* Open problems
Data model mapping requires more elaboration.

・ Secondary indices
  They require a raw column, though a row is usually object-serialized. It is 
also difficult to add a secondary index later in today's MyCassandra because 
such a schema change requires traversing all rows.

・ Row capacity
  The size of a column in RDB is fixed when a schema is defined. It is static.

・ Lookup efficiency in a row
  MyCassandra serializes a row to store it into a table or a KVS. A lookup in a 
row requires deserialization.

A research paper about MyCassandra will become available soon. 
It presents the design and performance results of MyCassandra.

http://www.slideshare.net/sunsuk7tp/mycassandra-8499189


  was (Author: sunsuk7tp):
I am a developer of MyCassandra, that provides storage engine pluggability. 
It supports MySQL, Redis, Kyoto Cabinet, MongoDB and the others in addition to 
the original storage engine of Cassandra.

A storage engine is a performance bottleneck in most applications. The 
pluggability provides rapid improvement of performance and adaptability to 
various applications in a very straightforward way though caching techniques 
started working well these days.

- Data model

MyCassandra supports the same schemaless multi-dimensional map as Cassandra. 
It maps the data model to tables in RDB and key-value pairs in key-value stores 
using object serialization and labeling keys and columns. 
The mapping does not make a sparse RDB table because a record is mapped to a 
key-value pair.


- Open problems
Data model mapping requires more elaboration.

-- Secondary indices
  They require a raw column though usually a row is object-serialized. It is 
also difficult to add a secondary index later for today's MyCassandra because 
it requires traversal of all rows a schema change.

-- Row capacity
  The size of a column in RDB is fixed when a schema is defined. It is static.

-- Lookup efficiency in a row
  MyCassandra serializes a row to store it into a table or a KVS. A lookup in a 
row requires deserialization.

A research paper about MyCassandra become available soon. 
It has design performance results of MyCassandra.

http://www.slideshare.net/sunsuk7tp/mycassandra-8499189

  
> Making Storage Engine Pluggable
> ---
>
> Key: CASSANDRA-2995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Muga Nishizawa
>
> Will you design and implement Cassandra's storage engine API like MyCassandra?
> MyCassandra provides an extensible architecture for plugging other storage 
> engines to Cassandra like MySQL.  
> https://github.com/sunsuk7tp/MyCassandra/
>   
> It could be advantageous for Cassandra to make the storage engine pluggable.  
> This could allow Cassandra to 
> - deal with potential use cases where maybe the current sstables are not the 
> best fit
> - allow several types of internal storage formats (at the same time) 
> optimized for different data types
> - allow easier experiments and research on new storage formats (encourage 
> research institutions to do strange things with Cassandra)
> - there could also be potential advantages from better isolation of the data 
> engine in terms of less risk for data corruptions if other parts of Cassandra 
> change


[jira] [Updated] (CASSANDRA-2810) RuntimeException in Pig when using "dump" command on column name

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2810:


Attachment: 2810-v3.txt

v3 also removes decomposing the values before inserting and instead forces them 
into a ByteBuffer with objToBB, since we actually don't care about the type. 
(why did we ever change this?)

This means that a UDF that doesn't preserve the schema and hands us back 
DataByteArrays when we fed it specific types can't make us fail anymore.

> RuntimeException in Pig when using "dump" command on column name
> 
>
> Key: CASSANDRA-2810
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.8.1
> Environment: Ubuntu 10.10, 32 bits
> java version "1.6.0_24"
> Brisk beta-2 installed from Debian packages
>Reporter: Silvère Lestang
>Assignee: Brandon Williams
> Attachments: 2810-v2.txt, 2810-v3.txt, 2810.txt
>
>
> This bug was previously reported on the [Brisk bug 
> tracker|https://datastax.jira.com/browse/BRISK-232].
> In cassandra-cli:
> {code}
> [default@unknown] create keyspace Test
> with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
> and strategy_options = [{replication_factor:1}];
> [default@unknown] use Test;
> Authenticated to keyspace: Test
> [default@Test] create column family test;
> [default@Test] set test[ascii('row1')][long(1)]=integer(35);
> set test[ascii('row1')][long(2)]=integer(36);
> set test[ascii('row1')][long(3)]=integer(38);
> set test[ascii('row2')][long(1)]=integer(45);
> set test[ascii('row2')][long(2)]=integer(42);
> set test[ascii('row2')][long(3)]=integer(33);
> [default@Test] list test;
> Using default limit of 100
> ---
> RowKey: 726f7731
> => (column=0001, value=35, timestamp=1308744931122000)
> => (column=0002, value=36, timestamp=1308744931124000)
> => (column=0003, value=38, timestamp=1308744931125000)
> ---
> RowKey: 726f7732
> => (column=0001, value=45, timestamp=1308744931127000)
> => (column=0002, value=42, timestamp=1308744931128000)
> => (column=0003, value=33, timestamp=1308744932722000)
> 2 Rows Returned.
> [default@Test] describe keyspace;
> Keyspace: Test:
>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>   Durable Writes: true
> Options: [replication_factor:1]
>   Column Families:
> ColumnFamily: test
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: 
> org.apache.cassandra.db.marshal.BytesType
>   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds: 0.0/0
>   Key cache size / save period in seconds: 20.0/14400
>   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
>   GC grace seconds: 864000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: false
>   Built indexes: []
> {code}
> In Pig command line:
> {code}
> grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
> (rowkey:chararray, columns: bag {T: (name:long, value:int)});
> grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
> grunt> dump value_test;
> {code}
> In /var/log/cassandra/system.log, this exception appears several times:
> {code}
> INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 
> TaskInProgress.java (line 551) Error from 
> attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected 
> data type -1 found in stream.
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
>   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>   at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
>   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
>   at 
> org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
>   at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>   at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
>   at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.

[jira] [Assigned] (CASSANDRA-3070) counter repair

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-3070:
-

Assignee: Sylvain Lebresne

> counter repair
> --
>
> Key: CASSANDRA-3070
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.4
>Reporter: ivan
>Assignee: Sylvain Lebresne
> Attachments: counter_local_quroum_maybeschedulerepairs.txt
>
>
> Hi!
> We have some counters out of sync but repair doesn't sync values.
> We tried nodetool repair.
> We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes 
> when a bad row is read, but the counters were not repaired by the mutation.
> Output from two nodes was uploaded. (Some new debug messages were added.)


[jira] [Updated] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-3071:


Attachment: 3071.txt

This was originally part of a patch in CASSANDRA-957, but looks worthy enough 
to break out and get committed in older versions.

> Gossip state is not removed after a new IP takes over a token
> -
>
> Key: CASSANDRA-3071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.7.9, 0.8.5
>
> Attachments: 3071.txt
>
>
> When a new node takes over a token, the endpoint state in the gossiper is 
> never removed for the old node.  


[jira] [Created] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)

Gossip state is not removed after a new IP takes over a token
-

 Key: CASSANDRA-3071
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3071
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.7.9, 0.8.5
 Attachments: 3071.txt

When a new node takes over a token, the endpoint state in the gossiper is never 
removed for the old node.  


[jira] [Updated] (CASSANDRA-957) convenience workflow for replacing dead node

2011-08-23 Thread Vijay (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-957:


Attachment: 0003-making-bootstrap-sleep-longer.patch
0002-upport-for-hints-on-token-v3.patch
0001-support-for-replace-token-v3.patch

Rebased with a better way to look for operator error.

> convenience workflow for replacing dead node
> 
>
> Key: CASSANDRA-957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-957
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core, Tools
>Affects Versions: 0.8.2
>Reporter: Jonathan Ellis
>Assignee: Vijay
> Fix For: 1.0
>
> Attachments: 0001-Support-Token-Replace.patch, 
> 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 
> 0001-Support-token-replace.patch, 0001-support-for-replace-token-v3.patch, 
> 0002-Do-not-include-local-node-when-computing-workMap.patch, 
> 0002-Rework-Hints-to-be-on-token.patch, 
> 0002-Rework-Hints-to-be-on-token.patch, 
> 0002-upport-for-hints-on-token-v3.patch, 
> 0003-Make-HintedHandoff-More-reliable.patch, 
> 0003-Make-hints-More-reliable.patch, 0003-making-bootstrap-sleep-longer.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Replacing a dead node with a new one is a common operation, but "nodetool 
> removetoken" followed by bootstrap is inefficient (re-replicating data first 
> to the remaining nodes, then to the new one) and manually bootstrapping to a 
> token "just less than" the old one's, followed by "nodetool removetoken" is 
> slightly painful and prone to manual errors.
> First question: how would you expose this in our tool ecosystem?  It needs to 
> be a startup-time option to the new node, so it can't be nodetool, and 
> messing with the config xml definitely takes the "convenience" out.  A 
> one-off -DreplaceToken=XXY argument?
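A minimal sketch of how a one-off startup-time option like the one floated above could be read on the new node. Only the property name `replaceToken` comes from the question in the description; the class and method names are hypothetical, and this is not a claim about how Cassandra ended up implementing it.

```java
import java.util.Optional;

/** Hypothetical helper: reads a startup-time token override from a JVM system property. */
final class ReplaceTokenOption
{
    // Property name taken from the "-DreplaceToken=XXY" suggestion; everything else is assumed.
    static final String PROPERTY = "replaceToken";

    /** Returns the token passed via -DreplaceToken=..., or empty if the option is absent or blank. */
    static Optional<String> read()
    {
        String raw = System.getProperty(PROPERTY);
        if (raw == null || raw.trim().isEmpty())
            return Optional.empty();
        return Optional.of(raw.trim());
    }

    public static void main(String[] args)
    {
        // Run as: java -DreplaceToken=12345 ReplaceTokenOption
        Optional<String> token = read();
        System.out.println(token.isPresent()
                           ? "Replacing dead node at token " + token.get()
                           : "Normal bootstrap (no replaceToken given)");
    }
}
```

Because the value is a JVM system property, it is available before any node state is loaded and needs no change to the config file, which is the "convenience" the description asks for.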


[jira] [Commented] (CASSANDRA-2995) Making Storage Engine Pluggable

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089583#comment-13089583
 ] 

Jonathan Ellis commented on CASSANDRA-2995:
---

I understand the "wouldn't it be cool" factor but let's be clear: Cassandra is 
not a research-paper creation engine.  Pluggability here improves generality at 
the cost of a substantial increase of implementation complexity, QA work, and 
cognitive burden on users.  This is a bad direction for the project.

> Making Storage Engine Pluggable
> ---
>
> Key: CASSANDRA-2995
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2995
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.2
>Reporter: Muga Nishizawa
>
> Will you design and implement Cassandra's storage engine API like MyCassandra?
> MyCassandra provides an extensible architecture for plugging other storage 
> engines, such as MySQL, into Cassandra.
> https://github.com/sunsuk7tp/MyCassandra/
>   
> It could be advantageous for Cassandra to make the storage engine pluggable.  
> This could allow Cassandra to 
> - deal with potential use cases where maybe the current sstables are not the 
> best fit
> - allow several types of internal storage formats (at the same time) 
> optimized for different data types
> - allow easier experiments and research on new storage formats (encourage 
> research institutions to do strange things with Cassandra)
> - there could also be potential advantages from better isolation of the data 
> engine, in terms of lower risk of data corruption if other parts of Cassandra 
> change


[jira] [Updated] (CASSANDRA-2708) memory leak in CompactionManager's estimatedCompactions

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2708:
--

Fix Version/s: 0.8.5

> memory leak in CompactionManager's estimatedCompactions
> ---
>
> Key: CASSANDRA-2708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2708
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.7.0
>Reporter: Dan LaRocque
>Assignee: Dan LaRocque
>Priority: Minor
> Fix For: 0.7.9, 0.8.5
>
> Attachments: cassandra-0.7-2708.txt
>
>
> CompactionManager's estimatedCompactions map seems to hold all or most 
> ColumnFamilyStores in the system as keys.  Keys are never removed from 
> estimatedCompactions.
> I have a project that embeds Cassandra as a storage backend.  Some of my 
> integration tests create and drop a single keyspace and pair of column 
> families 100 or 150 times in one JVM.  These tests always OOM'd.  
> Loading some near-death heap dumps in MAT suggested CompactionManager's 
> estimatedCompactions held over 80% of total heap via its ColumnFamilyStore 
> keys.  estimatedCompactions had the only inbound reference to these CFSs, and 
> the CFSs themselves had invalid = true.
> As a workaround, I changed estimatedCompactions to a WeakReference-keyed map 
> (using Guava MapMaker).  My integration tests no longer OOM.
> I'm generally unfamiliar with Cassandra's guts.  I don't know whether weak 
> referencing the keys of estimatedCompactions is correct (or ideal).  But, 
> that did seem to confirm my guess that retained references to dead CFSs in 
> estimatedCompactions were swamping my heap after lots of 
> Keyspace+ColumnFamily drops.
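The leak and the workaround described above can be illustrated with a small sketch. It uses `java.util.WeakHashMap` from the JDK as a stand-in for the Guava MapMaker map mentioned in the report (a swap made to keep the example dependency-free; WeakHashMap compares keys with equals(), whereas weak-keyed MapMaker maps compare by identity). `FakeStore` and `WeakKeyMapSketch` are hypothetical names, not Cassandra classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical stand-in for ColumnFamilyStore; not Cassandra's class. */
final class FakeStore
{
    final String name;
    FakeStore(String name) { this.name = name; }
}

final class WeakKeyMapSketch
{
    /** Simulates n create/drop cycles: each store is recorded in the map, then all other references are dropped. */
    static int churn(Map<FakeStore, Integer> estimates, int n)
    {
        for (int i = 0; i < n; i++)
        {
            FakeStore store = new FakeStore("cf" + i);
            estimates.put(store, i);
            // "drop": store goes out of scope here; only the map key still refers to it
        }
        return estimates.size();
    }

    public static void main(String[] args)
    {
        // Strong keys: every dropped store stays reachable through the map.
        int strong = churn(new HashMap<FakeStore, Integer>(), 150);

        // Weak keys: entries become collectable once nothing else references the key.
        Map<FakeStore, Integer> weakMap = new WeakHashMap<FakeStore, Integer>();
        churn(weakMap, 150);
        System.gc(); // only a hint to the JVM; how many weak entries survive is not guaranteed

        System.out.println("strong-keyed entries retained: " + strong);
        System.out.println("weak-keyed entries retained:   " + weakMap.size());
    }
}
```

Weak keys make the leak go away without changing who is responsible for cleanup; removing the entry explicitly when a store is dropped would be the other obvious option, and the report itself leaves open which is more appropriate.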


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089588#comment-13089588
 ] 

Jonathan Ellis commented on CASSANDRA-3071:
---

Is it possible for this to remove an endpoint that it shouldn't?

E.g., X has token T

X moves to token U but node N was down

N comes back up, thinks T and U are both owned by X

node Y takes token T

we remove X from gossip


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089594#comment-13089594
 ] 

Brandon Williams commented on CASSANDRA-3071:
-

bq. N comes back up, thinks T and U are both owned by X

I don't think this can happen.  When N starts up, it will load the persisted 
tokens, BUT they won't be associated with IPs.  It can only learn that U is 
owned by X via gossip, and T will be down until it learns about Y.


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089601#comment-13089601
 ] 

Jonathan Ellis commented on CASSANDRA-3071:
---

What behavior does this fix?  Just gossip trying to reach the old node?


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089604#comment-13089604
 ] 

Brandon Williams commented on CASSANDRA-3071:
-

I think it solves this: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Completely-removing-a-node-from-the-cluster-td6705079.html
Jeremy also reported a problem with a large number of hints that I think this 
solves, since SP.shouldHint is directly affected by it.


[jira] [Updated] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-3071:


Attachment: 3071-v2.txt

v2 adds more protection around shouldHint by checking that the endpoint is a 
member.  

> Gossip state is not removed after a new IP takes over a token
> -
>
> Key: CASSANDRA-3071
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3071
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.7.9, 0.8.5
>
> Attachments: 3071-v2.txt, 3071.txt
>
>
> When a new node takes over a token, the endpoint state in the gossiper is 
> never removed for the old node.  


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089613#comment-13089613
 ] 

Jonathan Ellis commented on CASSANDRA-3071:
---

If the node doesn't have a token, it doesn't matter if it's a gossip member, it 
won't be part of getWriteEndpoints and won't be hinted anyway.
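The argument can be sketched with a toy model: hint candidates are drawn only from the endpoints that own a token for the write, so an endpoint that is merely known to gossip never becomes a hint target. The class and method names below are hypothetical and only loosely echo getWriteEndpoints and shouldHint; this is not Cassandra code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model separating token ownership from gossip membership; names are assumptions. */
final class HintTargetsSketch
{
    /** Endpoints that would receive a hint: write endpoints (token owners) that are currently down. */
    static List<String> hintTargets(List<String> writeEndpoints, Set<String> liveEndpoints)
    {
        List<String> targets = new ArrayList<String>();
        for (String endpoint : writeEndpoints)
            if (!liveEndpoints.contains(endpoint))
                targets.add(endpoint);
        return targets;
    }

    public static void main(String[] args)
    {
        // X lost its token to Y but lingers in gossip; it is down and owns nothing.
        List<String> writeEndpoints = Arrays.asList("Y", "Z");      // token owners for this write
        Set<String> live = new HashSet<String>(Arrays.asList("Y")); // Z and X are down
        System.out.println("hint targets: " + hintTargets(writeEndpoints, live));
    }
}
```

Gossip membership does not appear in the function at all: X can be a gossip member or not, and the result is the same because X is never among the write endpoints.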


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089619#comment-13089619
 ] 

Jonathan Ellis commented on CASSANDRA-3071:
---

+1 on v1, -1 on conflating gossip membership w/ token ownership as in v2


svn commit: r1160825 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

2011-08-23 Thread brandonwilliams

Author: brandonwilliams
Date: Tue Aug 23 18:00:34 2011
New Revision: 1160825

URL: http://svn.apache.org/viewvc?rev=1160825&view=rev
Log:
Remove gossip state when a new IP takes over a token.
Patch by brandonwilliams, reviewed by jbellis for CASSANDRA-3071

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java?rev=1160825&r1=1160824&r2=1160825&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
 Tue Aug 23 18:00:34 2011
@@ -745,6 +745,7 @@ public class StorageService implements I
 logger_.info(String.format("Nodes %s and %s have the same token %s.  %s is the new owner",
 endpoint, currentOwner, token, endpoint));
 tokenMetadata_.updateNormalToken(token, endpoint);
+Gossiper.instance.removeEndpoint(currentOwner);
 if (!isClientMode)
 SystemTable.updateToken(endpoint, token);
 }
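
To make the effect of the one-line change in this hunk concrete, here is a toy model of the behavior. `tokenToEndpoint` and `gossipState` are simplified stand-ins for tokenMetadata_ and the Gossiper's endpoint state, and every class and method name is hypothetical. The sketch also assumes the newer endpoint always wins the token, which covers only the branch shown in the diff.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of token ownership plus gossip endpoint state; not Cassandra's classes. */
final class TokenTakeoverSketch
{
    private final Map<String, String> tokenToEndpoint = new HashMap<String, String>();
    private final Set<String> gossipState = new HashSet<String>();

    /** Called when an endpoint announces that it owns a token. */
    void handleStateNormal(String endpoint, String token)
    {
        gossipState.add(endpoint);
        String currentOwner = tokenToEndpoint.get(token);
        tokenToEndpoint.put(token, endpoint);
        if (currentOwner != null && !currentOwner.equals(endpoint))
        {
            // Mirrors the added line: forget the gossip state of the displaced owner.
            gossipState.remove(currentOwner);
        }
    }

    boolean knowsEndpoint(String endpoint) { return gossipState.contains(endpoint); }

    String ownerOf(String token) { return tokenToEndpoint.get(token); }

    public static void main(String[] args)
    {
        TokenTakeoverSketch node = new TokenTakeoverSketch();
        node.handleStateNormal("10.0.0.1", "T"); // old node owns T
        node.handleStateNormal("10.0.0.2", "T"); // new IP takes over T
        System.out.println("owner of T: " + node.ownerOf("T"));
        System.out.println("old node still in gossip state: " + node.knowsEndpoint("10.0.0.1"));
    }
}
```

Without the removal step, the old address would stay in `gossipState` indefinitely even though it no longer owns any token, which is the situation the issue title describes.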

svn commit: r1160827 - in /cassandra/branches/cassandra-0.8: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-08-23 Thread brandonwilliams

Author: brandonwilliams
Date: Tue Aug 23 18:08:51 2011
New Revision: 1160827

URL: http://svn.apache.org/viewvc?rev=1160827&view=rev
Log:
Merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/contrib/   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:08:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1160444
+/cassandra/branches/cassandra-0.7:1026516-1160444,1160825
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:08:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1160444
+/cassandra/branches/cassandra-0.7/contrib:1026516-1160444,1160825
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:08:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444,1160825
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:08:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444,1160825
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:08:51 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1026516-1160444
+/cassandra/branches/cassandra-0.7/i

svn commit: r1160828 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-08-23 Thread brandonwilliams

Author: brandonwilliams
Date: Tue Aug 23 18:10:28 2011
New Revision: 1160828

URL: http://svn.apache.org/viewvc?rev=1160828&view=rev
Log:
Merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:10:28 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1160444
+/cassandra/branches/cassandra-0.7:1026516-1160444,1160825
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1160459
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1160459,1160827
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:10:28 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1160444
+/cassandra/branches/cassandra-0.7/contrib:1026516-1160444,1160825
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1160459
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1160459,1160827
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:10:28 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1160444,1160825
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1160459
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1160459,1160827
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug 23 18:10:28 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1160444,1160825
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125019-1160459
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1090934-1125013,1125019-1160459,1160827
 
/cassandra/branch

[jira] [Resolved] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-3071.
-

Resolution: Fixed

bq. If the node doesn't have a token, it doesn't matter if it's a gossip 
member, it won't be part of getWriteEndpoints and won't be hinted anyway.

I can't see a way for that either, but I'm still suspicious of the link to 
shouldHint.

bq. +1 on v1, -1 on conflating gossip membership w/ token ownership as in v2

Fair enough, committed v1.


[jira] [Commented] (CASSANDRA-3071) Gossip state is not removed after a new IP takes over a token

2011-08-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089648#comment-13089648
 ] 

Hudson commented on CASSANDRA-3071:
---

Integrated in Cassandra-0.7 #541 (See 
[https://builds.apache.org/job/Cassandra-0.7/541/])
Remove gossip state when a new IP takes over a token.
Patch by brandonwilliams, reviewed by jbellis for CASSANDRA-3071

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1160825
Files : 
* 
/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java



svn commit: r1160855 - in /cassandra/branches: cassandra-0.7/CHANGES.txt cassandra-0.8/CHANGES.txt

2011-08-23 Thread brandonwilliams

Author: brandonwilliams
Date: Tue Aug 23 19:10:22 2011
New Revision: 1160855

URL: http://svn.apache.org/viewvc?rev=1160855&view=rev
Log:
Update CHANGES

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt
cassandra/branches/cassandra-0.8/CHANGES.txt

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1160855&r1=1160854&r2=1160855&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Tue Aug 23 19:10:22 2011
@@ -7,6 +7,7 @@
 has not (CASSANDRA-2388)
  * avoid retaining references to dropped CFS objects in 
CompactionManager.estimatedCompactions (CASSANDRA-2708)
+ * remove gossip state when a new IP takes over a token (CASSANDRA-3071)
 
 
 0.7.8

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1160855&r1=1160854&r2=1160855&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug 23 19:10:22 2011
@@ -29,6 +29,7 @@
CompactionManager.estimatedCompactions (CASSANDRA-2708)
  * expose rpc timeouts per host in MessagingServiceMBean (CASSANDRA-2941)
  * avoid including cwd in classpath for deb and rpm packages (CASSANDRA-2881)
+ * remove gossip state when a new IP takes over a token (CASSANDRA-3071)
 
 
 0.8.4

[jira] [Reopened] (CASSANDRA-2868) Native Memory Leak

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-2868:
-


Reopening to backport to 0.7

> Native Memory Leak
> --
>
> Key: CASSANDRA-2868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Daniel Doubleday
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.8.5
>
> Attachments: 2868-v1.txt, 2868-v2.txt, 2868-v3.txt, 48hour_RES.png, 
> low-load-36-hours-initial-results.png
>
>
> We have memory issues with long-running servers. These have been confirmed by 
> several users on the user list, which is why I am reporting this.
> The memory consumption of the Cassandra Java process increases steadily until 
> it is killed by the OS because of OOM (with no swap).
> Our server is started with -Xmx3000M and has been running for around 23 days.
> pmap -x shows
> Total SST: 1961616 (mem mapped data and index files)
> Anon  RSS: 6499640
> Total RSS: 8478376
> This shows that > 3G are 'overallocated'.
> We will use BRAF on one of our less important nodes to check whether it is 
> related to mmap and report back.
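The "more than 3G" figure follows from the numbers above, assuming the pmap values are in kilobytes (the unit `pmap -x` uses for its size columns). The sketch below only redoes that arithmetic; the class and method names are hypothetical.

```java
/** Worked arithmetic for the pmap figures quoted in the report; assumes the values are in KiB. */
final class OverallocationSketch
{
    static final long KIB_PER_MIB = 1024;

    /** Anonymous RSS beyond the configured maximum heap, in whole MiB. */
    static long excessMiB(long anonRssKiB, long maxHeapMiB)
    {
        return anonRssKiB / KIB_PER_MIB - maxHeapMiB;
    }

    public static void main(String[] args)
    {
        long anonRssKiB = 6499640L; // "Anon  RSS" from the report
        long maxHeapMiB = 3000L;    // -Xmx3000M
        System.out.println("anonymous RSS beyond max heap: "
                           + excessMiB(anonRssKiB, maxHeapMiB) + " MiB");
    }
}
```

Some anonymous memory outside the Java heap is normal for any JVM (thread stacks, code cache and similar), so not all of the excess is necessarily leaked; the point is only that it is far larger than -Xmx would suggest.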


[jira] [Updated] (CASSANDRA-2868) Native Memory Leak

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2868:


Fix Version/s: 0.7.9

> Native Memory Leak
> --
>
> Key: CASSANDRA-2868
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Daniel Doubleday
>Assignee: Brandon Williams
>Priority: Minor
> Fix For: 0.7.9, 0.8.5
>
> Attachments: 2868-v1.txt, 2868-v2.txt, 2868-v3.txt, 48hour_RES.png, 
> low-load-36-hours-initial-results.png
>
>
> We have memory issues with long running servers. These have been confirmed by 
> several users in the user list. That's why I report.
> The memory consumption of the cassandra java process increases steadily until 
> it's killed by the os because of oom (with no swap)
> Our server is started with -Xmx3000M and running for around 23 days.
> pmap -x shows
> Total SST: 1961616 (mem mapped data and index files)
> Anon  RSS: 6499640
> Total RSS: 8478376
> This shows that > 3G are 'overallocated'.
> We will use BRAF on one of our less important nodes to check whether it is 
> related to mmap and report back.


[jira] [Commented] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2011-08-23 Thread Mikko Koppanen (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089685#comment-13089685
 ] 

Mikko Koppanen commented on CASSANDRA-3025:
---

Hi,

added test for sparse columns:

https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/017-sparsecolumns.phpt

Improved column metadata:

https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/004-columnmeta.phpt

And added test for bigints:

https://github.com/mkoppanen/php-pdo_cassandra/blob/master/tests/018-long.phpt

A couple of questions arising from using the Thrift client more:

a) is there an easier way to get column metadata for a column than this:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/cassandra_statement.cpp#L106

b) Is there a way to query the currently "in use" keyspace? Currently I added 
the following but it doesn't seem very clean:
https://github.com/mkoppanen/php-pdo_cassandra/blob/master/cassandra_driver.cpp#L317



> PHP/PDO driver for Cassandra CQL
> 
>
> Key: CASSANDRA-3025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3025
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API
>Reporter: Mikko Koppanen
>  Labels: php
> Attachments: pdo_cassandra-0.1.0.tgz, pdo_cassandra-0.1.1.tgz, 
> php_test_results_20110818_2317.txt
>
>
> Hello,
> attached is the initial version of the PDO driver for the Cassandra CQL language. 
> This is a native PHP extension written in what I would call a combination of 
> C and C++, due to PHP being C. The Thrift API used is the C++ one.
> The API looks roughly like the following:
> {code}
>  $db = new PDO('cassandra:host=127.0.0.1;port=9160');
> $db->exec ("CREATE KEYSPACE mytest with strategy_class = 'SimpleStrategy' and 
> strategy_options:replication_factor=1;");
> $db->exec ("USE mytest");
> $db->exec ("CREATE COLUMNFAMILY users (
>   my_key varchar PRIMARY KEY,
>   full_name varchar );");
>   
> $stmt = $db->prepare ("INSERT INTO users (my_key, full_name) VALUES (:key, 
> :full_name);");
> $stmt->execute (array (':key' => 'mikko', ':full_name' => 'Mikko K' ));
> {code}
> Currently prepared statements are emulated on the client side, but I 
> understand that there is a plan to add prepared statements to the Cassandra CQL 
> API as well. I will add this feature to the extension as soon as they are 
> implemented.
> Additional documentation can be found on GitHub at 
> https://github.com/mkoppanen/php-pdo_cassandra, in the form of a rendered 
> Markdown file. Tests are currently not included in the package file; for now they 
> can be found on GitHub as well.
> I have created documentation in DocBook format as well, but have not yet 
> rendered it.
> Comments and feedback are welcome.
> Thanks,
> Mikko


[jira] [Commented] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089700#comment-13089700
 ] 

Jonathan Ellis commented on CASSANDRA-3025:
---

bq. Improved column metadata:

That's good to have (is there no place for "table" level metadata though?) but 
by "one demonstrating introspection of default_validation_class" I mean, I'd 
like to see it turning int data into int results, for instance.  But if you're 
planning to get to that later, that's fine.

bq. added test for bigints

- Suggest not calling the CF "standardlong" when it's really containing bigints
- should test an actual "big" int (> 64bit)

bq. is there an easier way to get column metadata

No.  The approach Java and Python take is to get the metadata once when the 
connection is opened and cache it.  This isn't entirely without its own 
problems (CASSANDRA-2734) but works reasonably well.
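
The fetch-once-and-cache approach can be sketched roughly as follows. This is only an illustration written against a current JDK, not the actual Java or Python driver code: `MetadataCache`, its loader function and the map layout (column name to validator class) are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: cache per-column-family metadata on the connection so the
// schema is described once, not once per statement.
final class MetadataCache
{
    private final Map<String, Map<String, String>> byColumnFamily = new ConcurrentHashMap<>();
    private final Function<String, Map<String, String>> loader; // e.g. wraps a describe-keyspace call

    MetadataCache(Function<String, Map<String, String>> loader)
    {
        this.loader = loader;
    }

    // Returns column name -> validator class, invoking the loader at most once per column family.
    Map<String, String> columnsOf(String columnFamily)
    {
        return byColumnFamily.computeIfAbsent(columnFamily, loader);
    }

    // Drops everything cached, e.g. after a schema change made the entries stale.
    void invalidate()
    {
        byColumnFamily.clear();
    }
}
```

The staleness risk is the trade-off: a schema change made through another connection is not visible until `invalidate()` is called or the connection is reopened.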

See also CASSANDRA-3002.

bq. Is there a way to query the currently "in use" keyspace? Currently I added 
following but it doesn't seem very clean

That's pretty much what Java and Python do, too.


[jira] [Issue Comment Edited] (CASSANDRA-3025) PHP/PDO driver for Cassandra CQL

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089700#comment-13089700
 ] 

Jonathan Ellis edited comment on CASSANDRA-3025 at 8/23/11 7:48 PM:


bq. Improved column metadata:

That's good to have (is there no place for "table" level metadata though?) but 
by "one demonstrating introspection of default_validation_class" I mean, I'd 
like to see it turning int data into int results, for instance.  But if you're 
planning to get to that later, that's fine.

bq. added test for bigints

- Suggest not calling the CF "standardlong" when it's really containing bigints
- should test an actual "big" int (> 64bit)

bq. is there an easier way to get column metadata

No.  The approach Java and Python take is to get the metadata once when the 
connection is opened and cache it.  This isn't entirely without its own 
problems (CASSANDRA-2734) but works reasonably well.

See also CASSANDRA-2477.

bq. Is there a way to query the currently "in use" keyspace? Currently I added 
following but it doesn't seem very clean

That's pretty much what Java and Python do, too.

  was (Author: jbellis):
bq. Improved column metadata:

That's good to have (is there no place for "table" level metadata though?) but 
by "one demonstrating introspection of default_validation_class" I mean, I'd 
like to see it turning int data into int results, for instance.  But if you're 
planning to get to that later, that's fine.

bq. added test for bigints

- Suggest not calling the CF "standardlong" when it's really containing bigints
- should test an actual "big" int (> 64bit)

bq. is there an easier way to get column metadata

No.  The approach Java and Python take is to get the metadata once when the 
connection is opened and cache it.  This isn't entirely without its own 
problems (CASSANDRA-2734) but works reasonably well.

See also CASSANDRA-3002.

bq. Is there a way to query the currently "in use" keyspace? Currently I added 
following but it doesn't seem very clean

That's pretty much what Java and Python do, too.
  

svn commit: r1160879 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/service/GCInspector.java

2011-08-23 Thread brandonwilliams

Author: brandonwilliams
Date: Tue Aug 23 20:05:38 2011
New Revision: 1160879

URL: http://svn.apache.org/viewvc?rev=1160879&view=rev
Log:
work around native memory leak in com.sun.management.GarbageCollectorMXBean
patch by brandonwilliams and jbellis for CASSANDRA-2868

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1160879&r1=1160878&r2=1160879&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Tue Aug 23 20:05:38 2011
@@ -8,6 +8,8 @@
  * avoid retaining references to dropped CFS objects in 
CompactionManager.estimatedCompactions (CASSANDRA-2708)
  * remove gossip state when a new IP takes over a token (CASSANDRA-3071)
+ * work around native memory leak in com.sun.management.GarbageCollectorMXBean
+(CASSANDRA-2868)
 
 
 0.7.8

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java?rev=1160879&r1=1160878&r2=1160879&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java
 Tue Aug 23 20:05:38 2011
@@ -20,11 +20,13 @@ package org.apache.cassandra.service;
  * 
  */
 
+import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
+import java.lang.management.MemoryMXBean;
 import java.lang.management.MemoryUsage;
-import java.lang.reflect.InvocationTargetException;
-import java.lang.reflect.Method;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
 import java.util.concurrent.TimeUnit;
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
@@ -45,32 +47,22 @@ public class GCInspector
 public static final GCInspector instance = new GCInspector();
 
 private HashMap gctimes = new HashMap();
+private HashMap gccounts = new HashMap();
+
+List beans = new 
ArrayList();
+MemoryMXBean membean = ManagementFactory.getMemoryMXBean();
 
-List beans = new ArrayList(); // these are instances of 
com.sun.management.GarbageCollectorMXBean
 private volatile boolean cacheSizesReduced;
 
 public GCInspector()
 {
-// we only want this class to do its thing on sun jdks, or when the 
sun classes are present.
-Class gcBeanClass = null;
-try
-{
-gcBeanClass = 
Class.forName("com.sun.management.GarbageCollectorMXBean");
-Class.forName("com.sun.management.GcInfo");
-}
-catch (ClassNotFoundException ex)
-{
-// this happens when using a non-sun jdk.
-logger.warn("Cannot load sun GC monitoring classes. GCInspector is 
disabled.");
-}
-
 MBeanServer server = ManagementFactory.getPlatformMBeanServer();
 try
 {
 ObjectName gcName = new 
ObjectName(ManagementFactory.GARBAGE_COLLECTOR_MXBEAN_DOMAIN_TYPE + ",*");
 for (ObjectName name : server.queryNames(gcName, null))
 {
-Object gc = ManagementFactory.newPlatformMXBeanProxy(server, 
name.getCanonicalName(), gcBeanClass);
+GarbageCollectorMXBean gc = 
ManagementFactory.newPlatformMXBeanProxy(server, name.getCanonicalName(), 
GarbageCollectorMXBean.class);
 beans.add(gc);
 }
 }
@@ -97,43 +89,42 @@ public class GCInspector
 
 private void logGCResults()
 {
-for (Object gc : beans)
+for (GarbageCollectorMXBean gc : beans)
 {
-SunGcWrapper gcw = new SunGcWrapper(gc);
-if (gcw.isLastGcInfoNull())
+Long previousTotal = gctimes.get(gc.getName());
+Long total = gc.getCollectionTime();
+if (previousTotal == null)
+previousTotal = 0L;
+if (previousTotal.equals(total))
 continue;
-
-Long previous = gctimes.get(gcw.getName());
-if (previous != null && previous.longValue() == 
gcw.getCollectionTime().longValue())
-continue;
-gctimes.put(gcw.getName(), gcw.getCollectionTime());
-
-long previousMemoryUsed = 0;
-long memoryUsed = 0;
-long memoryMax = 0;
-for (Map.Entry entry : 
gcw.getMemoryUsageBeforeGc().entrySet())
-{
-previousMemoryUsed += entry.getValue(
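
The change above swaps reflection on the com.sun.management classes for the standard java.lang.management.GarbageCollectorMXBean and compares cumulative totals between runs. A standalone sketch of that delta-tracking idea follows; `GcDeltaTracker` is a hypothetical name and this is not the committed GCInspector code.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: report GC time per collector since the previous poll,
// using only the standard java.lang.management API.
final class GcDeltaTracker
{
    private final Map<String, Long> lastTotals = new HashMap<>();

    // Records the new cumulative total and returns the time (ms) added since the
    // previous call for this collector; 0 when nothing changed.
    long delta(String collector, long cumulativeMillis)
    {
        Long previous = lastTotals.put(collector, cumulativeMillis);
        return cumulativeMillis - (previous == null ? 0L : previous);
    }

    // Polls every collector the JVM exposes and prints those that did work.
    void poll()
    {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
        {
            long spent = delta(gc.getName(), gc.getCollectionTime());
            if (spent > 0)
                System.out.println(gc.getName() + ": " + spent + " ms in GC since last poll");
        }
    }
}
```

Calling `poll()` from a scheduled task gives per-interval GC times without touching the GcInfo objects of the Sun-specific bean named in the commit log.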

svn commit: r1160878 - in /cassandra/branches/cassandra-0.7: build.xml debian/changelog

2011-08-23 Thread eevans

Author: eevans
Date: Tue Aug 23 20:05:28 2011
New Revision: 1160878

URL: http://svn.apache.org/viewvc?rev=1160878&view=rev
Log:
update versioning for 0.7.9 release

Modified:
cassandra/branches/cassandra-0.7/build.xml
cassandra/branches/cassandra-0.7/debian/changelog

Modified: cassandra/branches/cassandra-0.7/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/build.xml?rev=1160878&r1=1160877&r2=1160878&view=diff
==
--- cassandra/branches/cassandra-0.7/build.xml (original)
+++ cassandra/branches/cassandra-0.7/build.xml Tue Aug 23 20:05:28 2011
@@ -24,7 +24,7 @@
 
 
 
-
+
 
 http://svn.apache.org/repos/asf/${scm.default.path}"/>
 https://svn.apache.org/repos/asf/${scm.default.path}"/>

Modified: cassandra/branches/cassandra-0.7/debian/changelog
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/debian/changelog?rev=1160878&r1=1160877&r2=1160878&view=diff
==
--- cassandra/branches/cassandra-0.7/debian/changelog (original)
+++ cassandra/branches/cassandra-0.7/debian/changelog Tue Aug 23 20:05:28 2011
@@ -1,3 +1,9 @@
+cassandra (0.7.9) unstable; urgency=low
+
+  * New stable point release
+
+ -- Eric Evans   Tue, 23 Aug 2011 14:53:39 -0500
+
 cassandra (0.7.8) unstable; urgency=low
 
   * New stable point release

[jira] [Resolved] (CASSANDRA-2868) Native Memory Leak

2011-08-23 Thread Brandon Williams (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-2868.
-

Resolution: Fixed

Committed to 0.7 in r1160879


[jira] [Commented] (CASSANDRA-2868) Native Memory Leak

2011-08-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089721#comment-13089721
 ] 

Hudson commented on CASSANDRA-2868:
---

Integrated in Cassandra-0.7 #543 (See 
[https://builds.apache.org/job/Cassandra-0.7/543/])
work around native memory leak in com.sun.management.GarbageCollectorMXBean
patch by brandonwilliams and jbellis for CASSANDRA-2868

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1160879
Files : 
* /cassandra/branches/cassandra-0.7/CHANGES.txt
* 
/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/GCInspector.java



[jira] [Commented] (CASSANDRA-2802) Enable encryption for data across the DC only.

2011-08-23 Thread Brandon Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089747#comment-13089747
 ] 

Brandon Williams commented on CASSANDRA-2802:
-

Encryption is typo'd as 'encription' in many places in the first patch.  In the 
second, I'd like to see the unknown rack/dc strings made class constants 
instead.

> Enable encryption for data across the DC only.
> --
>
> Key: CASSANDRA-2802
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2802
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.0
> Environment: JVM
>Reporter: Vijay
>Assignee: Vijay
>Priority: Minor
> Fix For: 0.8.5
>
> Attachments: 0001-2802-Commiting-New-Port-For-SSL.patch, 
> 0002-2802-Changes-to-Snitch-to-avoid-NPE.patch
>
>
> Make DC level Encryption option
> 1) Modify EncryptionOptions to add inter_dc option.
> 2) Modify OutboundTCPConnection.connect() to check if it is in the same DC 
> and if the encryption option is enabled.
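
The check described in step 2 could look roughly like this. It is a sketch only: the `Mode` enum, its values and the method name are illustrative assumptions and are not taken from the attached patches.

```java
// Hypothetical sketch of the per-connection decision: encrypt everything,
// nothing, or only traffic that crosses a datacenter boundary.
final class InterDcEncryption
{
    enum Mode { NONE, INTER_DC, ALL }

    // True when the connection to the remote node should use SSL.
    static boolean shouldEncrypt(Mode mode, String localDc, String remoteDc)
    {
        switch (mode)
        {
            case ALL:
                return true;
            case INTER_DC:
                return !localDc.equals(remoteDc); // only traffic leaving the local DC
            default:
                return false;
        }
    }
}
```

In this sketch the datacenter names would come from the snitch, which is presumably why the second patch has to handle endpoints whose rack or DC is unknown.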


[jira] [Created] (CASSANDRA-3072) Add a wide-row slice system test

2011-08-23 Thread Stu Hood (JIRA)

Add a wide-row slice system test


 Key: CASSANDRA-3072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3072
 Project: Cassandra
  Issue Type: Test
  Components: Tests
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Minor
 Fix For: 1.0





[jira] [Updated] (CASSANDRA-3072) Add a wide-row slice system test

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3072:


Attachment: 0001-CASSANDRA-3072-Add-a-wide-row-slice-system-test.txt

Adds a wide-row slice system test. A few constant factors affect the runtime 
of the test; as posted, it takes just under 2 minutes to run on my laptop.

> Add a wide-row slice system test
> 
>
> Key: CASSANDRA-3072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3072
> Project: Cassandra
>  Issue Type: Test
>  Components: Tests
>Reporter: Stu Hood
>Assignee: Stu Hood
>Priority: Minor
> Fix For: 1.0
>
> Attachments: 0001-CASSANDRA-3072-Add-a-wide-row-slice-system-test.txt
>
>



[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1608:
--

Attachment: 1608-v4.txt

v4 attached.

Manifest
===

- I noticed that Manifest.generations and lastCompactedKeys could be simplified 
to arrays if we are willing to assume that no node will have more than a PB or 
so of data in a single CF.  Which feels reasonable to me even with capacity 
expanding as fast as it is. :)

- What is the 1.25 supposed to be doing here?
{code}
// skip newlevel if the resulting sstables exceed newlevel threshold
if (maxBytesForLevel(newLevel) < SSTableReader.getTotalBytes(added)
&& SSTableReader.getTotalBytes(getLevel(newLevel + 1)) == 0 * 1.25)
{code}

- Why the "all on the same level" special case?  Is this just saying "L0 
compactions must go into L1?"
{noformat}
// the level for the added sstables is the max of the removed ones,
// plus one if the removed were all on the same level
{noformat}

- Removed this.  If L0 is large, it doesn't necessarily follow that L1 is large 
too.  I don't see a good reason to second-guess the scoring here.
{code}
if (candidates.size() > 32 && bestLevel == 0)
{
candidates = getCandidatesFor(1);
}
{code}

- redid L0 candidate selection to follow the LevelDB algorithm (pick one L0, 
add other L0s and L1s that overlap).  This means that if we're doing sequential 
writes we don't do "extra" work compacting non-overlapping L0s unnecessarily.  
(A niche use to be sure given our emphasis on RP but it's not a lot of code.)

- L0 only gets two sstables before it's over capacity?  Are we still allowing L0 
sstables to be large?  If so, it's not even two.

- "Exposing number of SSTables in L0 as a JMX property probably isn't a bad 
idea."

- it's not correct for the create/load code to assume that the first data 
directory stays constant across restarts -- it should check all directories 
when loading
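
The LevelDB-style L0 candidate selection mentioned in the list above (pick one L0, add the other L0s and the L1s that overlap it) can be sketched as follows. `Table` is a hypothetical stand-in for an sstable with numeric first/last keys, and widening the range as overlapping L0s are added is an assumption about the algorithm, not a description of the v4 patch.

```java
import java.util.ArrayList;
import java.util.List;

final class L0Candidates
{
    // Stand-in for an sstable: only its first and last key matter here.
    static final class Table
    {
        final String name;
        final long first;
        final long last;

        Table(String name, long first, long last)
        {
            this.name = name;
            this.first = first;
            this.last = last;
        }
    }

    // Starts from one L0 table, pulls in every L0 table overlapping the growing
    // key range, then every L1 table overlapping the final range.
    static List<Table> select(Table seed, List<Table> level0, List<Table> level1)
    {
        List<Table> chosen = new ArrayList<>();
        chosen.add(seed);
        long min = seed.first;
        long max = seed.last;

        boolean grew = true;
        while (grew) // repeat until no further L0 table overlaps the range
        {
            grew = false;
            for (Table t : level0)
            {
                if (!chosen.contains(t) && t.first <= max && t.last >= min)
                {
                    chosen.add(t);
                    min = Math.min(min, t.first);
                    max = Math.max(max, t.last);
                    grew = true;
                }
            }
        }

        for (Table t : level1)
            if (t.first <= max && t.last >= min)
                chosen.add(t);
        return chosen;
    }
}
```

With sequential writes the L0 tables do not overlap each other, so the loop adds nothing beyond the seed, which is the saving described above.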

CFS
===
- It's not immediately clear to me whether the TODOs in isKeyInRemainingSSTables are 
something I should be concerned about.
- Why do we need the reference mark/unmark now but not before?  Is this a bug 
fix independent of 1608?
- Are we losing a lot of cycles to markCurrentViewReferenced on the read path 
now that this is 1000s of sstables instead of 10s?

DataTracker
===
- followed todo's suggestion to move incrementallyBackup to another thread
- why do we use a LinkedList in buildIntervalTree when we know the size 
beforehand?
- suspect that it's going to be faster to use interval tree to prune the search 
space for CollationController.collectTimeOrderedData, then sort that subset by 
timestamp.  Which would simplify DataTracker by not having to keep a list of 
sstables around sorted-by-timestamp -- could get rid of that entirely in favor 
of the tree, I think.
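
The prune-then-sort idea in the last point could be sketched like this. The linear scan is only a stand-in for the interval tree lookup, and `Table` with its numeric keys and `maxTimestamp` field is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TimeOrderedPrune
{
    // Stand-in for an sstable: key range plus the newest timestamp it contains.
    static final class Table
    {
        final String name;
        final long first, last, maxTimestamp;

        Table(String name, long first, long last, long maxTimestamp)
        {
            this.name = name;
            this.first = first;
            this.last = last;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Keeps only the tables whose key range contains the key, newest first.
    static List<Table> candidatesFor(long key, List<Table> all)
    {
        List<Table> hits = new ArrayList<>();
        for (Table t : all) // stand-in for an interval tree search on the key
            if (t.first <= key && key <= t.last)
                hits.add(t);
        hits.sort(Comparator.comparingLong((Table t) -> t.maxTimestamp).reversed());
        return hits;
    }
}
```

Only the pruned subset is sorted, so no separate list sorted by timestamp has to be maintained alongside the tree.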

Compaction
==
- Did this code get moved somewhere else, so that a manual compaction request against a 
single sstable remains a no-op for SizeTiered?
{code}
if (toCompact.size() < 2)
{
    logger.info("Nothing to compact in " + cfs.getColumnFamilyName() + "." +
                "Use forceUserDefinedCompaction if you wish to force compaction of single sstables " +
                "(e.g. for tombstone collection)");
    return 0;
}
{code}



> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-22082011.txt, 1608-v2.txt, 1608-v4.txt
>
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
> thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the 
> moment, compaction is kicked off based on a write access pattern, not read 
> access pattern. In most cases, you want the opposite. You want to be able to 
> track how well each SSTable is performing in the system. If we were to keep 
> statistics in-memory of each SSTable, prioritize them based on most accessed, 
> and bloom filter hit/miss ratios, we could intelligently group sstables that 
> are being read most often and schedule them for compaction. We could also 
> schedule lower priority maintenance on SSTables that are not often accessed.
> I also propose we limit the size of each SSTable to a fixed size, which gives 
> us the ability to better utilize our bloom filters in a predictable manner. 
> At the moment after a certain size, the bloom filters become less reliable. 
> This would also allow us to group data most accessed. Currently the size of 
> an SST

[jira] [Updated] (CASSANDRA-2806) Expose gossip/FD info to JMX

2011-08-23 Thread Patricio Echague (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patricio Echague updated CASSANDRA-2806:


Attachment: CASSANDRA-2806-0.8-v4.patch

This patch replaces all previous ones, including the one for CHANGES.txt.

> Expose gossip/FD info to JMX
> 
>
> Key: CASSANDRA-2806
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2806
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Brandon Williams
>Assignee: Patricio Echague
>Priority: Minor
> Fix For: 0.8.5
>
> Attachments: CASSANDRA-2806-0.8-CHANGES.patch, 
> CASSANDRA-2806-0.8-v1.patch, CASSANDRA-2806-0.8-v2.patch, 
> CASSANDRA-2806-0.8-v3.patch, CASSANDRA-2806-0.8-v4.patch, screenshot-1.jpg
>
>



[jira] [Updated] (CASSANDRA-2398) Type specific compression

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2398:


Attachment: (was: 
0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt)

> Type specific compression
> -
>
> Key: CASSANDRA-2398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2398
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Stu Hood
>  Labels: compression
> Fix For: 1.0
>
> Attachments: 
> 0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt, 
> 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt, 
> compress-lzf-0.7.0.jar
>
>
> Cassandra has a lot of locations that are ripe for type specific compression. 
> A short list:
> Indexes
>  * Keys compressed as BytesType, which could default to LZO/LZMA
>  * Offsets (delta and varint encoding)
>  * Column names added by 2319
> Data
>  * Keys, columns, timestamps: see 
> http://wiki.apache.org/cassandra/FileFormatDesignDoc
> A basic interface for type specific compression could be as simple as:
> {code:java}
> public void compress(int version, final List from, DataOutput to) 
> throws IOException
> public void decompress(int version, DataInput from, List to) 
> throws IOException
> public void skip(int version, DataInput from) throws IOException
> {code} 
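
As an illustration of the "Offsets (delta and varint encoding)" item, here is a minimal sketch of such a codec. It follows the general shape of the interface above but leaves out the `version` parameter and `skip`, assumes non-decreasing offsets, and uses a hypothetical class name; it is not code from the attached patches.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.List;

// Hypothetical sketch: store each offset as the difference from the previous
// one, written as a variable-length integer (7 payload bits per byte).
final class DeltaVarint
{
    static void compress(List<Long> from, DataOutput to) throws IOException
    {
        writeVarint(from.size(), to);
        long previous = 0;
        for (long offset : from)
        {
            writeVarint(offset - previous, to); // small when offsets are close together
            previous = offset;
        }
    }

    static void decompress(DataInput from, List<Long> to) throws IOException
    {
        long count = readVarint(from);
        long previous = 0;
        for (long i = 0; i < count; i++)
        {
            previous += readVarint(from);
            to.add(previous);
        }
    }

    // Low 7 bits first; the high bit of each byte marks "more bytes follow".
    static void writeVarint(long value, DataOutput to) throws IOException
    {
        while ((value & ~0x7FL) != 0)
        {
            to.writeByte((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        to.writeByte((int) value);
    }

    static long readVarint(DataInput from) throws IOException
    {
        long value = 0;
        int shift = 0;
        byte b;
        do
        {
            b = from.readByte();
            value |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

Four offsets such as 0, 100, 100000 and 100300 take 8 bytes in this form (count included) instead of 32 bytes as raw longs.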


[jira] [Updated] (CASSANDRA-2398) Type specific compression

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2398:


Attachment: (was: 
0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt)

> Type specific compression
> -
>
> Key: CASSANDRA-2398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2398
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Stu Hood
>  Labels: compression
> Fix For: 1.0
>
> Attachments: 
> 0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt, 
> 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt, 
> compress-lzf-0.7.0.jar
>
>
> Cassandra has a lot of locations that are ripe for type specific compression. 
> A short list:
> Indexes
>  * Keys compressed as BytesType, which could default to LZO/LZMA
>  * Offsets (delta and varint encoding)
>  * Column names added by 2319
> Data
>  * Keys, columns, timestamps: see 
> http://wiki.apache.org/cassandra/FileFormatDesignDoc
> A basic interface for type specific compression could be as simple as:
> {code:java}
> public void compress(int version, final List from, DataOutput to) 
> throws IOException
> public void decompress(int version, DataInput from, List to) 
> throws IOException
> public void skip(int version, DataInput from) throws IOException
> {code} 


[jira] [Updated] (CASSANDRA-2398) Type specific compression

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2398:


Attachment: 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt
0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt

Rebased for trunk.

> Type specific compression
> -
>
> Key: CASSANDRA-2398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2398
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Stu Hood
>  Labels: compression
> Fix For: 1.0
>
> Attachments: 
> 0001-CASSANDRA-2398-Add-type-specific-compression-to-Abstra.txt, 
> 0002-CASSANDRA-2398-Type-specific-compression-for-counters.txt, 
> compress-lzf-0.7.0.jar
>
>
> Cassandra has a lot of locations that are ripe for type specific compression. 
> A short list:
> Indexes
>  * Keys compressed as BytesType, which could default to LZO/LZMA
>  * Offsets (delta and varint encoding)
>  * Column names added by 2319
> Data
>  * Keys, columns, timestamps: see 
> http://wiki.apache.org/cassandra/FileFormatDesignDoc
> A basic interface for type specific compression could be as simple as:
> {code:java}
> public void compress(int version, final List from, DataOutput to) 
> throws IOException
> public void decompress(int version, DataInput from, List to) 
> throws IOException
> public void skip(int version, DataInput from) throws IOException
> {code} 


[jira] [Updated] (CASSANDRA-2145) Simplify ColumnSortedMap

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2145:


Attachment: (was: 
0001-CASSANDRA-2145-Extract-serialization-from-ColumnSorted.txt)

> Simplify ColumnSortedMap
> 
>
> Key: CASSANDRA-2145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2145
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Priority: Minor
>
> We can simplify ColumnSortedMap substantially by hijacking the shell of 
> another sorted map implementation, rather than having scads of methods 
> implemented as "UnsupportedOperation"s.
> Also, CASSANDRA-674 needs a way to feed a supercolumn an arbitrary sorted 
> iterator, rather than necessarily deserializing.


[jira] [Updated] (CASSANDRA-2145) Simplify ColumnSortedMap

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2145:


Attachment: 0001-CASSANDRA-2145-Extract-serialization-from-ColumnSorted.txt

bq. Still prefer simplicity of design to raw loc as the right metric to apply 
here.
Ok, I yield... nonetheless, posting one last rebasing of the old style.

> Simplify ColumnSortedMap
> 
>
> Key: CASSANDRA-2145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2145
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Priority: Minor
> Attachments: 
> 0001-CASSANDRA-2145-Extract-serialization-from-ColumnSorted.txt
>
>
> We can simplify ColumnSortedMap substantially by hijacking the shell of 
> another sorted map implementation, rather than having scads of methods 
> implemented as "UnsupportedOperation"s.
> Also, CASSANDRA-674 needs a way to feed a supercolumn an arbitrary sorted 
> iterator, rather than necessarily deserializing.


[jira] [Updated] (CASSANDRA-2629) Move key reads into SSTableIterators

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2629:


Attachment: (was: 
0002-CASSANDRA-2629-Remove-the-retry-with-key-from-index-st.txt)

> Move key reads into SSTableIterators
> 
>
> Key: CASSANDRA-2629
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2629
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
>Priority: Critical
> Fix For: 1.0
>
>
> All SSTableIterators have a constructor that assumes the key and length have 
> already been parsed. Moving this logic inside the iterator will improve 
> symmetry and allow the file format to change without iterator consumers 
> knowing it.
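
One way to picture the change described above: the iterator is handed a positioned stream and parses the row header itself, so callers never see the on-disk layout. The sketch below is in Python for brevity (Cassandra itself is Java). The class name `RowIterator`, the helper `write_row`, and the layout (2-byte key length, key bytes, 8-byte row size, then length-prefixed columns) are illustrative assumptions, not the actual SSTable format or API.

```python
import io
import struct


class RowIterator:
    """Hypothetical iterator that reads its own key and row size."""

    def __init__(self, stream):
        # Assumed header: unsigned 16-bit key length, key, signed 64-bit row size.
        (key_length,) = struct.unpack(">H", stream.read(2))
        self.key = stream.read(key_length)
        (self.row_size,) = struct.unpack(">q", stream.read(8))
        self._stream = stream
        self._remaining = self.row_size

    def __iter__(self):
        return self

    def __next__(self):
        # Toy row body: each column is a 1-byte length followed by its bytes.
        if self._remaining <= 0:
            raise StopIteration
        (length,) = struct.unpack(">B", self._stream.read(1))
        column = self._stream.read(length)
        self._remaining -= 1 + length
        return column


def write_row(key, columns):
    """Serialize one row in the assumed layout (only used to feed the sketch)."""
    body = b"".join(struct.pack(">B", len(c)) + c for c in columns)
    return struct.pack(">H", len(key)) + key + struct.pack(">q", len(body)) + body


row = RowIterator(io.BytesIO(write_row(b"key1", [b"col1", b"col22"])))
print(row.key, list(row))
```

A consumer only constructs the iterator and loops over it, so a different header layout would be a change inside `__init__` alone.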


[jira] [Updated] (CASSANDRA-2629) Move key reads into SSTableIterators

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2629:


Attachment: (was: 
0001-CASSANDRA-2629-Move-key-and-row-size-reading-into-the-.txt)

> Move key reads into SSTableIterators
> 
>
> Key: CASSANDRA-2629
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2629
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
>Priority: Critical
> Fix For: 1.0
>
>
> All SSTableIterators have a constructor that assumes the key and length have 
> already been parsed. Moving this logic inside the iterator will improve 
> symmetry and allow the file format to change without iterator consumers 
> knowing it.


[jira] [Updated] (CASSANDRA-2629) Move key reads into SSTableIterators

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2629:


Attachment: 0003-CASSANDRA-2629-Add-alternate-iteration-based-consumpti.txt
0002-CASSANDRA-2629-Remove-the-retry-with-key-from-index-st.txt
0001-CASSANDRA-2629-Move-key-and-row-size-reading-into-the-.txt

Rebased, but without the changes from Sylvain's review... I'll get back to this 
once 3067 has had some eyes on it.

> Move key reads into SSTableIterators
> 
>
> Key: CASSANDRA-2629
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2629
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
>Priority: Critical
> Fix For: 1.0
>
> Attachments: 
> 0001-CASSANDRA-2629-Move-key-and-row-size-reading-into-the-.txt, 
> 0002-CASSANDRA-2629-Remove-the-retry-with-key-from-index-st.txt, 
> 0003-CASSANDRA-2629-Add-alternate-iteration-based-consumpti.txt
>
>
> All SSTableIterators have a constructor that assumes the key and length have 
> already been parsed. Moving this logic inside the iterator will improve 
> symmetry and allow the file format to change without iterator consumers 
> knowing it.


[jira] [Updated] (CASSANDRA-3067) Simple SSTable Pluggability

2011-08-23 Thread Stu Hood (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-3067:


Attachment: 0006-CASSANDRA-3067-Allow-overriding-the-current-sstable-ve.txt
0005-CASSANDRA-3067-Create-ABCs-for-SSTableReader-and-KeyIt.txt
0004-CASSANDRA-3067-Rename-SSTable-Names-Slice-Iterator-to-.txt
0003-CASSANDRA-3067-Create-an-ABC-for-SSTableWriter.txt
0002-CASSANDRA-3067-Move-from-linear-SSTable-versions-to-fe.txt
0001-CASSANDRA-3067-Create-an-ABC-for-SSTableIdentityIterat.txt

Patchset to create all necessary abstract base classes for SSTable pluggability.

0002 and 0006 deal with allowing for non-linear SSTable versions, and 
overriding the default.

This set applies atop CASSANDRA-2629 but should be ready for high level review.

> Simple SSTable Pluggability
> ---
>
> Key: CASSANDRA-3067
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3067
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Stu Hood
>Assignee: Stu Hood
> Fix For: 1.0
>
> Attachments: 
> 0001-CASSANDRA-3067-Create-an-ABC-for-SSTableIdentityIterat.txt, 
> 0002-CASSANDRA-3067-Move-from-linear-SSTable-versions-to-fe.txt, 
> 0003-CASSANDRA-3067-Create-an-ABC-for-SSTableWriter.txt, 
> 0004-CASSANDRA-3067-Rename-SSTable-Names-Slice-Iterator-to-.txt, 
> 0005-CASSANDRA-3067-Create-ABCs-for-SSTableReader-and-KeyIt.txt, 
> 0006-CASSANDRA-3067-Allow-overriding-the-current-sstable-ve.txt
>
>
> CASSANDRA-2995 proposes full storage engine pluggability, which is probably 
> unavoidable in the long run. For now though, I'd like to propose an 
> incremental alternative that preserves the sstable model, but allows it to 
> evolve non-linearly.
> The sstable "version" field could allow for simple switching between writable 
> sstable types, without moving all the way to differentiating between engines 
> as CASSANDRA-2995 requires. This can be accomplished by moving towards a 
> "feature flags" model (with a mapping between versions and feature sets), 
> rather than a linear versions model (where versions can be strictly ordered 
> and all versions above X have a feature).
> There are restrictions on this approach:
> * It's sufficient for an alternate SSTable(Writer|Reader|*) set to require a 
> patch to enable (rather than a JAR)
> * Filenames/descriptors/components must conform to the existing conventions
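
The difference between the "feature flags" model and the linear versions model described above can be sketched as follows. The version identifiers and feature names are made up for illustration and do not come from the patchset; the sketch is in Python rather than Java.

```python
# Hypothetical version identifiers and feature names, for illustration only.
VERSION_FEATURES = {
    "f": frozenset({"bloom_filter"}),
    "g": frozenset({"bloom_filter", "row_statistics"}),
    # An alternate writable format: not "newer" than "g", just different.
    "x": frozenset({"bloom_filter", "type_specific_compression"}),
}


def has_feature(version, feature):
    """Feature-flag check: look the version up instead of comparing it."""
    return feature in VERSION_FEATURES.get(version, frozenset())


def has_feature_linear(version, introduced_in):
    """Linear check, for contrast: every version >= introduced_in has the feature."""
    return version >= introduced_in
```

With the linear check, version "x" sorts after "g" and would be assumed to carry everything "g" introduced; the lookup table lets "x" diverge without that assumption.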


[jira] [Created] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Yang Yang (JIRA)

liveSize() calculation is wrong in case of overwrite


 Key: CASSANDRA-3073
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
 Project: Cassandra
  Issue Type: Bug
Reporter: Yang Yang
Priority: Minor


Currently liveSize() is the sum of currentThroughput.

This definition is wrong if most of the operations are overwrites, or counter 
updates (which are essentially overwrites).

For example, the following code should always keep a single entry in the db, with 
one row, one cf, one column, and so should have a size of only about 
100 bytes.
connect localhost/9160;  
create keyspace blah;
use blah;

create column family cf2 with memtable_throughput=1024 and 
memtable_operations=1  ;


Set in cassandra.yaml:
memtable_total_space_in_mb: 20

to make the error appear faster (with the default setting the same issue still 
appears).

Then we use a simple pycassa script:

>>> pool = pycassa.connect('blah')
>>> mycf = pycassa.ColumnFamily(pool, "cf2")
>>> for x in range(1, 1000):
...     xx = mycf.insert('key1', {'col1': "{}".format(x)})
... 



You will see sstables being generated with sizes of only a few KB, even though 
the CF options were set to produce large SSTables.
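
The gap being reported can be shown with a toy model. This is not Cassandra's Memtable: `ToyMemtable` and its size accounting (key + column name + value bytes) are assumptions made for illustration. It mirrors the pycassa loop above, with 999 overwrites of a single column.

```python
class ToyMemtable:
    """Tracks cumulative bytes written and bytes still reachable after overwrites."""

    def __init__(self):
        self.columns = {}            # (row key, column name) -> latest value
        self.current_throughput = 0  # grows on every write, including overwrites

    def insert(self, key, column, value):
        self.current_throughput += len(key) + len(column) + len(value)
        self.columns[(key, column)] = value

    def live_size(self):
        # Only the latest value of each column is counted.
        return sum(len(k) + len(c) + len(v) for (k, c), v in self.columns.items())


memtable = ToyMemtable()
for x in range(1, 1000):
    memtable.insert("key1", "col1", "{}".format(x))

print(memtable.current_throughput, memtable.live_size())
```

In this model the cumulative throughput is roughly a thousand times the live data. Whether the allocator can actually release the overwritten bytes before flush is a separate question that the model ignores.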






[jira] [Updated] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Yang Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-3073:
-

Attachment: 0001-liveSize-is-different-from-throughput-particularly-w.patch

simple fix.

> liveSize() calculation is wrong in case of overwrite
> 
>
> Key: CASSANDRA-3073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yang Yang
>Priority: Minor
> Attachments: 
> 0001-liveSize-is-different-from-throughput-particularly-w.patch
>
>
> Currently liveSize() is the sum of currentThroughput.
> This definition is wrong if most of the operations are overwrites, or counter 
> updates (which are essentially overwrites).
> For example, the following code should always keep a single entry in the db, with 
> one row, one cf, one column, and so should have a size of only about 
> 100 bytes.
> connect localhost/9160;  
> create keyspace blah;
> use blah;
> create column family cf2 with memtable_throughput=1024 and 
> memtable_operations=1  ;
> Set in cassandra.yaml:
> memtable_total_space_in_mb: 20
> to make the error appear faster (with the default setting the same issue 
> still appears).
> Then we use a simple pycassa script:
> >>> pool = pycassa.connect('blah')
> >>> mycf = pycassa.ColumnFamily(pool, "cf2")
> >>> for x in range(1, 1000):
> ...     xx = mycf.insert('key1', {'col1': "{}".format(x)})
> ... 
> You will see sstables being generated with sizes of only a few KB, even though 
> the CF options were set to produce large SSTables.


[jira] [Updated] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled

2011-08-23 Thread Patricio Echague (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patricio Echague updated CASSANDRA-2034:


Attachment: 2034-v19.txt

Rebases and consolidates patch files 17 and 18.

> Make Read Repair unnecessary when Hinted Handoff is enabled
> ---
>
> Key: CASSANDRA-2034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Patricio Echague
> Fix For: 1.0
>
> Attachments: 2034-formatting.txt, 2034-v16.txt, 2034-v17.txt, 
> 2034-v18.txt, 2034-v19.txt, CASSANDRA-2034-trunk-v10.patch, 
> CASSANDRA-2034-trunk-v11.patch, CASSANDRA-2034-trunk-v11.patch, 
> CASSANDRA-2034-trunk-v12.patch, CASSANDRA-2034-trunk-v13.patch, 
> CASSANDRA-2034-trunk-v14.patch, CASSANDRA-2034-trunk-v15.patch, 
> CASSANDRA-2034-trunk-v2.patch, CASSANDRA-2034-trunk-v3.patch, 
> CASSANDRA-2034-trunk-v4.patch, CASSANDRA-2034-trunk-v5.patch, 
> CASSANDRA-2034-trunk-v6.patch, CASSANDRA-2034-trunk-v7.patch, 
> CASSANDRA-2034-trunk-v8.patch, CASSANDRA-2034-trunk-v9.patch, 
> CASSANDRA-2034-trunk.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling 
> HH means RR/AES will have less work to do, but you can't disable RR entirely 
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the 
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
> check the responseHandler write acks and write local hints for any missing 
> targets.
> This would make disabling RR when HH is enabled a much more reasonable 
> option, which has a huge impact on read throughput.
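
The flow proposed above can be sketched roughly as follows. The function and parameter names are hypothetical, and the sketch is in Python rather than Cassandra's Java. It only models the decision in the description: the client result depends on the ConsistencyLevel, while the task that runs after RpcTimeout writes a local hint for every target that has not acked.

```python
def client_result(targets, acked, required_acks):
    """What the client sees: success once enough replicas have acknowledged."""
    return len(set(acked) & set(targets)) >= required_acks


def hints_after_timeout(targets, acked):
    """Run after the rpc timeout: every target without an ack gets a local hint."""
    return sorted(set(targets) - set(acked))


targets = ["node1", "node2", "node3"]
print(client_result(targets, ["node1", "node2"], 2))
print(hints_after_timeout(targets, ["node1", "node2"]))
```

The two functions are deliberately independent: a write can succeed from the client's point of view and still leave a hint behind for a slow or dead replica.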


[jira] [Commented] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled

2011-08-23 Thread Patricio Echague (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089922#comment-13089922
 ] 

Patricio Echague commented on CASSANDRA-2034:
-

Jonathan, I noticed you slightly modified CallbackInfo.shouldHint():

{code}
public boolean shouldHint()
{
    return message != null && StorageProxy.shouldHint(target);
}
{code}

Not sure if you meant that your changes address the issue of not 
hinting when CL is not reached.
The new "shouldHint" method you added should be OK, as it is processed upon 
RPCTimeout regardless of whether the CL was achieved or not.


> Make Read Repair unnecessary when Hinted Handoff is enabled
> ---
>
> Key: CASSANDRA-2034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Patricio Echague
> Fix For: 1.0
>
> Attachments: 2034-formatting.txt, 2034-v16.txt, 2034-v17.txt, 
> 2034-v18.txt, 2034-v19.txt, CASSANDRA-2034-trunk-v10.patch, 
> CASSANDRA-2034-trunk-v11.patch, CASSANDRA-2034-trunk-v11.patch, 
> CASSANDRA-2034-trunk-v12.patch, CASSANDRA-2034-trunk-v13.patch, 
> CASSANDRA-2034-trunk-v14.patch, CASSANDRA-2034-trunk-v15.patch, 
> CASSANDRA-2034-trunk-v2.patch, CASSANDRA-2034-trunk-v3.patch, 
> CASSANDRA-2034-trunk-v4.patch, CASSANDRA-2034-trunk-v5.patch, 
> CASSANDRA-2034-trunk-v6.patch, CASSANDRA-2034-trunk-v7.patch, 
> CASSANDRA-2034-trunk-v8.patch, CASSANDRA-2034-trunk-v9.patch, 
> CASSANDRA-2034-trunk.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling 
> HH means RR/AES will have less work to do, but you can't disable RR entirely 
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the 
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
> check the responseHandler write acks and write local hints for any missing 
> targets.
> This would make disabling RR when HH is enabled a much more reasonable 
> option, which has a huge impact on read throughput.


[jira] [Commented] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089942#comment-13089942
 ] 

Jonathan Ellis commented on CASSANDRA-3073:
---

This will easily OOM you with the slab allocator, since that really does retain 
the "live size" (throughput) amount of memory until flush.

> liveSize() calculation is wrong in case of overwrite
> 
>
> Key: CASSANDRA-3073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yang Yang
>Priority: Minor
> Attachments: 
> 0001-liveSize-is-different-from-throughput-particularly-w.patch
>
>
> Currently liveSize() is the sum of currentThroughput.
> This definition is wrong if most of the operations are overwrites, or counter 
> updates (which are essentially overwrites).
> For example, the following code should always keep a single entry in the db, with 
> one row, one cf, one column, and so should have a size of only about 
> 100 bytes.
> connect localhost/9160;  
> create keyspace blah;
> use blah;
> create column family cf2 with memtable_throughput=1024 and 
> memtable_operations=1  ;
> Set in cassandra.yaml:
> memtable_total_space_in_mb: 20
> to make the error appear faster (with the default setting the same issue 
> still appears).
> Then we use a simple pycassa script:
> >>> pool = pycassa.connect('blah')
> >>> mycf = pycassa.ColumnFamily(pool, "cf2")
> >>> for x in range(1, 1000):
> ...     xx = mycf.insert('key1', {'col1': "{}".format(x)})
> ... 
> You will see sstables being generated with sizes of only a few KB, even though 
> the CF options were set to produce large SSTables.


[jira] [Commented] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089943#comment-13089943
 ] 

Jonathan Ellis commented on CASSANDRA-2034:
---

Right, that's what I was referring to when I said "[the old shouldHint] does 
not achieve our goal of making read-repair unnecessary. For that, we need to 
always hint when an attempted write fails."

> Make Read Repair unnecessary when Hinted Handoff is enabled
> ---
>
> Key: CASSANDRA-2034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2034
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Patricio Echague
> Fix For: 1.0
>
> Attachments: 2034-formatting.txt, 2034-v16.txt, 2034-v17.txt, 
> 2034-v18.txt, 2034-v19.txt, CASSANDRA-2034-trunk-v10.patch, 
> CASSANDRA-2034-trunk-v11.patch, CASSANDRA-2034-trunk-v11.patch, 
> CASSANDRA-2034-trunk-v12.patch, CASSANDRA-2034-trunk-v13.patch, 
> CASSANDRA-2034-trunk-v14.patch, CASSANDRA-2034-trunk-v15.patch, 
> CASSANDRA-2034-trunk-v2.patch, CASSANDRA-2034-trunk-v3.patch, 
> CASSANDRA-2034-trunk-v4.patch, CASSANDRA-2034-trunk-v5.patch, 
> CASSANDRA-2034-trunk-v6.patch, CASSANDRA-2034-trunk-v7.patch, 
> CASSANDRA-2034-trunk-v8.patch, CASSANDRA-2034-trunk-v9.patch, 
> CASSANDRA-2034-trunk.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Currently, HH is purely an optimization -- if a machine goes down, enabling 
> HH means RR/AES will have less work to do, but you can't disable RR entirely 
> in most situations since HH doesn't kick in until the FailureDetector does.
> Let's add a scheduled task to the mutate path, such that we return to the 
> client normally after ConsistencyLevel is achieved, but after RpcTimeout we 
> check the responseHandler write acks and write local hints for any missing 
> targets.
> This would make disabling RR when HH is enabled a much more reasonable 
> option, which has a huge impact on read throughput.


[jira] [Commented] (CASSANDRA-3023) NPE in describe_ring

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089944#comment-13089944
 ] 

Jonathan Ellis commented on CASSANDRA-3023:
---

I think "return half-broken data" is the right solution, since pre-1777 clients 
aren't going to be looking for the new data anyway.

> NPE in describe_ring
> 
>
> Key: CASSANDRA-3023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.4
>Reporter: Eric Falcao
>Assignee: Brandon Williams
> Fix For: 0.8.5
>
>
> Not sure how much of the following is relevant besides the stack trace, but 
> here I go:
> I have a 2 DC, 2 node per DC cluster. DC1 had its seed replaced but I hadn't 
> restarted. I upgraded to 0.8.4 in the following fashion:
> -edited seeds
> -stopped both DC1 nodes
> -upgraded jars
> -started both nodes at the same time
> The non-seed node came up first and showed the following error. Then when the 
> seed node came up, the error went away on the non-seed node but started 
> occurring on the seed node:
> ERROR [pool-2-thread-15] 2011-08-12 22:32:27,438 Cassandra.java (line 3668) Internal error processing describe_ring
> java.lang.NullPointerException
>   at org.apache.cassandra.service.StorageService.getRangeToRpcaddressMap(StorageService.java:623)
>   at org.apache.cassandra.thrift.CassandraServer.describe_ring(CassandraServer.java:731)
>   at org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.process(Cassandra.java:3664)
>   at org.apache.cassandra.thrift.Brisk$Processor.process(Brisk.java:464)
>   at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)


[jira] [Updated] (CASSANDRA-3023) NPE in describe_ring

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3023:
--

Fix Version/s: 0.8.5

> NPE in describe_ring
> 
>
> Key: CASSANDRA-3023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 0.8.4
>Reporter: Eric Falcao
>Assignee: Brandon Williams
> Fix For: 0.8.5
>
>
> Not sure how much of the following is relevant besides the stack trace, but 
> here I go:
> I have a 2 DC, 2 node per DC cluster. DC1 had its seed replaced but I hadn't 
> restarted. I upgraded to 0.8.4 in the following fashion:
> -edited seeds
> -stopped both DC1 nodes
> -upgraded jars
> -started both nodes at the same time
> The non-seed node came up first and showed the following error. Then when the 
> seed node came up, the error went away on the non-seed node but started 
> occurring on the seed node:
> ERROR [pool-2-thread-15] 2011-08-12 22:32:27,438 Cassandra.java (line 3668) Internal error processing describe_ring
> java.lang.NullPointerException
>   at org.apache.cassandra.service.StorageService.getRangeToRpcaddressMap(StorageService.java:623)
>   at org.apache.cassandra.thrift.CassandraServer.describe_ring(CassandraServer.java:731)
>   at org.apache.cassandra.thrift.Cassandra$Processor$describe_ring.process(Cassandra.java:3664)
>   at org.apache.cassandra.thrift.Brisk$Processor.process(Brisk.java:464)
>   at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:619)


[jira] [Resolved] (CASSANDRA-2686) Distributed per row locks

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2686.
---

Resolution: Not A Problem

I think there's definitely interest if you were to post this on github, but 
like Cages itself, it shouldn't go in core Cassandra.

> Distributed per row locks
> -
>
> Key: CASSANDRA-2686
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2686
> Project: Cassandra
>  Issue Type: Wish
>  Components: Core
> Environment: any
>Reporter: Luís Ferreira
>  Labels: api-addition, features
>
> Instead of using a centralized locking strategy like cages with zookeeper, I 
> would like to have it in a decentralized way. Even if it carries some 
> limitations. 


[jira] [Updated] (CASSANDRA-2834) Avoid repair getting started twice at the same time for the same CF

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2834:
--

Priority: Minor  (was: Major)
Assignee: Sylvain Lebresne

> Avoid repair getting started twice at the same time for the same CF
> ---
>
> Key: CASSANDRA-2834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2834
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Terje Marthinussen
>Assignee: Sylvain Lebresne
>Priority: Minor
>
> It seems it may be possible to start repair twice at the same time on the 
> same CF.
> Not 100% verified, but if this is indeed the case, we may want to consider 
> preventing it, including making nodetool repair abort and return an error if 
> repair is attempted on a CF that already has a repair running.


[jira] [Commented] (CASSANDRA-2819) Split rpc timeout for read and write ops

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089951#comment-13089951
 ] 

Jonathan Ellis commented on CASSANDRA-2819:
---

Sounds reasonable to me.

> Split rpc timeout for read and write ops
> 
>
> Key: CASSANDRA-2819
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2819
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Stu Hood
>Assignee: Melvin Wang
> Fix For: 1.0
>
> Attachments: 2819-v4.txt, rpc-jira.patch
>
>
> Given the vastly different latency characteristics of reads and writes, it 
> makes sense for them to have independent rpc timeouts internally.


[jira] [Updated] (CASSANDRA-2774) one way to make counter delete work better

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2774:
--

   Reviewer: slebresne
Component/s: Core
   Priority: Minor  (was: Major)
   Assignee: Yang Yang

> one way to make counter delete work better
> --
>
> Key: CASSANDRA-2774
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2774
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.8.0
>Reporter: Yang Yang
>Assignee: Yang Yang
>Priority: Minor
> Attachments: counter_delete.diff
>
>
> The current Counter does not work with delete, because different merging orders 
> of sstables produce different results. For example:
> add 1
> delete 
> add 2
> If the merging happens in 1-2, (1,2)--3 order, the result we see will be 2.
> If the merging is 1--3, (1,3)--2, the result will be 3.
> The issue is that a delete currently cannot separate the adds made before it 
> from the adds made after it. A delete is supposed to create a completely new 
> incarnation of the counter, or a new "lifetime", or "epoch". The new approach 
> uses the concept of an "epoch number", so that each delete bumps up the 
> epoch number. Since each write is replicated (replicate on write is almost 
> always enabled in practice; if this is a concern, we could further force ROW 
> in case of delete), the epoch number is global to a replica set.
> Changes are attached. Existing tests pass; some tests are modified since 
> the semantics changed a bit. Some cql tests do not pass in the original 
> 0.8.0 source; that is not the fault of this change.
> See details at 
> http://mail-archives.apache.org/mod_mbox/cassandra-user/201106.mbox/%3cbanlktikqcglsnwtt-9hvqpseoo7sf58...@mail.gmail.com%3E
> The goal of this is to make delete work (at least with consistent behavior; 
> in case of a long network partition the behavior is not ideal, but it is 
> consistent with the definition of a logical clock), so that we could have 
> expiring Counters.
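
The order dependence in the add 1 / delete / add 2 example, and how an epoch number removes it, can be reproduced with a small model. This is a simplified sketch, not the attached counter_delete.diff: the record shapes, the timestamps 1, 2, 3, and both merge functions are assumptions chosen to match the description above.

```python
# Timestamp model (simplified): a record is ("add", value, timestamp)
# or ("delete", timestamp).
def merge_by_timestamp(a, b):
    if a[0] == "add" and b[0] == "add":
        return ("add", a[1] + b[1], max(a[2], b[2]))
    if a[0] == "delete" and b[0] == "delete":
        return ("delete", max(a[1], b[1]))
    add, delete = (a, b) if a[0] == "add" else (b, a)
    # The delete only wins if it is newer than the (possibly merged) add.
    return delete if delete[1] > add[2] else add


# Epoch model: a record is (epoch, value); a delete is (new epoch, 0) and
# later adds carry the bumped epoch.
def merge_by_epoch(a, b):
    if a[0] != b[0]:
        return max(a, b, key=lambda record: record[0])  # higher epoch wins
    return (a[0], a[1] + b[1])


add1, delete, add2 = ("add", 1, 1), ("delete", 2), ("add", 2, 3)
order_a = merge_by_timestamp(merge_by_timestamp(add1, delete), add2)
order_b = merge_by_timestamp(merge_by_timestamp(add1, add2), delete)

e_add1, e_delete, e_add2 = (0, 1), (1, 0), (1, 2)
epoch_a = merge_by_epoch(merge_by_epoch(e_add1, e_delete), e_add2)
epoch_b = merge_by_epoch(merge_by_epoch(e_add1, e_add2), e_delete)

print(order_a, order_b, epoch_a, epoch_b)
```

In the timestamp model the two merge orders give counter values 2 and 3, as in the description. In the epoch model both orders give the same record, because any record from an older epoch is discarded no matter when it is merged.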


[jira] [Updated] (CASSANDRA-2942) If you drop a CF when one node is down the files are orphaned on the downed node

2011-08-23 Thread Jonathan Ellis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2942:
--

Attachment: 2942.txt

Patch to wait for StorageService.tasks on shutdown. Also moves commit log (CL) 
segment deletion there.

> If you drop a CF when one node is down the files are orphaned on the downed 
> node
> 
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Cathy Daw
>Assignee: Sylvain Lebresne
>Priority: Minor
> Fix For: 1.0
>
> Attachments: 2942.txt
>
>
> * Bring up 3 node cluster
> * From node1: Run Stress Tool
> {code} stress --num-keys=10 --columns=10 --consistency-level=ALL 
> --average-size-values --replication-factor=3 --nodes=node1,node2 {code}
> * Shut down node3
> * From node1: drop the Standard1 CF in Keyspace1
> * Shut down node2 and node3
> * Bring up node1 and node2. Check that the Standard1 files are gone.
> {code}
> ls -al /var/lib/cassandra/data/Keyspace1/
> {code}
> * Bring up node3. The log file shows the drop column family occurs
> {code}
>  INFO 00:51:25,742 Applying migration 9a76f880-b4c5-11e0--8901a7c5c9ce 
> Drop column family: Keyspace1.Standard1
> {code}
> * Restart node3 to clear out dropped tables from the filesystem
> {code}
> root@cathy3:~/cass-0.8/bin# ls -al /var/lib/cassandra/data/Keyspace1/
> total 36
> drwxr-xr-x 3 root root 4096 Jul 23 00:51 .
> drwxr-xr-x 6 root root 4096 Jul 23 00:48 ..
> -rw-r--r-- 1 root root    0 Jul 23 00:51 Standard1-g-1-Compacted
> -rw-r--r-- 2 root root 5770 Jul 23 00:51 Standard1-g-1-Data.db
> -rw-r--r-- 2 root root   32 Jul 23 00:51 Standard1-g-1-Filter.db
> -rw-r--r-- 2 root root  120 Jul 23 00:51 Standard1-g-1-Index.db
> -rw-r--r-- 2 root root 4276 Jul 23 00:51 Standard1-g-1-Statistics.db
> drwxr-xr-x 3 root root 4096 Jul 23 00:51 snapshots
> {code}
> *Bug:  The files for Standard1 are orphaned on node3*


[jira] [Commented] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Yang Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089970#comment-13089970
 ] 

Yang Yang commented on CASSANDRA-3073:
--

I think OOM is an orthogonal issue. 

Right now the limit is "memtable_total_space_in_mb". The natural semantics of 
this limit are what this fix implements; the old implementation does not 
reflect those semantics, and that is the problem.

If we want to limit by throughput, we already have the memtable_throughput 
param (though it applies only at the CF level). Otherwise it is at least 
necessary to rename memtable_total_space_in_mb to something else, such as 
"memtable_total_throughput_in_mb".


If the OOM appears with SlabAllocator, but not with the JVM native allocator, 
isn't that a problem with SlabAllocator itself?

> liveSize() calculation is wrong in case of overwrite
> 
>
> Key: CASSANDRA-3073
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Yang Yang
>Priority: Minor
> Attachments: 
> 0001-liveSize-is-different-from-throughput-particularly-w.patch
>
>
> Currently liveSize() is the sum of currentThroughput.
> This definition is wrong if most of the operations are overwrites, or counter 
> updates (which are essentially overwrites).
> For example, the following code should always keep a single entry in the db, 
> with one row, one cf, one column, and should have a size of only about 
> 100 bytes.
> connect localhost/9160;  
> create keyspace blah;
> use blah;
> create column family cf2 with memtable_throughput=1024 and 
> memtable_operations=1  ;
> Set this in cassandra.yaml:
> memtable_total_space_in_mb: 20
> to make the error appear faster (with the default setting the same issue 
> still appears).
> Then we use a simple pycassa script:
> >>> import pycassa
> >>> pool = pycassa.connect('blah')
> >>> mycf = pycassa.ColumnFamily(pool,"cf2");
> >>> for x in range(1,1000) :
> ...     xx = mycf.insert('key1',{'col1':"{}".format(x)})
> ... 
> You will see sstables being generated with sizes of only a few KB, even 
> though we set the CF options to get large SSTables.
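
The gap between the two quantities can be shown with a toy model. This is a minimal sketch, assuming a memtable that simply maps (key, column) to a value; MemtableSketch, throughput, and live_size are illustrative names, not Cassandra's actual classes or fields, and the byte accounting ignores all per-column overhead.

```python
class MemtableSketch:
    """Toy memtable that tracks bytes written versus bytes currently held."""

    def __init__(self):
        self.columns = {}    # (key, column) -> current value
        self.throughput = 0  # every byte ever written, overwrites included
        self.live_size = 0   # bytes of the data that is still live

    def insert(self, key, column, value):
        size = len(key) + len(column) + len(value)
        self.throughput += size
        old = self.columns.get((key, column))
        if old is not None:
            # An overwrite replaces the old value, so its bytes stop counting.
            self.live_size -= len(key) + len(column) + len(old)
        self.columns[(key, column)] = value
        self.live_size += size


# Same access pattern as the pycassa loop above: one column, overwritten.
memtable = MemtableSketch()
for x in range(1, 1000):
    memtable.insert("key1", "col1", "{}".format(x))
```

In this model the throughput figure keeps growing with every overwrite while the live size stays at the size of the single surviving column, which is the discrepancy the report describes.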


[jira] [Commented] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Jonathan Ellis (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089981#comment-13089981
 ] 

Jonathan Ellis commented on CASSANDRA-3073:
---

No, it's a problem with your patch, because SlabAllocator does not OOM as 
written. :)


[jira] [Commented] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

2011-08-23 Thread Yang Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089986#comment-13089986
 ] 

Yang Yang commented on CASSANDRA-3073:
--

(We could discuss the SlabAllocator further elsewhere, but I have been 
running a version after 0.8.2 which does not yet have the SlabAllocator, and 
with this patch applied it works fine under heavy load without OOM, using the 
native JVM allocator.)

Put another way, this is just a matter of renaming 
memtable_total_space_in_mb to "memtable_total_throughput_in_mb"; right now the 
name is misleading.


[jira] [Commented] (CASSANDRA-1608) Redesigned Compaction

2011-08-23 Thread Benjamin Coverston (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090005#comment-13090005
 ] 

Benjamin Coverston commented on CASSANDRA-1608:
---

{quote}
What is the 1.25 supposed to be doing here?
{quote}
dunno what I was thinking, I was screwing around with giving the promoted range 
a size, but it looks like that ended up in the wrong place.
{quote}
Why the "all on the same level" special case? Is this just saying "L0 
compactions must go into L1?"
{quote}
Yes, also when a compaction gets triggered into an empty target level the same 
logic applies.
{quote}
removed this. if L0 is large, it doesn't necessarily follow that L1 is large 
too. I don't see a good reason to second-guess the scoring here.
{quote}
Actually this was there to prevent an OOM exception when too many SSTables were 
participating in any given compaction. You are, however, correct that it doesn't 
follow that L1 is large, at least not in all cases. I'll revise this to give an 
upper bound to the list of L0 candidates in a given compaction.
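
One way such an upper bound might look is sketched below. The function name, the dict-based sstable records, and the default cap of 32 are all assumptions for illustration; the actual patch may pick candidates and limits differently.

```python
def pick_l0_candidates(l0_sstables, max_candidates=32):
    """Return at most max_candidates L0 sstables, oldest generation first."""
    ordered = sorted(l0_sstables, key=lambda s: s["generation"])
    return ordered[:max_candidates]


# 100 small flushed sstables waiting in L0 (generation numbers are made up).
l0 = [{"generation": g, "size_mb": 5} for g in range(100, 0, -1)]
candidates = pick_l0_candidates(l0)
```

Capping the list keeps the number of sstables opened by a single compaction bounded; the remaining L0 sstables would simply be picked up by later compactions.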

{quote}
L0 only gets two sstables before it's overcapacity? Are we still allowing L0 
sstables to be large? if so it's not even two
{quote}

I was screwing around with this threshold. One of the side effects of the 
dynamic flush thresholds was that I could end up with a substantial number of 
small SSTables "stuck" in L0. One way to fix this is to always give L0 a small 
positive score when there are any SSTables in L0, so that the SSTables get 
cleared out when the rest of the leveling has been done. Previously I was using 
the memtable flush threshold as the multiplier for L0, but with dynamic 
flushing and global memtable thresholds this doesn't mean much anymore. I'm 
inclined to leave it and perhaps raise the multiplier for L0 from 2 to 4.


bq. "Exposing number of SSTables in L0 as a JMX property probably isn't a bad 
idea."

I'll get this in

bq. it's not correct for the create/load code to assume that the first data 
directory stays constant across restarts – it should check all directories when 
loading

I'll fix this

{quote}
CFS
===
- not immediately clear to me if the TODOs in isKeyInRemainingSSTables are 
something i should be concerned about
{quote}
I cleaned this up

{quote}
- why do we need the reference mark/unmark now but not before?  is this a bug 
fix independent of 1608?
{quote}

bq. Use reference counting to delete sstables instead of relying on the GC 
patch by slebresne; reviewed by jbellis for CASSANDRA-2521 git-svn-id: 
https://svn.apache.org/repos/asf/cassandra/trunk@1149085 
13f79535-47bb-0310-9956-ffa450edef68

I assumed that since I was operating on these SSTables in the referenced 
views, I would also need to take references on them.
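
For readers unfamiliar with the mark/unmark pattern, here is a minimal sketch of reference-counted deletion. It is an illustration only: RefCountedSSTable, acquire, and release are hypothetical names and do not reproduce the CASSANDRA-2521 code.

```python
import threading


class RefCountedSSTable:
    """Toy sstable handle whose files go away when the last reference does."""

    def __init__(self, name):
        self.name = name
        self._refs = 1        # the reference held by the live view
        self._lock = threading.Lock()
        self.deleted = False  # stand-in for deleting the files on disk

    def acquire(self):
        """Mark the sstable as in use; fails once it is fully released."""
        with self._lock:
            if self._refs == 0:
                return False
            self._refs += 1
            return True

    def release(self):
        """Unmark; the last release triggers the (simulated) deletion."""
        with self._lock:
            self._refs -= 1
            if self._refs == 0:
                self.deleted = True


sstable = RefCountedSSTable("Standard1-g-1")
reader_ok = sstable.acquire()       # a read marks the sstable
sstable.release()                   # compaction drops the view's reference
still_there = not sstable.deleted   # the reader keeps the files alive
sstable.release()                   # the read finishes and unmarks
late_reader_ok = sstable.acquire()  # too late: nothing left to reference
```

Every operation that touches an sstable pays for one acquire and one release, which is why marking a whole view of thousands of sstables on each read can become expensive.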

{quote}
- are we losing a lot of cycles to markCurrentViewReferenced on the read path 
now that this is 1000s of sstables instead of 10s?
{quote}

Yes this is a potentially serious issue. This code gets called on every read. A 
pretty heavy price to pay during each read.





> Redesigned Compaction
> -
>
> Key: CASSANDRA-1608
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Chris Goffinet
>Assignee: Benjamin Coverston
> Attachments: 1608-22082011.txt, 1608-v2.txt, 1608-v4.txt
>
>
> After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
> thinking on this subject that I wanted to lay out.
> I propose we redo the concept of how compaction works in Cassandra. At the 
> moment, compaction is kicked off based on a write access pattern, not read 
> access pattern. In most cases, you want the opposite. You want to be able to 
> track how well each SSTable is performing in the system. If we were to keep 
> statistics in-memory of each SSTable, prioritize them based on most accessed, 
> and bloom filter hit/miss ratios, we could intelligently group sstables that 
> are being read most often and schedule them for compaction. We could also 
> schedule lower priority maintenance on SSTables that are not often accessed.
> I also propose we limit each SSTable to a fixed size, which gives 
> us the ability to better utilize our bloom filters in a predictable manner. 
> At the moment, after a certain size, the bloom filters become less reliable. 
> This would also allow us to group the most accessed data. Currently the size of 
> an SSTable can grow to a point where large portions of the data might not 
> actually be accessed as often.

