Re: Consistency Level throughput

2011-05-26 Thread Ryu Kobayashi
My question is about the throughput in each case.

> In general, cluster throughput = single node throughput * number of
> nodes / replication factor.

Yes, I think so too.
But what I really want to ask about is the results I actually got.

Could you look at the chart I made?

http://goo.gl/mACQa


2011/5/27 Maki Watanabe :
> I assume your question is "how will CL affect the throughput?".
>
> In theory, I believe CL will not affect the throughput of the
> Cassandra system.
> At any CL, the coordinator node needs to submit write/read requests
> according to the RF specified for the KS.
> CL does, however, affect latency: a stronger CL will cause larger
> latency.
> In the real world, it will depend on system configuration,
> application design, data, and the whole environment.
> However, if you see shorter latency with a stronger CL, there must be
> some reason that explains the behavior.
>
> maki
>
> 2011/5/27 Ryu Kobayashi :
>> Hi,
>>
>> Question of Consistency Level throughput.
>>
>> Environment:
>> 6 nodes. Replication factor is 3.
>>
>> ONE and QUORUM showed no throughput difference.
>> ALL was just extremely slow.
>> Not ONE had only half the throughput.
>> ONE, TWO and THREE gave similar results.
>>
>> Is there any difference between 2 nodes and 3 nodes?
>>
>> --
>>  
>> twitter:@ryu_kobayashi
>>
>
>
>
> --
> w3m
>



-- 
 
twitter:@ryu_kobayashi


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeremy Hanna
For the purposes of clearing out disk space, you might also occasionally check 
to see if you have snapshots that you no longer need.  Certain operations 
create snapshots (point-in-time backups of sstables) in the (default) 
/var/lib/cassandra/data/<keyspace>/snapshots directory.

If you are absolutely sure that you no longer need a particular snapshot of the 
sstables, you can reclaim a decent amount of space that way.
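
If you want to see how much those snapshots are actually costing you before
deleting anything, a small throwaway Java sketch like the one below (my own
illustration, not part of Cassandra; it assumes the default
/var/lib/cassandra/data layout) sums up each keyspace's snapshots directory:

import java.io.File;

// Hypothetical helper: report disk space held by each keyspace's snapshots,
// assuming the 0.7-style layout <data dir>/<keyspace>/snapshots/<tag>/...
public class SnapshotUsage {
    public static void main(String[] args) {
        File dataDir = new File(args.length > 0 ? args[0] : "/var/lib/cassandra/data");
        File[] keyspaces = dataDir.listFiles(File::isDirectory);
        if (keyspaces == null) {
            System.err.println("No such directory: " + dataDir);
            return;
        }
        for (File keyspace : keyspaces) {
            File snapshots = new File(keyspace, "snapshots");
            if (snapshots.isDirectory()) {
                System.out.printf("%s: %d MB in snapshots%n",
                        keyspace.getName(), sizeOf(snapshots) / (1024 * 1024));
            }
        }
    }

    // Recursively sum file sizes under a directory.
    static long sizeOf(File f) {
        if (f.isFile()) return f.length();
        long total = 0;
        File[] children = f.listFiles();
        if (children != null) for (File child : children) total += sizeOf(child);
        return total;
    }
}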

I'm not sure of all of the other GC discussion going on but that's one way to 
reclaim some space.

On May 26, 2011, at 1:09 PM, Konstantin Naryshkin wrote:

> I have a basic understanding of how Cassandra handles the file system 
> (flushes Memtables out to SSTables, SSTables get compacted) and I 
> understand that old files are only deleted when a node is restarted, when 
> Java does a GC, or when Cassandra feels like it is running out of space.
> 
> My question is, is there some way for us to hurry the process along? We have 
> data that we do a lot of inserts into and then delete the data several 
> hours later. We would like it if we could free up disk space (since our 
> disks, though large, are shared with other applications). So far, the action 
> sequence to accomplish this is:
> nodetool flush -> nodetool repair -> nodetool compact -> ??
> 
> Is there a way for me to make (or even gently suggest to) Cassandra that it 
> may be a good time to free up some space?



Re: Consistency Level throughput

2011-05-26 Thread Maki Watanabe
I assume your question is "how will CL affect the throughput?".

In theory, I believe CL will not affect the throughput of the
Cassandra system.
At any CL, the coordinator node needs to submit write/read requests
according to the RF specified for the KS.
CL does, however, affect latency: a stronger CL will cause larger latency.
In the real world, it will depend on system configuration,
application design, data, and the whole environment.
However, if you see shorter latency with a stronger CL, there must be
some reason that explains the behavior.

maki

2011/5/27 Ryu Kobayashi :
> Hi,
>
> Question of Consistency Level throughput.
>
> Environment:
> 6 nodes. Replication factor is 3.
>
> ONE and QUORUM showed no throughput difference.
> ALL was just extremely slow.
> Not ONE had only half the throughput.
> ONE, TWO and THREE gave similar results.
>
> Is there any difference between 2 nodes and 3 nodes?
>
> --
>  
> twitter:@ryu_kobayashi
>



-- 
w3m


Re: Consistency Level throughput

2011-05-26 Thread Jonathan Ellis
I'm afraid I don't quite understand the question.

In general, cluster throughput = single node throughput * number of
nodes / replication factor.
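
As a rough worked example with the numbers from the question below (an
illustration, not a measurement): 6 nodes with RF = 3 gives about 6 / 3 = 2x
the throughput of a single node, whichever CL you pick; the CL you wait on
should mostly show up as latency rather than aggregate throughput.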

On Thu, May 26, 2011 at 9:39 PM, Ryu Kobayashi  wrote:
> Hi,
>
> Question of Consistency Level throughput.
>
> Environment:
> 6 nodes. Replication factor is 3.
>
> ONE and QUORUM showed no throughput difference.
> ALL was just extremely slow.
> Not ONE had only half the throughput.
> ONE, TWO and THREE gave similar results.
>
> Is there any difference between 2 nodes and 3 nodes?
>
> --
>  
> twitter:@ryu_kobayashi
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


ghost node?

2011-05-26 Thread jonathan . colby
A node with IP 10.46.108.102 was removed from the cluster several days ago  
but the cassandra logs are full of these messages!


Anyone know how to permanently remove this information? I'm beginning to  
think it is affecting the throughput of the live nodes.


INFO [FlushWriter:1] 2011-05-27 04:28:17,976 Memtable.java (line 164)  
Completed flushing  
/var/lib/cassandra/data/system/HintsColumnFamily-f-95-Data.db (63 bytes)
INFO [ScheduledTasks:1] 2011-05-27 04:29:18,386 Gossiper.java (line 437)  
FatClient /10.46.108.102 has been silent for 3ms, removing from gossip
INFO [GossipStage:1] 2011-05-27 04:30:19,902 Gossiper.java (line 610) Node  
/10.46.108.102 is now part of the cluster
INFO [ScheduledTasks:1] 2011-05-27 04:30:19,903 HintedHandOffManager.java  
(line 210) Deleting any stored hints for 10.46.108.102
INFO [GossipStage:1] 2011-05-27 04:30:19,903 StorageService.java (line 865)  
Removing token 42535295865117307932921825928971026432 for /10.46.108.102
INFO [ScheduledTasks:1] 2011-05-27 04:30:19,903 ColumnFamilyStore.java  
(line 1048) Enqueuing flush of Memtable-HintsColumnFamily@2051849391(0  
bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:30:19,904 Memtable.java (line 157)  
Writing Memtable-HintsColumnFamily@2051849391(0 bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:30:26,711 Memtable.java (line 164)  
Completed flushing  
/var/lib/cassandra/data/system/HintsColumnFamily-f-96-Data.db (63 bytes)
INFO [ScheduledTasks:1] 2011-05-27 04:31:21,420 Gossiper.java (line 437)  
FatClient /10.46.108.102 has been silent for 3ms, removing from gossip
INFO [GossipStage:1] 2011-05-27 04:32:23,098 Gossiper.java (line 610) Node  
/10.46.108.102 is now part of the cluster
INFO [ScheduledTasks:1] 2011-05-27 04:32:23,099 HintedHandOffManager.java  
(line 210) Deleting any stored hints for 10.46.108.102
INFO [GossipStage:1] 2011-05-27 04:32:23,100 StorageService.java (line 865)  
Removing token 42535295865117307932921825928971026432 for /10.46.108.102
INFO [ScheduledTasks:1] 2011-05-27 04:32:23,100 ColumnFamilyStore.java  
(line 1048) Enqueuing flush of Memtable-HintsColumnFamily@639962965(0  
bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:32:23,100 Memtable.java (line 157)  
Writing Memtable-HintsColumnFamily@639962965(0 bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:32:23,155 Memtable.java (line 164)  
Completed flushing  
/var/lib/cassandra/data/system/HintsColumnFamily-f-97-Data.db (63 bytes)
INFO [ScheduledTasks:1] 2011-05-27 04:33:24,457 Gossiper.java (line 437)  
FatClient /10.46.108.102 has been silent for 3ms, removing from gossip
INFO [GossipStage:1] 2011-05-27 04:34:25,231 Gossiper.java (line 610) Node  
/10.46.108.102 is now part of the cluster
INFO [ScheduledTasks:1] 2011-05-27 04:34:25,232 HintedHandOffManager.java  
(line 210) Deleting any stored hints for 10.46.108.102
INFO [GossipStage:1] 2011-05-27 04:34:25,233 StorageService.java (line 865)  
Removing token 42535295865117307932921825928971026432 for /10.46.108.102
INFO [ScheduledTasks:1] 2011-05-27 04:34:25,233 ColumnFamilyStore.java  
(line 1048) Enqueuing flush of Memtable-HintsColumnFamily@1211655714(0  
bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:34:25,234 Memtable.java (line 157)  
Writing Memtable-HintsColumnFamily@1211655714(0 bytes, 0 operations)
INFO [FlushWriter:1] 2011-05-27 04:34:25,290 Memtable.java (line 164)  
Completed flushing  
/var/lib/cassandra/data/system/HintsColumnFamily-f-98-Data.db (63 bytes)
INFO [ScheduledTasks:1] 2011-05-27 04:35:26,497 Gossiper.java (line 437)  
FatClient /10.46.108.102 has been silent for 3ms, removing from gossip


Consistency Level throughput

2011-05-26 Thread Ryu Kobayashi
Hi,

Question of Consistency Level throughput.

Environment:
6 nodes. Replication factor is 3.

ONE and QUORUM showed no throughput difference.
ALL was just extremely slow.
Not ONE had only half the throughput.
ONE, TWO and THREE gave similar results.

Is there any difference between 2 nodes and 3 nodes?

-- 
 
twitter:@ryu_kobayashi


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
I'm also not sure that will guarantee all space is cleaned up.  It
really depends on what you are doing inside Cassandra.  If you have
your own garbage collection that is just in some way tied to the GC run,
then it will run when it runs.

If, on the other hand, you are associating records in your storage with specific
objects in memory and using one of the post-mortem hooks (finalize or
PhantomReference) to tell you to clean up that particular record, then
it's quite possible they won't all get cleaned up.  In general, HotSpot
does not find and clean every candidate object on every GC run.  It
starts with the easiest/fastest to find and then sees what more it
thinks it needs to do to create enough memory for anticipated
near-future needs.
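
To make the post-mortem hook point concrete, here is a minimal, self-contained
sketch (my illustration, not Cassandra code) of a PhantomReference cleanup hook.
Whether and when the reference gets enqueued is entirely up to the collector,
which is exactly the non-guarantee being discussed:

import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomCleanup {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<Object>();

    public static void main(String[] args) throws InterruptedException {
        Object resource = new Object();
        // Keep a strong reference to 'ref' itself, or it could be collected
        // before it is ever enqueued.
        PhantomReference<Object> ref = new PhantomReference<Object>(resource, QUEUE);

        resource = null;   // drop the only strong reference to the referent
        System.gc();       // a hint; on HotSpot this usually runs a full collection

        // Wait (bounded) for the collector to enqueue the phantom reference.
        Reference<?> collected = QUEUE.remove(5000);
        if (collected == ref) {
            System.out.println("cleanup hook ran; safe to reclaim the record");
        } else {
            System.out.println("collector never enqueued the reference in time");
        }
    }
}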

On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis  wrote:
> In summary, system.gc works fine unless you've deliberately done
> something like setting the -XX:+DisableExplicitGC flag.
>
> On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
>  wrote:
>> So, in summary, there is no way to predictably and efficiently tell 
>> Cassandra to get rid of all of the extra space it is using on disk?
>>
>> - Original Message -
>> From: "Jeffrey Kesselman" 
>> To: user@cassandra.apache.org
>> Sent: Thursday, May 26, 2011 8:57:49 PM
>> Subject: Re: Forcing Cassandra to free up some space
>>
>> Which JVM?  Which collector?  There have been and continue to be many.
>>
>> Hotspot itself supports a number of different collectors with
>> different behaviors.   Many of them do not collect every candidate on
>> every gc, but merely the easiest ones to find.  This is why depending
>> on finalizers is a *bad* idea in java code.  They may well never get
>> run.  (Finalizer is one of a few features the Sun Java team always
>> regretted putting in Java to start with.  It has caused quite a few
>> application problems over the years)
>>
>> The really important thing is that NONE of these behaviors of the
>> collectors are guaranteed by specification not to change from version
>> to version.  Basing your code on non-specified behaviors is a good way
>> to hit mysterious failures on updates.
>>
>> For instance, in the mid 90s, IBM had a mode of their Vm called
>> "infinite heap."  it *never* garbage collected, even if you called
>> System.gc.  Instead it just threw away address space and counted on
>> the total memory needs for the life of the program being less than the
>> total addressable space of the processor.
>>
>> It was *very* fast for certain kinds of applications.
>>
>> Far from being pedantic, not depending on undocumented behavior is
>> simply good engineering.
>>
>>
>> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
>>> I've read the relevant source. While you're pedantically correct re
>>> the spec, you're wrong as to what the JVM actually does.
>>>
>>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
 Some references...

 "An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object."

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and "Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects"

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a "best effort", but "best
 effort" is *deliberately* left up to the definition of the gc
 implementor.

 I guess you missed the many lectures I have given on this subject over
 the years at Java One Conferences

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
> It's a common misunderstanding that system.gc is only a suggestion; on
> any VM you're likely to run Cassandra on, System.gc will actually
> invoke a full collection.
>
> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  
> wrote:
>> Actually this is no guarantee.   It's a common misunderstanding that
>> System.gc "forces" GC.  It does not. It is a suggestion only. The VM
>> always has the option as to when and how much it GCs
>>
>> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



 --
 It's always darkest just before you are eaten by a grue.

>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>>
>

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
You really should qualify that with "on all currently known versions
of Hotspot".

Not trying to give you grief, really, but it's an important limitation
to understand.

On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis  wrote:
> In summary, system.gc works fine unless you've deliberately done
> something like setting the -XX:+DisableExplicitGC flag.
>
> On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
>  wrote:
>> So, in summary, there is no way to predictably and efficiently tell 
>> Cassandra to get rid of all of the extra space it is using on disk?
>>
>> - Original Message -
>> From: "Jeffrey Kesselman" 
>> To: user@cassandra.apache.org
>> Sent: Thursday, May 26, 2011 8:57:49 PM
>> Subject: Re: Forcing Cassandra to free up some space
>>
>> Which JVM?  Which collector?  There have been and continue to be many.
>>
>> Hotspot itself supports a number of different collectors with
>> different behaviors.   Many of them do not collect every candidate on
>> every gc, but merely the easiest ones to find.  This is why depending
>> on finalizers is a *bad* idea in java code.  They may well never get
>> run.  (Finalizer is one of a few features the Sun Java team always
>> regretted putting in Java to start with.  It has caused quite a few
>> application problems over the years)
>>
>> The really important thing is that NONE of these behaviors of the
>> collectors are guaranteed by specification not to change from version
>> to version.  Basing your code on non-specified behaviors is a good way
>> to hit mysterious failures on updates.
>>
>> For instance, in the mid 90s, IBM had a mode of their Vm called
>> "infinite heap."  it *never* garbage collected, even if you called
>> System.gc.  Instead it just threw away address space and counted on
>> the total memory needs for the life of the program being less than the
>> total addressable space of the processor.
>>
>> It was *very* fast for certain kinds of applications.
>>
>> Far from being pedantic, not depending on undocumented behavior is
>> simply good engineering.
>>
>>
>> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
>>> I've read the relevant source. While you're pedantically correct re
>>> the spec, you're wrong as to what the JVM actually does.
>>>
>>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
 Some references...

 "An object enters an unreachable state when no more strong references
 to it exist. When an object is unreachable, it is a candidate for
 collection. Note the wording: Just because an object is a candidate
 for collection doesn't mean it will be immediately collected. The JVM
 is free to delay collection until there is an immediate need for the
 memory being consumed by the object."

 http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

 and "Calling the gc method suggests that the Java Virtual Machine
 expend effort toward recycling unused objects"

 http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

 It goes on to say that the VM will make a "best effort", but "best
 effort" is *deliberately* left up to the definition of the gc
 implementor.

 I guess you missed the many lectures I have given on this subject over
 the years at Java One Conferences

 On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
> It's a common misunderstanding that system.gc is only a suggestion; on
> any VM you're likely to run Cassandra on, System.gc will actually
> invoke a full collection.
>
> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  
> wrote:
>> Actually this is no guarantee.   It's a common misunderstanding that
>> System.gc "forces" GC.  It does not. It is a suggestion only. The VM
>> always has the option as to when and how much it GCs
>>
>> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



 --
 It's always darkest just before you are eaten by a grue.

>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>>
>>
>> --
>> It's always darkest just before you are eaten by a grue.
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
It's always darkest just before you are eaten by a grue.


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
In summary, system.gc works fine unless you've deliberately done
something like setting the -XX:+DisableExplicitGC flag.
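
A quick way to convince yourself on your own JVM (a throwaway sketch, assuming
a stock HotSpot with default flags; run it once normally and once with
-XX:+DisableExplicitGC and watch the "after" number stop dropping):

public class ExplicitGcDemo {
    public static void main(String[] args) {
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[1024 * 1024];    // ~64 MB of short-lived data
        }
        garbage = null;                            // now unreachable
        long before = used();
        System.gc();                               // full collection on stock HotSpot
        System.out.printf("used before gc=%d MB, after gc=%d MB%n",
                before / (1024 * 1024), used() / (1024 * 1024));
    }

    static long used() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }
}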

On Thu, May 26, 2011 at 5:58 PM, Konstantin  Naryshkin
 wrote:
> So, in summary, there is no way to predictably and efficiently tell Cassandra 
> to get rid of all of the extra space it is using on disk?
>
> - Original Message -
> From: "Jeffrey Kesselman" 
> To: user@cassandra.apache.org
> Sent: Thursday, May 26, 2011 8:57:49 PM
> Subject: Re: Forcing Cassandra to free up some space
>
> Which JVM?  Which collector?  There have been and continue to be many.
>
> Hotspot itself supports a number of different collectors with
> different behaviors.   Many of them do not collect every candidate on
> every gc, but merely the easiest ones to find.  This is why depending
> on finalizers is a *bad* idea in java code.  They may well never get
> run.  (Finalizer is one of a few features the Sun Java team always
> regretted putting in Java to start with.  It has caused quite a few
> application problems over the years)
>
> The really important thing is that NONE of these behaviors of the
> collectors are guaranteed by specification not to change from version
> to version.  Basing your code on non-specified behaviors is a good way
> to hit mysterious failures on updates.
>
> For instance, in the mid 90s, IBM had a mode of their Vm called
> "infinite heap."  it *never* garbage collected, even if you called
> System.gc.  Instead it just threw away address space and counted on
> the total memory needs for the life of the program being less than the
> total addressable space of the processor.
>
> It was *very* fast for certain kinds of applications.
>
> Far from being pedantic, not depending on undocumented behavior is
> simply good engineering.
>
>
> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
>> I've read the relevant source. While you're pedantically correct re
>> the spec, you're wrong as to what the JVM actually does.
>>
>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
>>> Some references...
>>>
>>> "An object enters an unreachable state when no more strong references
>>> to it exist. When an object is unreachable, it is a candidate for
>>> collection. Note the wording: Just because an object is a candidate
>>> for collection doesn't mean it will be immediately collected. The JVM
>>> is free to delay collection until there is an immediate need for the
>>> memory being consumed by the object."
>>>
>>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>>
>>> and "Calling the gc method suggests that the Java Virtual Machine
>>> expend effort toward recycling unused objects"
>>>
>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>>
>>> It goes on to say that the VM will make a "best effort", but "best
>>> effort" is *deliberately* left up to the definition of the gc
>>> implementor.
>>>
>>> I guess you missed the many lectures I have given on this subject over
>>> the years at Java One Conferences
>>>
>>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  
 wrote:
> Actually this is no guarantee.   It's a common misunderstanding that
> System.gc "forces" GC.  It does not. It is a suggestion only. The VM
> always has the option as to when and how much it GCs
>
> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com

>>>
>>>
>>>
>>> --
>>> It's always darkest just before you are eaten by a grue.
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jake Luciani
"Is there a way for me to make (or even gently suggest to) Cassandra that it
may be a good time to free up some space?"

Disregarding what's been said and until ref-counting is implemented this is
a useful tool to gently suggest cleanup:

https://github.com/ceocoder/jmxgc
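
A tool like that presumably just invokes the platform java.lang:type=Memory
MBean remotely; for reference, a minimal sketch of doing it by hand (my own
illustration, not from jmxgc; it assumes the 0.7 default JMX port 8080 and no
JMX authentication):

import java.lang.management.MemoryMXBean;
import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RemoteGc {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            MemoryMXBean memory = JMX.newMXBeanProxy(
                    mbs, new ObjectName("java.lang:type=Memory"), MemoryMXBean.class);
            memory.gc();   // same effect as calling System.gc() inside the Cassandra JVM
        } finally {
            connector.close();
        }
    }
}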



On Thu, May 26, 2011 at 2:09 PM, Konstantin Naryshkin
wrote:

> I have a basic understanding of how Cassandra handles the file system
> (flushes Memtables out to SSTables, SSTables get compacted) and I
> understand that old files are only deleted when a node is restarted, when
> Java does a GC, or when Cassandra feels like it is running out of space.
>
> My question is, is there some way for us to hurry the process along? We
> have data that we do a lot of inserts into and then delete the data
> several hours later. We would like it if we could free up disk space (since
> our disks, though large, are shared with other applications). So far, the
> action sequence to accomplish this is:
> nodetool flush -> nodetool repair -> nodetool compact -> ??
>
> Is there a way for me to make (or even gently suggest to) Cassandra that it
> may be a good time to free up some space?
>



-- 
http://twitter.com/tjake


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Not if it depends on a side effect of garbage collection, such as finalizers.

It ought to publish its own JMX control to cause that to happen.



On Thu, May 26, 2011 at 6:58 PM, Konstantin  Naryshkin
 wrote:
> So, in summary, there is no way to predictably and efficiently tell Cassandra 
> to get rid of all of the extra space it is using on disk?
>
> - Original Message -
> From: "Jeffrey Kesselman" 
> To: user@cassandra.apache.org
> Sent: Thursday, May 26, 2011 8:57:49 PM
> Subject: Re: Forcing Cassandra to free up some space
>
> Which JVM?  Which collector?  There have been and continue to be many.
>
> Hotspot itself supports a number of different collectors with
> different behaviors.   Many of them do not collect every candidate on
> every gc, but merely the easiest ones to find.  This is why depending
> on finalizers is a *bad* idea in java code.  They may well never get
> run.  (Finalizer is one of a few features the Sun Java team always
> regretted putting in Java to start with.  It has caused quite a few
> application problems over the years)
>
> The really important thing is that NONE of these behaviors of the
> collectors are guaranteed by specification not to change from version
> to version.  Basing your code on non-specified behaviors is a good way
> to hit mysterious failures on updates.
>
> For instance, in the mid 90s, IBM had a mode of their Vm called
> "infinite heap."  it *never* garbage collected, even if you called
> System.gc.  Instead it just threw away address space and counted on
> the total memory needs for the life of the program being less than the
> total addressable space of the processor.
>
> It was *very* fast for certain kinds of applications.
>
> Far from being pedantic, not depending on undocumented behavior is
> simply good engineering.
>
>
> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
>> I've read the relevant source. While you're pedantically correct re
>> the spec, you're wrong as to what the JVM actually does.
>>
>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
>>> Some references...
>>>
>>> "An object enters an unreachable state when no more strong references
>>> to it exist. When an object is unreachable, it is a candidate for
>>> collection. Note the wording: Just because an object is a candidate
>>> for collection doesn't mean it will be immediately collected. The JVM
>>> is free to delay collection until there is an immediate need for the
>>> memory being consumed by the object."
>>>
>>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>>
>>> and "Calling the gc method suggests that the Java Virtual Machine
>>> expend effort toward recycling unused objects"
>>>
>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>>
>>> It goes on to say that the VM will make a "best effort", but "best
>>> effort" is *deliberately* left up to the definition of the gc
>>> implementor.
>>>
>>> I guess you missed the many lectures I have given on this subject over
>>> the years at Java One Conferences
>>>
>>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
 It's a common misunderstanding that system.gc is only a suggestion; on
 any VM you're likely to run Cassandra on, System.gc will actually
 invoke a full collection.

 On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  
 wrote:
> Actually this is no guarantee.   It's a common misunderstanding that
> System.gc "forces" GC.  It does not. It is a suggestion only. The VM
> always has the option as to when and how much it GCs
>
> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com

>>>
>>>
>>>
>>> --
>>> It's always darkest just before you are eaten by a grue.
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>



-- 
It's always darkest just before you are eaten by a grue.


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Konstantin Naryshkin
So, in summary, there is no way to predictably and efficiently tell Cassandra 
to get rid of all of the extra space it is using on disk?

- Original Message -
From: "Jeffrey Kesselman" 
To: user@cassandra.apache.org
Sent: Thursday, May 26, 2011 8:57:49 PM
Subject: Re: Forcing Cassandra to free up some space

Which JVM?  Which collector?  There have been and continue to be many.

Hotspot itself supports a number of different collectors with
different behaviors.   Many of them do not collect every candidate on
every gc, but merely the easiest ones to find.  This is why depending
on finalizers is a *bad* idea in java code.  They may well never get
run.  (Finalizer is one of a few features the Sun Java team always
regretted putting in Java to start with.  It has caused quite a few
application problems over the years)

The really important thing is that NONE of these behaviors of the
collectors are guaranteed by specification not to change from version
to version.  Basing your code on non-specified behaviors is a good way
to hit mysterious failures on updates.

For instance, in the mid 90s, IBM had a mode of their Vm called
"infinite heap."  it *never* garbage collected, even if you called
System.gc.  Instead it just threw away address space and counted on
the total memory needs for the life of the program being less than the
total addressable space of the processor.

It was *very* fast for certain kinds of applications.

Far from being pedantic, not depending on undocumented behavior is
simply good engineering.


On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
> I've read the relevant source. While you're pedantically correct re
> the spec, you're wrong as to what the JVM actually does.
>
> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
>> Some references...
>>
>> "An object enters an unreachable state when no more strong references
>> to it exist. When an object is unreachable, it is a candidate for
>> collection. Note the wording: Just because an object is a candidate
>> for collection doesn't mean it will be immediately collected. The JVM
>> is free to delay collection until there is an immediate need for the
>> memory being consumed by the object."
>>
>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>
>> and "Calling the gc method suggests that the Java Virtual Machine
>> expend effort toward recycling unused objects"
>>
>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>
>> It goes on to say that the VM will make a "best effort", but "best
>> effort" is *deliberately* left up to the definition of the gc
>> implementor.
>>
>> I guess you missed the many lectures I have given on this subject over
>> the years at Java One Conferences
>>
>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
>>> It's a common misunderstanding that system.gc is only a suggestion; on
>>> any VM you're likely to run Cassandra on, System.gc will actually
>>> invoke a full collection.
>>>
>>> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
 Actually this is no guarantee.   It's a common misunderstanding that
 System.gc "forces" GC.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it GCs

 On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:

>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>>
>>
>> --
>> It's always darkest just before you are eaten by a grue.
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
It's always darkest just before you are eaten by a grue.


Re: Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread jonathan . colby
Hi Aaron - Thanks a lot for the great feedback. I'll try your suggestion on
removing it as an endpoint with JMX.


On , aaron morton  wrote:
Off the top of my head the simple way to stop invalid end point state
being passed around is a full cluster stop. Obviously that's not an option.
The problem is that if one node has the IP it will share it around with the
others.

Out of interest take a look at the o.a.c.db.FailureDetector MBean
getAllEndpointStates() function. That returns the end point state held by
the Gossiper. I think you should see the phantom IP listed in there.

If it's only on some nodes *perhaps* restarting the node with the JVM
option -Dcassandra.load_ring_state=false *may* help. That will stop the
node from loading its saved ring state and force it to get it via gossip.
Again, if there are other nodes with the phantom IP it may just get it
again.

I'll do some digging and try to get back to you. This pops up from time
to time and, thinking out loud, I wonder if it would be possible to add a
new application state that purges an IP from the ring, e.g.
VersionedValue.STATUS_PURGED, that works with a TTL so it goes through X
number of gossip rounds and then disappears.

Hope that helps.

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 May 2011, at 19:58, Jonathan Colby wrote:

> @Aaron -
>
> Unfortunately I'm still seeing messages like: " is down", removing from
> gossip, although not with the same frequency.
>
> And repair/move jobs don't seem to try to stream data to the removed
> node anymore.
>
> Anyone know how to totally purge any stored gossip/endpoint data on
> nodes that were removed from the cluster, or what might be happening here
> otherwise?
>
> On May 26, 2011, at 9:10 AM, aaron morton wrote:
>
>> cool. I was going to suggest that but as you already had the move
>> running I thought it may be a little drastic.
>>
>> Did it show any progress ? If the IP address is not responding there
>> should have been some sort of error.
>>
>> Cheers
>>
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 26 May 2011, at 15:28, jonathan.co...@gmail.com wrote:
>>
>>> Seems like it had something to do with stale endpoint information. I
>>> did a rolling restart of the whole cluster and that seemed to trigger the
>>> nodes to remove the node that was decommissioned.
>>>
>>> On , aaron morton aa...@thelastpickle.com> wrote:
 Is it showing progress ? It may just be a problem with the
 information printed out.

 Can you check from the other nodes in the cluster to see if they are
 receiving the stream ?

 cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 26 May 2011, at 00:42, Jonathan Colby wrote:

> I recently removed a node (with decommission) from our cluster.
>
> I added a couple new nodes and am now trying to rebalance the
> cluster using nodetool move.
>
> However, netstats shows that the node being "moved" is trying to
> stream data to the node that I already decommissioned yesterday.
>
> The removed node was powered-off, taken out of dns, its IP is not
> even pingable. It was never a seed either.
>
> This is cassandra 0.7.5 on 64bit linux. How do I tell the cluster
> that this node is gone? Gossip should have detected this. The ring
> commands shows the correct cluster IPs.
>
> Here is a portion of netstats. 10.46.108.102 is the node which was
> removed.
>
> Mode: Leaving: streaming data to other nodes
> Streaming to: /10.46.108.102
> /var/lib/cassandra/data/DFS/main-f-1064-Data.db/(4681027,5195491),(5195491,15308570),(15308570,15891710),(16336750,20558705),(20558705,29112203),(29112203,36279329),(36465942,36623223),(36740457,37227058),(37227058,42206994),(42206994,47380294),(47635053,47709813),(47709813,48353944),(48621287,49406499),(53330048,53571312),(53571312,54153922),(54153922,59857615),(59857615,61029910),(61029910,61871509),(62190800,62498605),(62824281,62964830),(63511604,64353114),(64353114,64760400),(65174702,65919771),(65919771,66435630),(81440029,81725949),(81725949,83313847),(83313847,83908709),(88983863,89237303),(89237303,89934199),(89934199,97
> ...
> 5693491,14795861666),(14795861666,14796105318),(14796105318,14796366886),(14796699825,14803874941),(14803874941,14808898331),(14808898331,1481

Re: EC2 node adding trouble

2011-05-26 Thread aaron morton
This is the *most* useful page on the wiki 
http://wiki.apache.org/cassandra/Operations 


Hope that helps. 

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 27 May 2011, at 02:06, Marcus Bointon wrote:

> On 26 May 2011, at 15:21, Sasha Dolgy wrote:
> 
>> Turn the node off, remove the node from the ring using nodetool and
>> removetoken  i've found this to be the best problem-free way.
>> Maybe it's better now ...
>> http://blog.sasha.dolgy.com/2011/03/apache-cassandra-nodetool.html
> 
> So I'd need to have at least replication=2 in order to do that safely? Your 
> article makes it sound like draining/decommission doesn't work?
> 
> Has anyone automated node addition/removal using chef or similar?
> 
> Marcus



Re: EC2 node adding trouble

2011-05-26 Thread aaron morton
This ticket may be just the ticket :)

https://issues.apache.org/jira/browse/CASSANDRA-2452

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 27 May 2011, at 01:16, Sasha Dolgy wrote:

> As an aside, you can also use that command to pull meta-data about
> instances in AWS.  I have implemented this to maintain a list of seed
> nodes.  This way, when a new instance is brought online, the default
> cassandra.yaml is `enhanced` to contain a dynamic list of valid seeds,
> proper hostname and a few other bits of useful information.
> 
> Finally, if you aren't using a single security group for all of your
> cassandra instances, maybe this may be of help to you.  When we add
> new nodes to our ring, we add them to a single cassandra security
> group.  No messing about with security groups per instance...
> 
> -sd
> 
> On Thu, May 26, 2011 at 2:36 PM, Marcus Bointon
>  wrote:
>> Thanks for all your helpful suggestions - I've now got it working. It was 
>> down to a combination of things.
>> 
>> 1. A missing rule in a security group
>> 2. A missing DNS name for the new node, so its default name was defaulting 
>> to localhost
>> 3. Google DNS caching the failed DNS lookup for the full duration of the 
>> SOA's TTL
>> 
>> In order to avoid the whole problem with assigning IPs using the 
>> internal/external trick and using up elastic IPs, I found this service which 
>> I'd not seen before: 
>> http://www.ducea.com/2009/06/01/howto-update-dns-hostnames-automatically-for-your-amazon-ec2-instances/
>> 
>> This means you can reliably set (and reset as necessary) a listen address 
>> with this command:
>> 
>> sed -i "s/^listen_address:.*/listen_address: `curl 
>> http://169.254.169.254/latest/meta-data/local-ipv4`/" 
>> /etc/cassandra/cassandra.yaml
>> 
>> It's not quite as good as having a true dynamic hostname, but at least you 
>> can drop it in a startup script and forget it.
>> 
>> Marcus
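
If you'd rather resolve the address from inside a JVM-side tool than via
sed/curl, the lookup is just an HTTP GET against the same metadata URL (a small
sketch, my own illustration):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class Ec2LocalIp {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://169.254.169.254/latest/meta-data/local-ipv4");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        try {
            // The instance's private IPv4, e.g. what you'd put in listen_address.
            System.out.println(in.readLine());
        } finally {
            in.close();
        }
    }
}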



Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread aaron morton
Off the top of my head the simple way to stop invalid end point state being 
passed around is a full cluster stop. Obviously that's not an option. The 
problem is that if one node has the IP it will share it around with the others.  

Out of interest take a look at the o.a.c.db.FailureDetector MBean 
getAllEndpointStates() function. That returns the end point state held by the 
Gossiper. I think you should see the Phantom IP listed in there. 
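
If you want to script that check, here is a minimal JMX sketch (my own
illustration; it assumes the 0.7 default JMX port 8080 with no auth, that
getAllEndpointStates() shows up as the attribute "AllEndpointStates", and it
matches the MBean domain with a wildcard since I'm not sure of the exact
package name):

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class DumpEndpointStates {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Match the FailureDetector MBean whichever domain it is registered under.
            Set<ObjectName> names = mbs.queryNames(
                    new ObjectName("org.apache.cassandra.*:type=FailureDetector"), null);
            for (ObjectName name : names) {
                // Grep this output for the phantom IP to see which nodes still gossip it.
                System.out.println(mbs.getAttribute(name, "AllEndpointStates"));
            }
        } finally {
            connector.close();
        }
    }
}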

If it's only on some nodes *perhaps* restarting the node with the JVM option 
-Dcassandra.load_ring_state=false *may* help. That will stop the node from 
loading its saved ring state and force it to get it via gossip. Again, if there 
are other nodes with the phantom IP it may just get it again. 

I'll do some digging and try to get back to you. This pops up from time to time 
and, thinking out loud, I wonder if it would be possible to add a new application 
state that purges an IP from the ring, e.g. VersionedValue.STATUS_PURGED, that 
works with a TTL so it goes through X number of gossip rounds and then 
disappears.  

Hope that helps. 

   
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 May 2011, at 19:58, Jonathan Colby wrote:

> @Aaron -
> 
> Unfortunately I'm still seeing messages like:  " is down", 
> removing from gossip, although not with the same frequency.  
> 
> And repair/move jobs don't seem to try to stream data to the removed node 
> anymore.
> 
> Anyone know how to totally purge any stored gossip/endpoint data on nodes 
> that were removed from the cluster, or what might be happening here 
> otherwise?
> 
> 
> On May 26, 2011, at 9:10 AM, aaron morton wrote:
> 
>> cool. I was going to suggest that but as you already had the move running I 
>> thought it may be a little drastic. 
>> 
>> Did it show any progress ? If the IP address is not responding there should 
>> have been some sort of error. 
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 26 May 2011, at 15:28, jonathan.co...@gmail.com wrote:
>> 
>>> Seems like it had something to do with stale endpoint information. I did a 
>>> rolling restart of the whole cluster and that seemed to trigger the nodes 
>>> to remove the node that was decommissioned.
>>> 
>>> On , aaron morton  wrote:
 Is it showing progress ? It may just be a problem with the information 
 printed out.
 
 
 
 Can you check from the other nodes in the cluster to see if they are 
 receiving the stream ?
 
 
 
 cheers
 
 
 
 -
 
 Aaron Morton
 
 Freelance Cassandra Developer
 
 @aaronmorton
 
 http://www.thelastpickle.com
 
 
 
 On 26 May 2011, at 00:42, Jonathan Colby wrote:
 
 
 
> I recently removed a node (with decommission) from our cluster.
 
> 
 
> I added a couple new nodes and am now trying to rebalance the cluster 
> using nodetool move.
 
> 
 
> However,  netstats shows that the node being "moved" is trying to stream 
> data to the node that I already decommissioned yesterday.
 
> 
 
> The removed node was powered-off, taken out of dns, its IP is not even 
> pingable.   It was never a seed either.
 
> 
 
> This is cassandra 0.7.5 on 64bit linux.   How do I tell the cluster that 
> this node is gone?  Gossip should have detected this.  The ring commands 
> shows the correct cluster IPs.
 
> 
 
> Here is a portion of netstats. 10.46.108.102 is the node which was 
> removed.
 
> 
 
> Mode: Leaving: streaming data to other nodes
 
> Streaming to: /10.46.108.102
 
> /var/lib/cassandra/data/DFS/main-f-1064-Data.db/(4681027,5195491),(5195491,15308570),(15308570,15891710),(16336750,20558705),(20558705,29112203),(29112203,36279329),(36465942,36623223),(36740457,37227058),(37227058,42206994),(42206994,47380294),(47635053,47709813),(47709813,48353944),(48621287,49406499),(53330048,53571312),(53571312,54153922),(54153922,59857615),(59857615,61029910),(61029910,61871509),(62190800,62498605),(62824281,62964830),(63511604,64353114),(64353114,64760400),(65174702,65919771),(65919771,66435630),(81440029,81725949),(81725949,83313847),(83313847,83908709),(88983863,89237303),(89237303,89934199),(89934199,97
 
> ...
 
> 5693491,14795861666),(14795861666,14796105318),(14796105318,14796366886),(14796699825,14803874941),(14803874941,14808898331),(14808898331,14811670699),(14811670699,14815125177),(14815125177,14819765003),(14820229433,14820858266)
 
>   progress=280574376402/12434049900 - 2256%
 
> .
 
> 
 
> 
 
> Note 10.46.108.102 is NOT part of the ring.
 
> 
 
> Address Status State   LoadOwnsToken
 

Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Which JVM?  Which collector?  There have been and continue to be many.

Hotspot itself supports a number of different collectors with
different behaviors.   Many of them do not collect every candidate on
every gc, but merely the easiest ones to find.  This is why depending
on finalizers is a *bad* idea in java code.  They may well never get
run.  (Finalizer is one of a few features the Sun Java team always
regretted putting in Java to start with.  It has caused quite a few
application problems over the years)

The really important thing is that NONE of these behaviors of the
collectors are guaranteed by specification not to change from version
to version.  Basing your code on non-specified behaviors is a good way
to hit mysterious failures on updates.

For instance, in the mid 90s, IBM had a mode of their Vm called
"infinite heap."  it *never* garbage collected, even if you called
System.gc.  Instead it just threw away address space and counted on
the total memory needs for the life of the program being less than the
total addressable space of the processor.

It was *very* fast for certain kinds of applications.

Far from being pedantic, not depending on undocumented behavior is
simply good engineering.


On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis  wrote:
> I've read the relevant source. While you're pedantically correct re
> the spec, you're wrong as to what the JVM actually does.
>
> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
>> Some references...
>>
>> "An object enters an unreachable state when no more strong references
>> to it exist. When an object is unreachable, it is a candidate for
>> collection. Note the wording: Just because an object is a candidate
>> for collection doesn't mean it will be immediately collected. The JVM
>> is free to delay collection until there is an immediate need for the
>> memory being consumed by the object."
>>
>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>
>> and "Calling the gc method suggests that the Java Virtual Machine
>> expend effort toward recycling unused objects"
>>
>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>
>> It goes on to say that the VM will make a "best effort", but "best
>> effort" is *deliberately* left up to the definition of the gc
>> implementor.
>>
>> I guess you missed the many lectures I have given on this subject over
>> the years at Java One Conferences
>>
>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
>>> It's a common misunderstanding that system.gc is only a suggestion; on
>>> any VM you're likely to run Cassandra on, System.gc will actually
>>> invoke a full collection.
>>>
>>> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
 Actually this is no guarantee.   It's a common misunderstanding that
 System.gc "forces" GC.  It does not. It is a suggestion only. The VM always
 has the option as to when and how much it GCs

 On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:

>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of DataStax, the source for professional Cassandra support
>>> http://www.datastax.com
>>>
>>
>>
>>
>> --
>> It's always darkest just before you are eaten by a grue.
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
It's always darkest just before you are eaten by a grue.


Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
We've applied a fix to the 0.7 branch in
https://issues.apache.org/jira/browse/CASSANDRA-2714.  The patch
probably applies to 0.7.6 as well.

On Thu, May 26, 2011 at 11:36 AM, Flavio Baronti
 wrote:
> I tried the manual copy you suggest, but the SystemTable.checkHealth()
> function
> complains it can't load the system files. Log follows, I will gather some
> more
> info and create a ticket as soon as possible.
>
>  INFO [main] 2011-05-26 18:25:36,147 AbstractCassandraDaemon.java Logging
> initialized
>  INFO [main] 2011-05-26 18:25:36,172 AbstractCassandraDaemon.java Heap size:
> 4277534720/4277534720
>  INFO [main] 2011-05-26 18:25:36,174 CLibrary.java JNA not found. Native
> methods will be disabled.
>  INFO [main] 2011-05-26 18:25:36,190 DatabaseDescriptor.java Loading
> settings from file:/C:/Cassandra/conf/hscassandra9170.yaml
>  INFO [main] 2011-05-26 18:25:36,344 DatabaseDescriptor.java DiskAccessMode
> 'auto' determined to be mmap, indexAccessMode is mmap
>  INFO [main] 2011-05-26 18:25:36,532 SSTableReader.java Opening
> G:\Cassandra\data\system\Schema-f-2746
>  INFO [main] 2011-05-26 18:25:36,577 SSTableReader.java Opening
> G:\Cassandra\data\system\Schema-f-2729
>  INFO [main] 2011-05-26 18:25:36,590 SSTableReader.java Opening
> G:\Cassandra\data\system\Schema-f-2745
>  INFO [main] 2011-05-26 18:25:36,599 SSTableReader.java Opening
> G:\Cassandra\data\system\Migrations-f-2167
>  INFO [main] 2011-05-26 18:25:36,600 SSTableReader.java Opening
> G:\Cassandra\data\system\Migrations-f-2131
>  INFO [main] 2011-05-26 18:25:36,602 SSTableReader.java Opening
> G:\Cassandra\data\system\Migrations-f-1041
>  INFO [main] 2011-05-26 18:25:36,603 SSTableReader.java Opening
> G:\Cassandra\data\system\Migrations-f-1695
> ERROR [main] 2011-05-26 18:25:36,634 AbstractCassandraDaemon.java Fatal
> exception during initialization
> org.apache.cassandra.config.ConfigurationException: Found system table
> files, but they couldn't be loaded. Did you change the partitioner?
>        at
> org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:236)
>        at
> org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:127)
>        at
> org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>        at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>
>
> On 5/26/2011 6:04 PM, Jonathan Ellis wrote:
>>
>> Sounds like a legitimate bug, although looking through the code I'm
>> not sure what would cause a tight retry loop on migration
>> announce/rectify. Can you create a ticket at
>> https://issues.apache.org/jira/browse/CASSANDRA ?
>>
>> As a workaround, I would try manually copying the Migrations and
>> Schema sstable files from the system keyspace of the live node, then
>> restart the recovering one.
>>
>> On Thu, May 26, 2011 at 9:27 AM, Flavio Baronti
>>   wrote:
>>>
>>> I can't seem to recover a failed node on a database where I did
>>> many updates to the schema.
>>>
>>> I have a small cluster with 2 nodes, around 1000 CF (I know it's a lot,
>>> but
>>> it can't be changed right now), and ReplicationFactor=2.
>>> I shut down a node and cleaned its data entirely, then tried to bring it
>>> back up. The node starts fetching schema updates from the live node, but
>>> the
>>> operation fails halfway with an OOME.
>>> After some investigation, what I found is that:
>>>
>>> - I have a lot of schema updates (there are 2067 rows in the
>>> system.Schema
>>> CF).
>>> - The live node loads migrations 1-1000, and sends them to the recovering
>>> node (Migration.getLocalMigrations())
>>> - Soon afterwards, the live node checks the schema version on the
>>> recovering
>>> node and finds it has moved by a little - say it has applied the first 3
>>> migrations. It then loads migrations 3-1003, and sends them to the node.
>>> - This process is repeated very quickly (sends migrations 6-1006, 9-1009,
>>> etc).
>>>
>>> Analyzing the memory dump and the logs, it looks like each of these 1000
>>> migration blocks are composed in a single message and sent to the
>>> OutboundTcpConnection queue. However, since the schema is big, the
>>> messages
>>> occupy a lot of space, and are built faster than the connection can send
>>> them. Therefore, they accumulate in OutboundTcpConnection.queue, until
>>> memory is completely filled.
>>>
>>> Any suggestions? Can I change something to make this work, apart from
>>> reducing the number of CFs?
>>>
>>> Flavio
>>>
>>
>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
I've read the relevant source. While you're pedantically correct re
the spec, you're wrong as to what the JVM actually does.

On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman  wrote:
> Some references...
>
> "An object enters an unreachable state when no more strong references
> to it exist. When an object is unreachable, it is a candidate for
> collection. Note the wording: Just because an object is a candidate
> for collection doesn't mean it will be immediately collected. The JVM
> is free to delay collection until there is an immediate need for the
> memory being consumed by the object."
>
> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>
> and "Calling the gc method suggests that the Java Virtual Machine
> expend effort toward recycling unused objects"
>
> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>
> It goes on to say that the VM will make a "best effort", but "best
> effort" is *deliberately* left up to the definition of the gc
> implementor.
>
> I guess you missed the many lectures I have given on this subject over
> the years at Java One Conferences
>
> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
>> It's a common misunderstanding that system.gc is only a suggestion; on
>> any VM you're likely to run Cassandra on, System.gc will actually
>> invoke a full collection.
>>
>> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
>>> Actually this is no guarantee.   It's a common misunderstanding that
>>> System.gc "forces" GC.  It does not. It is a suggestion only. The VM always
>>> has the option as to when and how much it GCs
>>>
>>> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Some references...

"An object enters an unreachable state when no more strong references
to it exist. When an object is unreachable, it is a candidate for
collection. Note the wording: Just because an object is a candidate
for collection doesn't mean it will be immediately collected. The JVM
is free to delay collection until there is an immediate need for the
memory being consumed by the object."

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

and "Calling the gc method suggests that the Java Virtual Machine
expend effort toward recycling unused objects"

http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()

It goes on to say that the VM will make a "best effort", but "best
effort" is *deliberately* left up to the definition of the gc
implementor.

I guess you missed the many lectures I have given on this subject over
the years at Java One Conferences

On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
> It's a common misunderstanding that system.gc is only a suggestion; on
> any VM you're likely to run Cassandra on, System.gc will actually
> invoke a full collection.
>
> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
>> Actually this is no guarantee.   It's a common misunderstanding that
>> System.gc "forces" GC.  It does not. It is a suggestion only. The VM always
>> has the option as to when and how much it GCs
>>
>> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
It's always darkest just before you are eaten by a grue.


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
I'm sorry.  This was my business at Sun.  You are certainly wrong about
the Hotspot VM.

See this chapter of my book

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394

On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis  wrote:
> It's a common misunderstanding that system.gc is only a suggestion; on
> any VM you're likely to run Cassandra on, System.gc will actually
> invoke a full collection.
>
> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
>> Actually this is no guarantee. It's a common misunderstanding that
>> System.gc "forces" gc. It does not. It is a suggestion only. The VM always
>> has the option as to when and how much it GCs.
>>
>> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>



-- 
It's always darkest just before you are eaten by a grue.


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
It's a common misunderstanding that System.gc is only a suggestion; on
any VM you're likely to run Cassandra on, System.gc will actually
invoke a full collection.

On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman  wrote:
> Actually this is no guarantee. It's a common misunderstanding that
> System.gc "forces" gc. It does not. It is a suggestion only. The VM always
> has the option as to when and how much it GCs.
>
> On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jeffrey Kesselman
Actually this is no guarantee. It's a common misunderstanding that
System.gc "forces" gc. It does not. It is a suggestion only. The VM always
has the option as to when and how much it GCs.
 On May 26, 2011 2:51 PM, "Jonathan Ellis"  wrote:


Re: PHP CQL Driver

2011-05-26 Thread Kwasi Gyasi - Agyei
yep, works perfectly @ http://caqel.deadcafe.org/

I will try my luck @ phpcassa.

Thanks for your time gentlemen.

On Thu, May 26, 2011 at 8:59 PM, Sasha Dolgy  wrote:

> maybe you'd have more luck discussing this on the phpcassa list?
> https://groups.google.com/forum/#!forum/phpcassa
>
> more experience there with PHP and Cassandra ...
>
> Are you able to validate the query works when not using PHP?
>
> On Thu, May 26, 2011 at 8:51 PM, Kwasi Gyasi - Agyei
>  wrote:
> > got system in debug mode
> >
> > the following query fails
> > ---
> >
> > CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY, monkey ) WITH comparator
> =
> > text AND default_validation = text
> >
> > PHP error reads
> > -
> >
> > #0
> >
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra_execute_cql_query_result.php(52):
> > TBase->_read('Cassandra_execu...', Array, Object(TBinaryProtocol)) #1
> >
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1771):
> >
> cassandra_Cassandra_execute_cql_query_result->read(Object(TBinaryProtocol))
> > #2
> >
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1731):
> > CassandraClient->recv_execute_cql_query() #3
> > /Volumes/DATA/Project/libs/php/phpCQL/test/index.php(34):
> > CassandraClient->execute_cql_query('CREATE COLUMNFA...', 2) #4 {main}
> >
> > Cassandra logs read
> > --
> >
> > DEBUG 20:48:10,659 Disseminating load info ...
> > DEBUG 20:49:10,661 Disseminating load info ...
> > DEBUG 20:49:22,867 CQL statement type: USE
> > DEBUG 20:49:22,870 logged out: #
> >
> >
> > here is the code I'm using to test
> > 
> >
> > phpCQLAutoloader::register();
> >
> > $socketPool  = new TSocketPool();
> > $socketPool->addServer( "127.0.0.1", 9160 );
> > $socketPool->setDebug( true );
> >
> > $framedTransport  = new TFramedTransport( $socketPool, true, true );
> > $bufferedProtocol = new TBinaryProtocol( $framedTransport, true, true );
> > //new TBinaryProtocolAccelerated( $framedTransport );
> > $cassandraClient  = new CassandraClient( $bufferedProtocol,
> > $bufferedProtocol );
> >
> > try{
> >
> > echo "opening connection ";
> > $framedTransport->open();
> >
> > try{
> >
> > echo "settign keyspace to use ";
> > $result = $cassandraClient->execute_cql_query( "use nnduronic" ,
> > cassandra_Compression::NONE);
> >  print_r( $result );
> >
> > }catch( cassandra_InvalidRequestException $exrs ){
> >
> > echo "USE error occuired --  " . $exrs->getTraceAsString() .
> "
> > ";
> > }
> >
> > try{
> >
> > echo "Executing create column query ";
> > $query  = "CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY,
> > monkey ) WITH comparator = text AND default_validation = text";
> > $result = $cassandraClient->execute_cql_query( $query ,
> > cassandra_Compression::NONE );
> >
> > echo "|". print_r($result) . "|" . "";
> >
> > }catch( cassandra_InvalidRequestException $exrs ){
> > echo "COLUMNFAMILY error occuired --  " .
> > $exrs->getTraceAsString() . " ";
> > }
> > echo "closing connnection ";
> > $framedTransport->close();
> >
> >
> > I'm lost :(
> >
> > On Thu, May 26, 2011 at 9:17 AM, aaron morton 
> > wrote:
> >>
> >> Cool, this may be a better discussion for the client-dev list
> >> http://www.mail-archive.com/client-dev@cassandra.apache.org/
> >>
> >> I would start by turning up the server logging to DEBUG and watching
> your
> >> update / select queries.
> >>
> >> Cheers
> >> -
> >> Aaron Morton
> >> Freelance Cassandra Developer
> >> @aaronmorton
> >> http://www.thelastpickle.com
> >> On 26 May 2011, at 16:15, Kwasi Gyasi - Agyei wrote:
> >>
> >> Hi,
> >>
> >> I have managed to generate thrift interface for php along with
> implementing
> >> auto-loading of both Cassandra and thrift core class.
> >>
> >> However during my testing the only query that works as expected is the
> >> create keyspace cql query... all other queries don't do or return any
> >> results nor do they throw exceptions even in try catch statement I get
> >> nothing.
> >>
> >> --
> >> 4Things
> >> Multimedia and Communication | Property | Entertainment
> >> Kwasi Owusu Gyasi - Agyei
> >>
> >> cell(+27) (0) 76 466 4488
> >> website www.4things.co.za
> >> email kwasi.gyasiag...@4things.co.za
> >> skypekwasi.gyasiagyei
> >> roleDeveloper.Designer.Software Architect
> >>
> >
> >
> >
> > --
> > 4Things
> > Multimedia and Communication | Property | Entertainment
> > Kwasi Owusu Gyasi - Agyei
> >
> > cell(+27) (0) 76 466 4488
> > website www.4things.co.za
> > email kwasi.gyasiag...@4things.co.za
> > skypekwasi.gyasiagyei
> > roleDeveloper.Designer.Software Architect
> >
>
>
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com
>



-- 
*4T

Re: PHP CQL Driver

2011-05-26 Thread Sasha Dolgy
maybe you'd have more luck discussing this on the phpcassa list?
https://groups.google.com/forum/#!forum/phpcassa

more experience there with PHP and Cassandra ...

Are you able to validate the query works when not using PHP?

On Thu, May 26, 2011 at 8:51 PM, Kwasi Gyasi - Agyei
 wrote:
> got system in debug mode
>
> the following query fails
> ---
>
> CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY, monkey ) WITH comparator =
> text AND default_validation = text
>
> PHP error reads
> -
>
> #0
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra_execute_cql_query_result.php(52):
> TBase->_read('Cassandra_execu...', Array, Object(TBinaryProtocol)) #1
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1771):
> cassandra_Cassandra_execute_cql_query_result->read(Object(TBinaryProtocol))
> #2
> /Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1731):
> CassandraClient->recv_execute_cql_query() #3
> /Volumes/DATA/Project/libs/php/phpCQL/test/index.php(34):
> CassandraClient->execute_cql_query('CREATE COLUMNFA...', 2) #4 {main}
>
> Cassandra logs read
> --
>
> DEBUG 20:48:10,659 Disseminating load info ...
> DEBUG 20:49:10,661 Disseminating load info ...
> DEBUG 20:49:22,867 CQL statement type: USE
> DEBUG 20:49:22,870 logged out: #
>
>
> here is the code I'm using to test
> 
>
> phpCQLAutoloader::register();
>
> $socketPool          = new TSocketPool();
> $socketPool->addServer( "127.0.0.1", 9160 );
> $socketPool->setDebug( true );
>
> $framedTransport  = new TFramedTransport( $socketPool, true, true );
> $bufferedProtocol = new TBinaryProtocol( $framedTransport, true, true );
> //new TBinaryProtocolAccelerated( $framedTransport );
> $cassandraClient  = new CassandraClient( $bufferedProtocol,
> $bufferedProtocol );
>
> try{
>
>     echo "opening connection ";
>     $framedTransport->open();
>
>     try{
>
>         echo "settign keyspace to use ";
>         $result = $cassandraClient->execute_cql_query( "use nnduronic" ,
> cassandra_Compression::NONE);
>      print_r( $result );
>
>     }catch( cassandra_InvalidRequestException $exrs ){
>
>         echo "USE error occuired --  " . $exrs->getTraceAsString() . "
> ";
>     }
>
>     try{
>
>             echo "Executing create column query ";
>             $query  = "CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY,
> monkey ) WITH comparator = text AND default_validation = text";
>             $result = $cassandraClient->execute_cql_query( $query ,
> cassandra_Compression::NONE );
>
>             echo "|". print_r($result) . "|" . "";
>
>     }catch( cassandra_InvalidRequestException $exrs ){
>         echo "COLUMNFAMILY error occuired --  " .
> $exrs->getTraceAsString() . " ";
>     }
>         echo "closing connnection ";
>     $framedTransport->close();
>
>
> I'm lost :(
>
> On Thu, May 26, 2011 at 9:17 AM, aaron morton 
> wrote:
>>
>> Cool, this may be a better discussion for the client-dev list
>> http://www.mail-archive.com/client-dev@cassandra.apache.org/
>>
>> I would start by turning up the server logging to DEBUG and watching your
>> update / select queries.
>>
>> Cheers
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> On 26 May 2011, at 16:15, Kwasi Gyasi - Agyei wrote:
>>
>> Hi,
>>
>> I have managed to generate thrift interface for php along with implementing
>> auto-loading of both Cassandra and thrift core class.
>>
>> However during my testing the only query that works as expected is the
>> create keyspace cql query... all other queries don't do or return any
>> results nor do they throw exceptions even in try catch statement I get
>> nothing.
>>
>> --
>> 4Things
>> Multimedia and Communication | Property | Entertainment
>> Kwasi Owusu Gyasi - Agyei
>>
>> cell    (+27) (0) 76 466 4488
>> website www.4things.co.za
>> email kwasi.gyasiag...@4things.co.za
>> skype    kwasi.gyasiagyei
>> role    Developer.Designer.Software Architect
>>
>
>
>
> --
> 4Things
> Multimedia and Communication | Property | Entertainment
> Kwasi Owusu Gyasi - Agyei
>
> cell    (+27) (0) 76 466 4488
> website www.4things.co.za
> email kwasi.gyasiag...@4things.co.za
> skype    kwasi.gyasiagyei
> role    Developer.Designer.Software Architect
>



-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: PHP CQL Driver

2011-05-26 Thread Kwasi Gyasi - Agyei
got system in debug mode

the following query fails
---

CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY, monkey ) WITH comparator =
text AND default_validation = text

PHP error reads
-

#0
/Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra_execute_cql_query_result.php(52):
TBase->_read('Cassandra_execu...', Array, Object(TBinaryProtocol)) #1
/Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1771):
cassandra_Cassandra_execute_cql_query_result->read(Object(TBinaryProtocol))
#2
/Volumes/DATA/Project/libs/php/phpCQL/vendor/cassandra/cassandra.Cassandra.client.php(1731):
CassandraClient->recv_execute_cql_query() #3
/Volumes/DATA/Project/libs/php/phpCQL/test/index.php(34):
CassandraClient->execute_cql_query('CREATE COLUMNFA...', 2) #4 {main}

Cassandra logs read
--

DEBUG 20:48:10,659 Disseminating load info ...
DEBUG 20:49:10,661 Disseminating load info ...
DEBUG 20:49:22,867 CQL statement type: USE
DEBUG 20:49:22,870 logged out: #


here is the code I'm using to test


phpCQLAutoloader::register();

$socketPool  = new TSocketPool();
$socketPool->addServer( "127.0.0.1", 9160 );
$socketPool->setDebug( true );

$framedTransport  = new TFramedTransport( $socketPool, true, true );
$bufferedProtocol = new TBinaryProtocol( $framedTransport, true, true );
//new TBinaryProtocolAccelerated( $framedTransport );
$cassandraClient  = new CassandraClient( $bufferedProtocol,
$bufferedProtocol );

try{

echo "opening connection ";
$framedTransport->open();

try{

echo "settign keyspace to use ";
$result = $cassandraClient->execute_cql_query( "use nnduronic" ,
cassandra_Compression::NONE);
 print_r( $result );

}catch( cassandra_InvalidRequestException $exrs ){

echo "USE error occuired --  " . $exrs->getTraceAsString() . "
";
}

try{

echo "Executing create column query ";
$query  = "CREATE COLUMNFAMILY magic (KEY text PRIMARY KEY,
monkey ) WITH comparator = text AND default_validation = text";
$result = $cassandraClient->execute_cql_query( $query ,
cassandra_Compression::NONE );

echo "|". print_r($result) . "|" . "";

}catch( cassandra_InvalidRequestException $exrs ){
echo "COLUMNFAMILY error occuired --  " .
$exrs->getTraceAsString() . " ";
}
echo "closing connnection ";
$framedTransport->close();


I'm lost :(

On Thu, May 26, 2011 at 9:17 AM, aaron morton wrote:

> Cool, this may be a better discussion for the client-dev list
> http://www.mail-archive.com/client-dev@cassandra.apache.org/
>
> I would start by turning up the server logging to DEBUG and watching your
> update / select queries.
>
> Cheers
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 26 May 2011, at 16:15, Kwasi Gyasi - Agyei wrote:
>
> Hi,
>
> I have managed to generate thrift interface for php along with implementing
> auto-loading of both Cassandra and thrift core class.
>
> However during my testing the only query that works as expected is the
> create keyspace cql query... all other queries don't do or return any
> results nor do they throw exceptions even in try catch statement I get
> nothing.
>
> --
> *4Things*
> Multimedia and Communication | Property | Entertainment
> Kwasi Owusu Gyasi - Agyei
>
> *cell*(+27) (0) 76 466 4488
> *website *www.4things.co.za
> *email *kwasi.gyasiag...@4things.co.za
> *skype*kwasi.gyasiagyei
> *role*Developer.Designer.Software Architect
>
>
>


-- 
*4Things*
Multimedia and Communication | Property | Entertainment
Kwasi Owusu Gyasi - Agyei

*cell*(+27) (0) 76 466 4488
*website *www.4things.co.za
*email *kwasi.gyasiag...@4things.co.za
*skype*kwasi.gyasiagyei
*role*Developer.Designer.Software Architect


Re: Forcing Cassandra to free up some space

2011-05-26 Thread Jonathan Ellis
You'd have to call system.gc via JMX.

https://issues.apache.org/jira/browse/CASSANDRA-2521 is open to
address this, btw.
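
A minimal Java sketch of triggering that full collection over JMX, assuming the node's JMX port is reachable (8080 is the 0.7 default, 7199 on 0.8) and no JMX authentication is configured; the host, port and class name here are placeholders:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper; point host/port at the Cassandra node's JMX endpoint.
public class ForceFullGc {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";
        String port = args.length > 1 ? args[1] : "8080";
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // The standard JVM Memory MBean; its gc() operation behaves like
            // System.gc() inside the target VM, which is what lets Cassandra
            // finally unlink compacted SSTables.
            mbs.invoke(new ObjectName("java.lang:type=Memory"), "gc", null, null);
            System.out.println("Requested a full GC on " + host + ":" + port);
        } finally {
            connector.close();
        }
    }
}

jconsole pointed at the same port can invoke the same java.lang:type=Memory gc operation interactively.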

On Thu, May 26, 2011 at 1:09 PM, Konstantin  Naryshkin
 wrote:
> I have a basic understanding of how Cassandra handles the file system
> (flushes in Memtables out to SSTables, SSTables get compacted) and I
> understand that old files are only deleted when a node is restarted, when
> Java does a GC, or when Cassandra feels like it is running out of space.
>
> My question is, is there some way for us to hurry the process along? We have
> data that we do a lot of inserts into and then delete the data several
> hours later. We would like it if we could free up disk space (since our
> disks, though large, are shared with other applications). So far, the action
> sequence to accomplish this is:
> nodetool flush -> nodetool repair -> nodetool compact -> ??
>
> Is there a way for me to make (or even gently suggest to) Cassandra that it
> may be a good time to free up some space?
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Forcing Cassandra to free up some space

2011-05-26 Thread Konstantin Naryshkin
I have a basic understanding of how Cassandra handles the file system (flushes 
in Memtables out to SSTables, SSTables get compacted) and I understand that old 
files are only deleted when a node is restarted, when Java does a GC, or when 
Cassandra feels like it is running out of space. 

My question is, is there some way for us to hurry the process along? We have 
data that we do a lot of inserts into and then delete the data several hours 
later. We would like it if we could free up disk space (since our disks, though 
large, are shared with other applications). So far, the action sequence to 
accomplish this is: 
nodetool flush -> nodetool repair -> nodetool compact -> ?? 

Is there a way for me to make (or even gently suggest to) Cassandra that it may 
be a good time to free up some space? 


Re: OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti

I tried the manual copy you suggest, but the SystemTable.checkHealth() function
complains it can't load the system files. Log follows, I will gather some more
info and create a ticket as soon as possible.

 INFO [main] 2011-05-26 18:25:36,147 AbstractCassandraDaemon.java Logging 
initialized
 INFO [main] 2011-05-26 18:25:36,172 AbstractCassandraDaemon.java Heap size: 
4277534720/4277534720
 INFO [main] 2011-05-26 18:25:36,174 CLibrary.java JNA not found. Native 
methods will be disabled.
 INFO [main] 2011-05-26 18:25:36,190 DatabaseDescriptor.java Loading settings from 
file:/C:/Cassandra/conf/hscassandra9170.yaml
 INFO [main] 2011-05-26 18:25:36,344 DatabaseDescriptor.java DiskAccessMode 'auto' determined to be mmap, 
indexAccessMode is mmap

 INFO [main] 2011-05-26 18:25:36,532 SSTableReader.java Opening 
G:\Cassandra\data\system\Schema-f-2746
 INFO [main] 2011-05-26 18:25:36,577 SSTableReader.java Opening 
G:\Cassandra\data\system\Schema-f-2729
 INFO [main] 2011-05-26 18:25:36,590 SSTableReader.java Opening 
G:\Cassandra\data\system\Schema-f-2745
 INFO [main] 2011-05-26 18:25:36,599 SSTableReader.java Opening 
G:\Cassandra\data\system\Migrations-f-2167
 INFO [main] 2011-05-26 18:25:36,600 SSTableReader.java Opening 
G:\Cassandra\data\system\Migrations-f-2131
 INFO [main] 2011-05-26 18:25:36,602 SSTableReader.java Opening 
G:\Cassandra\data\system\Migrations-f-1041
 INFO [main] 2011-05-26 18:25:36,603 SSTableReader.java Opening 
G:\Cassandra\data\system\Migrations-f-1695
ERROR [main] 2011-05-26 18:25:36,634 AbstractCassandraDaemon.java Fatal 
exception during initialization
org.apache.cassandra.config.ConfigurationException: Found system table files, but they couldn't be loaded. Did you 
change the partitioner?

at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:236)
at 
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:127)
 
at 
org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
at 
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)


On 5/26/2011 6:04 PM, Jonathan Ellis wrote:

Sounds like a legitimate bug, although looking through the code I'm
not sure what would cause a tight retry loop on migration
announce/rectify. Can you create a ticket at
https://issues.apache.org/jira/browse/CASSANDRA ?

As a workaround, I would try manually copying the Migrations and
Schema sstable files from the system keyspace of the live node, then
restart the recovering one.

On Thu, May 26, 2011 at 9:27 AM, Flavio Baronti
  wrote:

I can't seem to recover a failed node on a database where I did
many updates to the schema.

I have a small cluster with 2 nodes, around 1000 CF (I know it's a lot, but
it can't be changed right now), and ReplicationFactor=2.
I shut down a node and cleaned its data entirely, then tried to bring it
back up. The node starts fetching schema updates from the live node, but the
operation fails halfway with an OOME.
After some investigation, what I found is that:

- I have a lot of schema updates (there are 2067 rows in the system.Schema
CF).
- The live node loads migrations 1-1000, and sends them to the recovering
node (Migration.getLocalMigrations())
- Soon afterwards, the live node checks the schema version on the recovering
node and finds it has moved by a little - say it has applied the first 3
migrations. It then loads migrations 3-1003, and sends them to the node.
- This process is repeated very quickly (sends migrations 6-1006, 9-1009,
etc).

Analyzing the memory dump and the logs, it looks like each of these 1000
migration blocks are composed in a single message and sent to the
OutboundTcpConnection queue. However, since the schema is big, the messages
occupy a lot of space, and are built faster than the connection can send
them. Therefore, they accumulate in OutboundTcpConnection.queue, until
memory is completely filled.

Any suggestions? Can I change something to make this work, apart from
reducing the number of CFs?

Flavio









Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
Sounds like a legitimate bug, although looking through the code I'm
not sure what would cause a tight retry loop on migration
announce/rectify. Can you create a ticket at
https://issues.apache.org/jira/browse/CASSANDRA ?

As a workaround, I would try manually copying the Migrations and
Schema sstable files from the system keyspace of the live node, then
restart the recovering one.

On Thu, May 26, 2011 at 9:27 AM, Flavio Baronti
 wrote:
> I can't seem to recover a failed node on a database where I did
> many updates to the schema.
>
> I have a small cluster with 2 nodes, around 1000 CF (I know it's a lot, but
> it can't be changed right now), and ReplicationFactor=2.
> I shut down a node and cleaned its data entirely, then tried to bring it
> back up. The node starts fetching schema updates from the live node, but the
> operation fails halfway with an OOME.
> After some investigation, what I found is that:
>
> - I have a lot of schema updates (there are 2067 rows in the system.Schema
> CF).
> - The live node loads migrations 1-1000, and sends them to the recovering
> node (Migration.getLocalMigrations())
> - Soon afterwards, the live node checks the schema version on the recovering
> node and finds it has moved by a little - say it has applied the first 3
> migrations. It then loads migrations 3-1003, and sends them to the node.
> - This process is repeated very quickly (sends migrations 6-1006, 9-1009,
> etc).
>
> Analyzing the memory dump and the logs, it looks like each of these 1000
> migration blocks are composed in a single message and sent to the
> OutboundTcpConnection queue. However, since the schema is big, the messages
> occupy a lot of space, and are built faster than the connection can send
> them. Therefore, they accumulate in OutboundTcpConnection.queue, until
> memory is completely filled.
>
> Any suggestions? Can I change something to make this work, apart from
> reducing the number of CFs?
>
> Flavio
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti
I can't seem to recover a failed node on a database where I
did many updates to the schema.


I have a small cluster with 2 nodes, around 1000 CF (I know it's a lot, 
but it can't be changed right now), and ReplicationFactor=2.
I shut down a node and cleaned its data entirely, then tried to bring it 
back up. The node starts fetching schema updates from the live node, but 
the operation fails halfway with an OOME.

After some investigation, what I found is that:

- I have a lot of schema updates (there are 2067 rows in the 
system.Schema CF).
- The live node loads migrations 1-1000, and sends them to the 
recovering node (Migration.getLocalMigrations())
- Soon afterwards, the live node checks the schema version on the 
recovering node and finds it has moved by a little - say it has applied 
the first 3 migrations. It then loads migrations 3-1003, and sends them 
to the node.
- This process is repeated very quickly (sends migrations 6-1006, 
9-1009, etc).


Analyzing the memory dump and the logs, it looks like each of these 1000 
migration blocks are composed in a single message and sent to the 
OutboundTcpConnection queue. However, since the schema is big, the 
messages occupy a lot of space, and are built faster than the connection 
can send them. Therefore, they accumulate in 
OutboundTcpConnection.queue, until memory is completely filled.


Any suggestions? Can I change something to make this work, apart from 
reducing the number of CFs?


Flavio


Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 26 May 2011, at 15:21, Sasha Dolgy wrote:

> Turn the node off, remove the node from the ring using nodetool and
> removetoken  i've found this to be the best problem-free way.
> Maybe it's better now ...
> http://blog.sasha.dolgy.com/2011/03/apache-cassandra-nodetool.html

So I'd need to have at least replication=2 in order to do that safely? Your 
article makes it sound like draining/decommission doesn't work?

Has anyone automated node addition/removal using chef or similar?

Marcus

Re: EC2 node adding trouble

2011-05-26 Thread Sasha Dolgy
On Thu, May 26, 2011 at 3:12 PM, Marcus Bointon
 wrote:
> I'd like to make sure I've got the right sequence of operations for adding a 
> node without downtime. If I'm going from 2 to 3 nodes:
>
> 1 Calculate new initial_token values using the python script
> 2 Change token values in existing nodes and restart them
> 3 Install/configure new node
> 4 Insert new node's token value
> 5 Set new node to auto-bootstrap
> 6 Start cassandra on new node
> 7 Wait for the ring to rebalance
>
> With token changes (using values from the python script), it's clear that all 
> nodes will have some data moved. Does this mean that there's a possibility of 
> overlap between regions if token changes are not absolutely simultaneous on 
> all nodes? That sounds dangerous to me... Or shouldn't token values be 
> changed on nodes containing data?
>

nodetool repair is good.  When we add new nodes, we add a new one
without specifying the new token.  After everything is up and healthy,
we determine new tokens and see if there is a need to renumber nodes.
If we do, we do one at a time and wait until the nodetool repair is
finished on one node before moving to another.

> Is there a corresponding sequence for removing nodes? I'm guessing draining 
> is involved.

Turn the node off, remove the node from the ring using nodetool and
removetoken.  I've found this to be the best problem-free way.
Maybe it's better now ...
http://blog.sasha.dolgy.com/2011/03/apache-cassandra-nodetool.html


Re: EC2 node adding trouble

2011-05-26 Thread Sasha Dolgy
As an aside, you can also use that command to pull meta-data about
instances in AWS.  I have implemented this to maintain a list of seed
nodes.  This way, when a new instance is brought online, the default
cassandra.yaml is `enhanced` to contain a dynamic list of valid seeds,
proper hostname and a few other bits of useful information.
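
As a rough sketch of that idea, the same metadata endpoint used with curl later in this thread can be queried from a small bootstrap program; everything here beyond the local-ipv4 path is an assumption about which values you would want to splice into cassandra.yaml:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper; only local-ipv4 appears in this thread, the other paths are extras.
public class Ec2Metadata {
    // Fetch a single value from the EC2 instance metadata service.
    static String metadata(String path) throws Exception {
        URL url = new URL("http://169.254.169.254/latest/meta-data/" + path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(2000);
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        try {
            return in.readLine();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Values a node-bootstrap script might splice into cassandra.yaml.
        System.out.println("listen_address: " + metadata("local-ipv4"));
        System.out.println("public hostname: " + metadata("public-hostname"));
        System.out.println("instance id: " + metadata("instance-id"));
    }
}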

Finally, if you aren't using a single security group for all of your
cassandra instances, maybe this may be of help to you.  When we add
new nodes to our ring, we add them to a single cassandra security
group.  No messing about with security groups per instance...

-sd

On Thu, May 26, 2011 at 2:36 PM, Marcus Bointon
 wrote:
> Thanks for all your helpful suggestions - I've now got it working. It was 
> down to a combination of things.
>
> 1. A missing rule in a security group
> 2. A missing DNS name for the new node, so its default name was defaulting to 
> localhost
> 3. Google DNS caching the failed DNS lookup for the full duration of the 
> SOA's TTL
>
> In order to avoid the whole problem with assigning IPs using the 
> internal/external trick and using up elastic IPs, I found this service which 
> I'd not seen before: 
> http://www.ducea.com/2009/06/01/howto-update-dns-hostnames-automatically-for-your-amazon-ec2-instances/
>
> This means you can reliably set (and reset as necessary) a listen address 
> with this command:
>
> sed -i "s/^listen_address:.*/listen_address: `curl 
> http://169.254.169.254/latest/meta-data/local-ipv4`/" 
> /etc/cassandra/cassandra.yaml
>
> It's not quite as good as having a true dynamic hostname, but at least you 
> can drop it in a startup script and forget it.
>
> Marcus


Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 24 May 2011, at 23:58, Sameer Farooqui wrote:

> So, once you know what token each of the 3 nodes should have, shut down the 
> first two nodes, change their tokens and add the correct token to the 3rd 
> node (in the YAML file).

I'd like to make sure I've got the right sequence of operations for adding a 
node without downtime. If I'm going from 2 to 3 nodes:

1 Calculate new initial_token values using the python script
2 Change token values in existing nodes and restart them
3 Install/configure new node
4 Insert new node's token value
5 Set new node to auto-bootstrap
6 Start cassandra on new node
7 Wait for the ring to rebalance

With token changes (using values from the python script), it's clear that all 
nodes will have some data moved. Does this mean that there's a possibility of 
overlap between regions if token changes are not absolutely simultaneous on all 
nodes? That sounds dangerous to me... Or shouldn't token values be changed on 
nodes containing data?
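
For reference, the python script mentioned in step 1 just spaces tokens evenly around the RandomPartitioner ring, token_i = i * 2**127 / N; a minimal Java equivalent (the node count and class name are assumptions, 3 matches the 2-to-3 example above):

import java.math.BigInteger;

// Hypothetical helper, equivalent to the usual token-generation script.
public class InitialTokens {
    public static void main(String[] args) {
        int nodes = args.length > 0 ? Integer.parseInt(args[0]) : 3; // ring size after the change
        BigInteger ringSize = BigInteger.valueOf(2).pow(127);        // RandomPartitioner token space
        for (int i = 0; i < nodes; i++) {
            BigInteger token = ringSize.multiply(BigInteger.valueOf(i))
                                       .divide(BigInteger.valueOf(nodes));
            System.out.println("node " + i + " initial_token: " + token);
        }
    }
}

On nodes that already hold data, the token saved in the system keyspace takes precedence over initial_token, so nodetool move is generally the way to apply the recalculated values rather than editing the yaml and restarting.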

Can cassandra nodes restart without downtime?

I'm looking at http://wiki.apache.org/cassandra/MultinodeCluster but as it says 
it's deliberately simplistic.

Is there a corresponding sequence for removing nodes? I'm guessing draining is 
involved.

Marcus

Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
Thanks for all your helpful suggestions - I've now got it working. It was down 
to a combination of things.

1. A missing rule in a security group
2. A missing DNS name for the new node, so its name was defaulting to 
localhost
3. Google DNS caching the failed DNS lookup for the full duration of the SOA's 
TTL

In order to avoid the whole problem with assigning IPs using the 
internal/external trick and using up elastic IPs, I found this service which 
I'd not seen before: 
http://www.ducea.com/2009/06/01/howto-update-dns-hostnames-automatically-for-your-amazon-ec2-instances/

This means you can reliably set (and reset as necessary) a listen address with 
this command:

sed -i "s/^listen_address:.*/listen_address: `curl 
http://169.254.169.254/latest/meta-data/local-ipv4`/"; 
/etc/cassandra/cassandra.yaml

It's not quite as good as having a true dynamic hostname, but at least you can 
drop it in a startup script and forget it.

Marcus

Re: Corrupted Counter Columns

2011-05-26 Thread Utku Can Topçu
Some additional information on the settings:

I'm using CL.ONE for both reading and writing; and replicate_on_write is
true on the Counters CF.

I think the problem occurs after a restart when the commitlogs are read.

On Thu, May 26, 2011 at 2:21 PM, Utku Can Topçu  wrote:

> Hello,
>
> I'm using 0.8.0-rc1, with RF=2 and 4 nodes.
>
> Strangely counters are corrupted. Say, the actual value should be : 51664
> and the value that cassandra sometimes outputs is: either 51664 or 18651001.
>
> And I have no idea on how to diagnose the problem or reproduce it.
>
> Can you help me in fixing this issue?
>
> Regards,
> Utku
>


Corrupted Counter Columns

2011-05-26 Thread Utku Can Topçu
Hello,

I'm using 0.8.0-rc1, with RF=2 and 4 nodes.

Strangely, counters are corrupted. Say the actual value should be 51664,
and the value that Cassandra sometimes outputs is either 51664 or 18651001.

And I have no idea on how to diagnose the problem or reproduce it.

Can you help me in fixing this issue?

Regards,
Utku


Re: Priority queue in a single row - performance falls over time

2011-05-26 Thread Paul Loy
persistent [priority] queues are better suited to something like HornetQ
than Cassandra.

On Wed, May 25, 2011 at 9:10 PM, Dan Kuebrich wrote:

> It sounds like the problem is that the row is getting filled up with
> tombstones and becoming enormous?  Another idea then, which might not be
> worth the added complexity, is to progressively use new rows.  Depending on
> volume, this could mean having 5-minute-window rows, or 1 minute, or
> whatever works best.
>
> Read: Assuming you're not falling behind, you only need to query the row
> that the current time falls in and the one immediately prior.  If you do
> fall behind, you'll have to walk backwards in buckets until you find them
> empty.
> Write: Write column to the bucket (row) that corresponds to the correct
> time window.
> Delete: Delete the column from the row it was read from.  When all columns
> in the row are deleted the row can GC.
>
> Again, cassandra might not be the correct datastore.
>
> On Wed, May 25, 2011 at 3:56 PM, Jonathan Ellis  wrote:
>
>> You're basically intentionally inflicting the worst case scenario on
>> the Cassandra storage engine:
>> http://wiki.apache.org/cassandra/DistributedDeletes
>>
>> You could play around with reducing gc_grace_seconds but a PQ with
>> "millions" of items is something you should probably just do in memory
>> these days.
>>
>> On Wed, May 25, 2011 at 10:43 AM,   wrote:
>> >
>> >
>> > Hi all,
>> >
>> > I'm trying to implement a priority queue for holding a large number
>> (millions)
>> > of items that need to be processed in time order. My solution works -
>> but gets
>> > slower and slower until performance is unacceptable - even with a small
>> number
>> > of items.
>> >
>> > Each item essentially needs to be popped off the queue (some arbitrary
>> work is
>> > then done) and then the item is returned to the queue with a new
>> timestamp
>> > indicating when it should be processed again. We thus cycle through all
>> work
>> > items eventually, but some may come around more frequently than others.
>> >
>> > I am implementing this as a single Cassandra row, in a CF with a
>> TimeUUID
>> > comparator.
>> >
>> > Each column name is a TimeUUID, with an arbitrary column value
>> describing the
>> > work item; the columns are thus sorted in time order.
>> >
>> > To pop items, I do a get() such as:
>> >
>> >  cf.get(row_key, column_finish=now, column_start=yesterday,
>> column_count=1000)
>> >
>> > to get all the items at the head of the queue (if any) whose time
>> exceeds the
>> > current system time.
>> >
>> > For each item retrieved, I do a delete to remove the old column, then an
>> insert
>> > with a fresh TimeUUID column name (system time + arbitrary increment),
>> thus
>> > putting the item back somewhere in the queue (currently, the back of the
>> queue)
>> >
>> > I do a batch_mutate for all these deletes and inserts, with a queue size
>> of
>> > 2000. These are currently interleaved i.e.
>> delete1-insert1-delete2-insert2...
>> >
>> > This all appears to work correctly, but the performance starts at around
>> 8000
>> > cycles/sec, falls to around 1800/sec over the first 250K cycles, and
>> continues
>> > to fall over time, down to about 150/sec, after a few million cycles.
>> This
>> > happens regardless of the overall size of the row (I have tried sizes
>> from 1000
>> > to 100,000 items). My target performance is 1000 cycles/sec (but my data
>> store
>> > will need to handle other work concurrently).
>> >
>> > I am currently using just a single node running on localhost, using a
>> pycassa
>> > client. 4 core, 4GB machine, Fedora 14.
>> >
>> > Is this expected behaviour (is there just too much churn for a single
>> row to
>> > perform well), or am I doing something wrong?
>> >
>> > Would https://issues.apache.org/jira/browse/CASSANDRA-2583 in version
>> 0.8.1 fix
>> > this problem (I am using version 0.7.6)?
>> >
>> > Thanks!
>> >
>> > David.
>> >
>> > 
>> > This message was sent using IMP, the Internet Messaging Program.
>> >
>> > This email and any attachments to it may be confidential and are
>> > intended solely for the use of the individual to whom it is addressed.
>> > If you are not the intended recipient of this email, you must neither
>> > take any action based upon its contents, nor copy or show it to anyone.
>> > Please contact the sender if you believe you have received this email in
>> > error. QinetiQ may monitor email traffic data and also the content of
>> > email for the purposes of security. QinetiQ Limited (Registered in
>> > England & Wales: Company Number: 3796233) Registered office: Cody
>> Technology
>> > Park, Ively Road, Farnborough, Hampshire, GU14 0LX
>> http://www.qinetiq.com.
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>>
>
>


-- 
-
Pa
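
To make the row-per-time-window idea quoted above concrete, here is a minimal Java sketch of just the bucketing arithmetic; the 5-minute window, key prefix and method names are assumptions, and the actual slice/insert/delete calls would go through whatever client you already use (pycassa in the original post):

import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of time-window bucketing; only the key arithmetic is shown.
public class QueueBuckets {
    static final long BUCKET_MILLIS = 5 * 60 * 1000L; // 5-minute rows, an assumption

    // Row key for the bucket a given execution timestamp falls into.
    static String bucketKey(long executeAtMillis) {
        return "queue:" + (executeAtMillis / BUCKET_MILLIS);
    }

    // Buckets a consumer should slice: the current window plus a few
    // earlier ones in case processing has fallen behind.
    static List<String> bucketsToRead(long nowMillis, int lookBack) {
        List<String> keys = new ArrayList<String>();
        long current = nowMillis / BUCKET_MILLIS;
        for (long b = current - lookBack; b <= current; b++) {
            keys.add("queue:" + b);
        }
        return keys;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Write path: insert the TimeUUID column into bucketKey(executeAt) instead of one big row.
        System.out.println("write bucket: " + bucketKey(now + 90000L));
        // Read path: slice each of these rows with column_finish = now and delete what you process;
        // once every column in an old bucket is gone, the whole row (and its tombstones) can age out.
        System.out.println("read buckets: " + bucketsToRead(now, 2));
    }
}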

Re: How to programmatically index an existed column?

2011-05-26 Thread Dikang Gu
Hi Aaron,

Thank you for the reminder. I've figured out the solution myself, and I'll share it 
here:

KeyspaceDefinition keyspaceDefinition = cluster.describeKeyspace(KEYSPACE);
ColumnFamilyDefinition cdf = keyspaceDefinition.getCfDefs().get(0);

BasicColumnFamilyDefinition columnFamilyDefinition = new BasicColumnFamilyDefinition(cdf);

BasicColumnDefinition bcdf = new BasicColumnDefinition();
bcdf.setName(StringSerializer.get().toByteBuffer("birthyear"));
bcdf.setIndexName("birthyearidx");
bcdf.setIndexType(ColumnIndexType.KEYS);
bcdf.setValidationClass(ComparatorType.LONGTYPE.getClassName());

columnFamilyDefinition.addColumnDefinition(bcdf);

cluster.updateColumnFamily(new ThriftCfDef(columnFamilyDefinition)); 


-- 
Dikang Gu
0086 - 18611140205
On Thursday, May 26, 2011 at 3:16 PM, aaron morton wrote: 
> Please post to one list at a time. Otherwise people may spend their time 
> helping you when someone already has. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 26 May 2011, at 17:35, Dikang Gu wrote:
> 
> > 
> > I want to build a secondary index on an existed column, how to 
> > programmatically do this using hector API?
> > 
> > Thanks.
> > 
> > -- 
> > Dikang Gu
> > 0086 - 18611140205
> 


Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread Jonathan Colby
@Aaron -

Unfortunately I'm still seeing messages like " is down",
removing from gossip, although not with the same frequency.

And repair/move jobs don't seem to try to stream data to the removed node 
anymore.

Anyone know how to totally purge any stored gossip/endpoint data on nodes that 
were removed from the cluster?  Or what might be happening here otherwise?


On May 26, 2011, at 9:10 AM, aaron morton wrote:

> cool. I was going to suggest that but as you already had the move running I 
> thought it may be a little drastic. 
> 
> Did it show any progress ? If the IP address is not responding there should 
> have been some sort of error. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 26 May 2011, at 15:28, jonathan.co...@gmail.com wrote:
> 
>> Seems like it had something to do with stale endpoint information. I did a 
>> rolling restart of the whole cluster and that seemed to trigger the nodes to 
>> remove the node that was decommissioned.
>> 
>> On , aaron morton  wrote:
>>> Is it showing progress ? It may just be a problem with the information 
>>> printed out.
>>> 
>>> 
>>> 
>>> Can you check from the other nodes in the cluster to see if they are 
>>> receiving the stream ?
>>> 
>>> 
>>> 
>>> cheers
>>> 
>>> 
>>> 
>>> -
>>> 
>>> Aaron Morton
>>> 
>>> Freelance Cassandra Developer
>>> 
>>> @aaronmorton
>>> 
>>> http://www.thelastpickle.com
>>> 
>>> 
>>> 
>>> On 26 May 2011, at 00:42, Jonathan Colby wrote:
>>> 
>>> 
>>> 
 I recently removed a node (with decommission) from our cluster.
>>> 
 
>>> 
 I added a couple new nodes and am now trying to rebalance the cluster 
 using nodetool move.
>>> 
 
>>> 
 However,  netstats shows that the node being "moved" is trying to stream 
 data to the node that I already decommissioned yesterday.
>>> 
 
>>> 
 The removed node was powered-off, taken out of dns, its IP is not even 
 pingable.   It was never a seed neither.
>>> 
 
>>> 
 This is cassandra 0.7.5 on 64bit linux.   How do I tell the cluster that 
 this node is gone?  Gossip should have detected this.  The ring commands 
 shows the correct cluster IPs.
>>> 
 
>>> 
 Here is a portion of netstats. 10.46.108.102 is the node which was removed.
>>> 
 
>>> 
 Mode: Leaving: streaming data to other nodes
>>> 
 Streaming to: /10.46.108.102
>>> 
  
 /var/lib/cassandra/data/DFS/main-f-1064-Data.db/(4681027,5195491),(5195491,15308570),(15308570,15891710),(16336750,20558705),(20558705,29112203),(29112203,36279329),(36465942,36623223),(36740457,37227058),(37227058,42206994),(42206994,47380294),(47635053,47709813),(47709813,48353944),(48621287,49406499),(53330048,53571312),(53571312,54153922),(54153922,59857615),(59857615,61029910),(61029910,61871509),(62190800,62498605),(62824281,62964830),(63511604,64353114),(64353114,64760400),(65174702,65919771),(65919771,66435630),(81440029,81725949),(81725949,83313847),(83313847,83908709),(88983863,89237303),(89237303,89934199),(89934199,97
>>> 
 ...
>>> 
 5693491,14795861666),(14795861666,14796105318),(14796105318,14796366886),(14796699825,14803874941),(14803874941,14808898331),(14808898331,14811670699),(14811670699,14815125177),(14815125177,14819765003),(14820229433,14820858266)
>>> 
progress=280574376402/12434049900 - 2256%
>>> 
 .
>>> 
 
>>> 
 
>>> 
 Note 10.46.108.102 is NOT part of the ring.
>>> 
 
>>> 
 Address Status State   LoadOwnsToken
>>> 
  
 148873535527910577765226390751398592512
>>> 
 10.46.108.100   Up Normal  71.73 GB12.50%  0
>>> 
 10.46.108.101   Up Normal  109.69 GB   12.50%  
 21267647932558653966460912964485513216
>>> 
 10.47.108.100   Up Leaving 281.95 GB   37.50%  
 85070591730234615865843651857942052863   
 10.47.108.102   Up Normal  210.77 GB   0.00%   
 85070591730234615865843651857942052864
>>> 
 10.47.108.101   Up Normal  289.59 GB   16.67%  
 113427455640312821154458202477256070484
>>> 
 10.46.108.103   Up Normal  299.87 GB   8.33%   
 127605887595351923798765477786913079296
>>> 
 10.47.108.103   Up Normal  94.99 GB12.50%  
 148873535527910577765226390751398592511
>>> 
 10.46.108.104   Up Normal  103.01 GB   0.00%   
 148873535527910577765226390751398592512
>>> 
 
>>> 
 
>>> 
 
>>> 
>>> 
>>> 
> 



Re: EC2 node adding trouble

2011-05-26 Thread Marcus Bointon
On 26 May 2011, at 00:17, aaron morton wrote:

> I've seen discussion of using the EIP but I do not have direct experience. 

The idea is not to use the external IP, but the external DNS name because of 
this very useful trick (please excuse me if you already know this!):

Say the DNS name of an elastic IP assigned to an instance is 
ec2-50-18-223-109.compute-1.amazonaws.com, then from outside EC2:

#host ec2-50-18-223-109.compute-1.amazonaws.com
ec2-50-18-223-109.compute-1.amazonaws.com has address 50.18.223.109

But from inside EC2:

#host ec2-50-18-223-109.compute-1.amazonaws.com
ec2-50-18-223-109.compute-1.amazonaws.com has address 10.126.13.22

If you suspend and resume an instance, its internal IP will change, but the
external one will not (if it's assigned the same elastic IP). If you use the
external name, you'll get consistent behaviour whatever happens. This is
extremely useful!
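
A quick way to confirm which answer a process on the node will see is to resolve the name in code; a minimal Java sketch using the example hostname above (the class name is a placeholder):

import java.net.InetAddress;

// Hypothetical check; the hostname is just the example from this mail.
public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        String name = args.length > 0 ? args[0] : "ec2-50-18-223-109.compute-1.amazonaws.com";
        for (InetAddress a : InetAddress.getAllByName(name)) {
            // Inside EC2 this prints the 10.x internal address; outside it prints the elastic IP.
            System.out.println(name + " -> " + a.getHostAddress());
        }
    }
}

This is also why putting the public DNS name rather than a raw IP into the config gives consistent behaviour from both inside and outside EC2.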

Of course it would be extremely useful if we could get behaviour like this 
without having to assign an elastic IP, as this is a waste of IPs otherwise.

Marcus

Re: PHP CQL Driver

2011-05-26 Thread aaron morton
Cool, this may be a better discussion for the client-dev list 
http://www.mail-archive.com/client-dev@cassandra.apache.org/

I would start by turning up the server logging to DEBUG and watching your 
update / select queries. 

Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 May 2011, at 16:15, Kwasi Gyasi - Agyei wrote:

> Hi,
> 
> I have manged to generate thrift interface for php along with implementing 
> auto-loading of both Cassandra and thrift core class. 
> 
> However during my testing the only query that works as expected is the create 
> keyspace cql query... all other queries don't do or return any results nor do 
> they throw exceptions even in try catch statement I get nothing.
> 
> -- 
> 4Things
> Multimedia and Communication | Property | Entertainment
> Kwasi Owusu Gyasi - Agyei
> 
> cell(+27) (0) 76 466 4488
> website www.4things.co.za
> email kwasi.gyasiag...@4things.co.za
> skypekwasi.gyasiagyei
> roleDeveloper.Designer.Software Architect



Re: How to programmatically index an existed column?

2011-05-26 Thread aaron morton
Please post to one list at a time. Otherwise people may spend their time 
helping you when someone already has. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 May 2011, at 17:35, Dikang Gu wrote:

> 
> I want to build a secondary index on an existed column, how to 
> programmatically do this using hector API?
> 
> Thanks.
> 
> -- 
> Dikang Gu
> 0086 - 18611140205



Re: nodetool move trying to stream data to node no longer in cluster

2011-05-26 Thread aaron morton
cool. I was going to suggest that but as you already had the move running I 
thought it may be a little drastic. 

Did it show any progress ? If the IP address is not responding there should 
have been some sort of error. 

Cheers
 
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 26 May 2011, at 15:28, jonathan.co...@gmail.com wrote:

> Seems like it had something to do with stale endpoint information. I did a 
> rolling restart of the whole cluster and that seemed to trigger the nodes to 
> remove the node that was decommissioned.
> 
> On , aaron morton  wrote:
> > Is it showing progress ? It may just be a problem with the information 
> > printed out.
> > 
> > 
> > 
> > Can you check from the other nodes in the cluster to see if they are 
> > receiving the stream ?
> > 
> > 
> > 
> > cheers
> > 
> > 
> > 
> > -
> > 
> > Aaron Morton
> > 
> > Freelance Cassandra Developer
> > 
> > @aaronmorton
> > 
> > http://www.thelastpickle.com
> > 
> > 
> > 
> > On 26 May 2011, at 00:42, Jonathan Colby wrote:
> > 
> > 
> > 
> > > I recently removed a node (with decommission) from our cluster.
> > 
> > >
> > 
> > > I added a couple new nodes and am now trying to rebalance the cluster 
> > > using nodetool move.
> > 
> > >
> > 
> > > However,  netstats shows that the node being "moved" is trying to stream 
> > > data to the node that I already decommissioned yesterday.
> > 
> > >
> > 
> > > The removed node was powered-off, taken out of dns, its IP is not even 
> > > pingable.   It was never a seed neither.
> > 
> > >
> > 
> > > This is cassandra 0.7.5 on 64bit linux.   How do I tell the cluster that 
> > > this node is gone?  Gossip should have detected this.  The ring commands 
> > > shows the correct cluster IPs.
> > 
> > >
> > 
> > > Here is a portion of netstats. 10.46.108.102 is the node which was 
> > > removed.
> > 
> > >
> > 
> > > Mode: Leaving: streaming data to other nodes
> > 
> > > Streaming to: /10.46.108.102
> > 
> > >   
> > > /var/lib/cassandra/data/DFS/main-f-1064-Data.db/(4681027,5195491),(5195491,15308570),(15308570,15891710),(16336750,20558705),(20558705,29112203),(29112203,36279329),(36465942,36623223),(36740457,37227058),(37227058,42206994),(42206994,47380294),(47635053,47709813),(47709813,48353944),(48621287,49406499),(53330048,53571312),(53571312,54153922),(54153922,59857615),(59857615,61029910),(61029910,61871509),(62190800,62498605),(62824281,62964830),(63511604,64353114),(64353114,64760400),(65174702,65919771),(65919771,66435630),(81440029,81725949),(81725949,83313847),(83313847,83908709),(88983863,89237303),(89237303,89934199),(89934199,97
> > 
> > > ...
> > 
> > > 5693491,14795861666),(14795861666,14796105318),(14796105318,14796366886),(14796699825,14803874941),(14803874941,14808898331),(14808898331,14811670699),(14811670699,14815125177),(14815125177,14819765003),(14820229433,14820858266)
> > 
> > > progress=280574376402/12434049900 - 2256%
> > 
> > > .
> > 
> > >
> > 
> > >
> > 
> > > Note 10.46.108.102 is NOT part of the ring.
> > 
> > >
> > 
> > > Address Status State   LoadOwnsToken
> > 
> > >   
> > > 148873535527910577765226390751398592512
> > 
> > > 10.46.108.100   Up Normal  71.73 GB12.50%  0
> > 
> > > 10.46.108.101   Up Normal  109.69 GB   12.50%  
> > > 21267647932558653966460912964485513216
> > 
> > > 10.47.108.100   Up Leaving 281.95 GB   37.50%  
> > > 85070591730234615865843651857942052863   
> > > 10.47.108.102   Up Normal  210.77 GB   0.00%   
> > > 85070591730234615865843651857942052864
> > 
> > > 10.47.108.101   Up Normal  289.59 GB   16.67%  
> > > 113427455640312821154458202477256070484
> > 
> > > 10.46.108.103   Up Normal  299.87 GB   8.33%   
> > > 127605887595351923798765477786913079296
> > 
> > > 10.47.108.103   Up Normal  94.99 GB12.50%  
> > > 148873535527910577765226390751398592511
> > 
> > > 10.46.108.104   Up Normal  103.01 GB   0.00%   
> > > 148873535527910577765226390751398592512
> > 
> > >
> > 
> > >
> > 
> > >
> > 
> > 
> >