date:20110426

Re: JNA C library errors on OSX

2011-04-26 Thread John Lennard

Thanks for that. 

If it is a known issue I will leave it be, as everything is working fine without 
it. I have no intention of deploying on OSX; it was more that seeing the error 
puzzled me somewhat.

Cheers
John



On 26/04/2011, at 2:05 AM, Jonathan Ellis wrote:

> Pretty sure this is b/c OS X doesn't support posix_fadvise.
> 
> Since you shouldn't be running OS X as a server OS in production
> anyway, I wouldn't worry much. Cassandra will still work fine for
> development w/o native methods.
> 
> On Mon, Apr 25, 2011 at 7:52 AM, John Lennard  wrote:
>> Hi,
>> 
>> I am currently testing the current 0.8 beta on my OSX development machine 
>> and when Cassandra is starting up I am seeing errors from the JNA code as 
>> below.
>> 
>> 
>> john@balorama bin $ sudo -u cassandra ./cassandra -f
>> Password:
>>  INFO 00:04:18,013 Logging initialized
>>  INFO 00:04:18,027 Heap size: 2126512128/2126512128
>>  INFO 00:04:18,078 Unable to link C library. Native methods will be disabled.
>> 
>> ...
>> 
>> From here Cassandra continues to run, however, without native method support.
>> 
>> 
>> I have tested running Cassandra as root, the logged in user and the 
>> cassandra user, along with the most recent JNA and a version compiled from 
>> trunk. I have also tried running Cassandra without any of the startup 
>> scripts and still get the same error. Simple tests in which I removed the 
>> CLibrary class and the code that causes the fault seem to allow the C library 
>> to be linked.
>> 
>> I am about to build a version from source to add in some stack traces where 
>> the error is handled but was wondering if anyone else has seen this.
>> 
>> 
>> Thanks
>> John
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
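
For anyone who wants to check the platform side of Jonathan's guess, here is a minimal sketch in Python (not Cassandra's JNA code). It only tests whether the local C library exposes posix_fadvise, using the fact that Python defines os.posix_fadvise only where the call exists. The function names and the status strings are made up for illustration.

```python
import os


def fadvise_supported() -> bool:
    """Return True if this platform exposes posix_fadvise to Python."""
    # Python only defines os.posix_fadvise where the underlying C call exists,
    # so its absence is a reasonable proxy for "not supported on this OS".
    return hasattr(os, "posix_fadvise")


def native_io_hint_status() -> str:
    """Describe the outcome in words (illustrative wording, not a real log line)."""
    if fadvise_supported():
        return "posix_fadvise available: page-cache hints can be issued"
    return "posix_fadvise missing: carry on without native page-cache hints"


if __name__ == "__main__":
    print(native_io_hint_status())
```

On a Linux box this should report the call as available; on OS X it would be expected to report it as missing, which matches the explanation quoted above.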

Re: encryption_options & 0.8

2011-04-26 Thread David Strauss

On Tue, 2011-04-26 at 08:57 +0200, Sasha Dolgy wrote:
> Is it possible to store an encrypted keystore_password and
> truststore_password in the cassandra.yaml?  I see that the defaults
> allow cleartext which isn't suitable when negotiating with security
> specialists for sign-off of a solution...

If the passwords are encrypted, when and how would they be decrypted?



Re: Apt repositories

2011-04-26 Thread Eric Evans

On Sat, 2011-04-23 at 16:49 -0700, David Strauss wrote:
> I just noticed that, following the Cassandra 0.8 beta release, the Apt
> repository is encouraging servers in my clusters to upgrade. Beta
> releases should probably be on different channels (or named
> differently) than stable ones.

There was already a repo for cassandra-0.6 (called 06x); it just fell
through the cracks with the last release.

There is one for each version now (06x, 07x, and 08x). The unstable
suite continues to point to latest-and-greatest.  The wiki has been
updated.

-- 
Eric Evans
eev...@rackspace.com

Re: practice failure recovery

2011-04-26 Thread William Oberman

Done and done.  I'm really loving how easy the nuclear option has been (it
was what I tested first).

will

On Tue, Apr 26, 2011 at 5:09 PM, aaron morton wrote:

> In 0.7.X the cli waits for the schema to agree before returning, you should
> see...
>
> Waiting for schema agreement...
> ... schemas agree across the cluster
>
> Or if things fail
> The schema has not settled in %d seconds; further migrations are
> ill-advised until it does.%nVersions are %s%n
>
> WRT the error, first guess is that something in the schema has changed and
> it's upsetting the log replay. Given all the crazy, I'd go with the nuclear
> option.
>
> Aaron
>
> On 27 Apr 2011, at 07:11, William Oberman wrote:
>
> > In my test cluster I managed to jam up a cassandra server.  I figure the
> easy & failsafe solution is to just boot a replacement node, but I thought
> I'd try a minute to either figure out what I did, or try to figure out how
> to properly recover it before I lose my current state.
> >
> > The symptom = on startup I get an exception:
> > ERROR 11:58:34,567 Exception encountered during startup.
> > java.lang.IndexOutOfBoundsException: 6
> > at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
> > at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
> > at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
> > at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
> > at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
> > at java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:685)
> > at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:864)
> > at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1893)
> > at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:216)
> > at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:130)
> > at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
> > at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
> > at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:253)
> > at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:156)
> > at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:173)
> > at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
> > at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> >
> > Where things went wrong = I had been doing various testing and unit
> testing, as this is my "proof of concept" cluster.  The unit tests in
> particular work by cloning a keyspace as "keyspace_UUID" (to get a blank
> slate).  Because of various bugs in my code and configuration, this left a
> fair amount of crud keyspaces by the time I got everything to pass.  So, I
> wrote a script to drop all of the test keyspaces (the script had worked on a
> single node environment, which was my first step before the cluster).  I
> think the CLI doesn't wait for schema propagation, so the script confused
> the node I was talking to, as after it ran the schema UUIDs of that node vs.
> the rest of the cluster didn't agree ("describe cluster" in the CLI).  And,
> it wasn't fixing itself.  "nodetool loadbalance" said it would do a
> decommission/bootstrap, which I thought might give the bad node a kick in
> the pants, so I tried it.  Afterwards, I ran "nodetool ring" against all
> nodes and the problem node claimed all was "UP", but everything else listed
> the problem node as "?" and everything else as UP (sadly, I either didn't
> check or can't remember what "nodetool ring" said before loadbalance).  So,
> I shut down the problem node.  But, when I tried to restart it, I got the
> error you see above.
> >
> > Not sure what was the worst/dumbest thing I did, but it's definitely
> unhappy now!
>
>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com

Re: advice for EC2 deployment

2011-04-26 Thread William Oberman

I see what you're saying.  I was able to control write latency on mysql
using insert vs insert delayed (what I feel is MySQL's poor man's eventual
consistency option) + the fact that replication was a background
asynchronous process.  In terms of read latency, I was able to do up to a
few hundred well indexed mysql queries (across AZs) on a view while keeping
the overall latency of the page around or less than a second.

I basically am replacing two use cases, the cases with difficult to scale
anticipated write volumes.  The first case was previously using insert
delayed (which I'm doing in cassandra as ONE) as I wasn't getting consistent
write/read operations before anyways.  The second case was using traditional
insert (which I was going to replace with some QUORUM-like level, I was
assuming LOCAL_QUORUM).  But, the latter case uses a write through memory
cache (memcache), so I don't know how often it really reads data from the
persistent store.  But I definitely need to make sure it is consistent.

In any case, it sounds like I'd be best served treating AZs as DCs, but then
I don't know what to make racks?  Or do racks not matter in a single AZ?
That way I can get an ack from a LOCAL_QUORUM read/write before the
(slightly) slower read/write to/from the other AZ (for redundancy).  Then
I'm only screwed if Amazon has a multi-AZ failure (so far, they've kept it
to "only" one!) :-)

will
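
To make the quorum arithmetic in this thread concrete, here is a small sketch. It assumes the usual definition of a quorum as a majority of the replicas, and that LOCAL_QUORUM applies the same rule to the replication factor of the local DC only. The function names are mine, not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replica acks needed for a majority of replication_factor."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """How many replicas can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With a local RF of 2 the quorum is 2 and no replica may be down, which is the case raised earlier in the thread; with a local RF of 3 the quorum is still 2, so one replica can fail.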

On Tue, Apr 26, 2011 at 5:01 PM, aaron morton wrote:

> One difference between Cassandra and MySQL replication may be when the
> network IO happens. Was the MySQL replication synchronous on transaction
> commit ?  I was only aware that it had async replication, which means the
> client is not exposed to the network latency. In cassandra the network
> latency is exposed to the client as it needs to wait for the CL number of
> nodes to respond.
>
> If you use the PropertyFileSnitch with the NetworkTopologyStrategy you can
> manually assign machines to racks / dc's based on IP.
> See the conf/cassandra-topology.properties file. There is also an Ec2Snitch which
> (from the code)
> /**
>  * A snitch that assumes an EC2 region is a DC and an EC2 availability_zone
>  *  is a rack. This information is available in the config for the node.
>
> Recent discussion on DC aware CL levels
> http://www.mail-archive.com/user@cassandra.apache.org/msg11414.html
>
> Hope that helps.
>  
> Aaron
>
>
> On 27 Apr 2011, at 01:18, William Oberman wrote:
>
> Thanks Aaron!
>
> Unless no one on this list uses EC2, there were a few minor troubles end of
> last week through the weekend which taught me a lot about obscure failure
> modes in various applications I use :-)  My original post was trying to be
> more redundant than fast, which has been my overall goal from even before
> moving to Cassandra (my downtime from the EC2 madness was minimal, and due
> to only having one single point of failure == the amazon load balancer).  My
> secondary goal was trying to make moving to a second region easier, but if
> that is causing problems I can drop the idea.
>
> I might be downplaying the cost of inter-AZ communication, but I've lived
> with that for quite some time, for example my current setup of MySQL in
> Master-Master replication is split over zones, and my webservers live in yet
> different zones.  Maybe Cassandra is "chattier" than I'm used to?  (again,
> I'm fairly new to cassandra)
>
> Based on that article, the discussion, and the recent EC2 issues, it sounds
> like it would be better to start with:
> -6 nodes split in two AZs 3/3
> -Configure replication to do 2 in one AZ and one in the other
> (NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this
> happen naturally?)
> -What does LOCAL_QUORUM do in this case?  Is there a "rack quorum"?  Or
> does the natural latencies of AZs make LOCAL_QUORUM behave like a rack
> quorum?
>
> will
>
> On Tue, Apr 26, 2011 at 1:14 AM, aaron morton wrote:
>
>> For background see this article:
>>
>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>
>>
>> And
>> this recent discussion
>> http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html
>>
>> Issues
>> that may be a concern:
>> - lots of cross AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait
>> cross AZ . Also consider it during maintenance tasks, how much of a pain is
>> it going to be to have latency between every node.
>> - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC
>> to handle a single node failure when working at Quorum reduces the utility
>> of the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if
>> you lose one node from the replica set you will not be able to use local
>> QUORUM for keys in that range. Or consider a failure mode

Re: Cluster Installation Verification

2011-04-26 Thread aaron morton

Does not look like there is much data in there :)

Also don't forget to use the datatype functions in the cli to match what your 
app is doing, see help for more details. 
e.g. get MyCf[uuid('something-that-looks-like-a-uuid')]

Also the ring is unbalanced (the Owns column), you will want to assign the 
nodes an initial token so they each take the same portion of the data.
see http://wiki.apache.org/cassandra/Operations#Load_balancing 

Hope that helps. 
Aaron
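
A rough sketch of the token arithmetic behind that advice, assuming the RandomPartitioner and its token space of 0 to 2**127. The helper names are made up; the tokens in the usage example are the ones from the ring output quoted below.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def balanced_tokens(node_count: int) -> list:
    """Evenly spaced initial tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens: list) -> dict:
    """Fraction of the ring owned by each token.

    A node owns the range from the previous token (exclusive) up to its own
    token (inclusive), wrapping around at the end of the ring.
    """
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return {ordered[0]: 1.0}
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index 0 wraps to the last token
        shares[token] = ((token - previous) % RING_SIZE) / RING_SIZE
    return shares


if __name__ == "__main__":
    current = [
        48765642161291719582097000948298276028,
        70033290093850373548557913912783789244,
        91300938026409027515018826877269302460,
        133836233891526335447940652806240328892,
    ]
    for token, share in ownership(current).items():
        print(f"{token}: {share:.2%}")
    print(balanced_tokens(4))
```

Run against the four tokens in Brad's ring, this reproduces the 50% / 12.5% / 12.5% / 25% split in the Owns column, while the four balanced tokens give each node 25%.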

On 27 Apr 2011, at 07:14, Brad Willard wrote:

> The setup is 10.11.6.9 as the seed and the other three nodes
> bootstrapped. I attached two cassandra.yaml files, the config of the
> seed, and the config of one of the cluster nodes.
> 
> Ring output
> /opt/cassandra/apache-cassandra-0.7.4# ./bin/nodetool -h 10.11.6.9 ring
> Address         Status State   Load       Owns    Token
>                                                   133836233891526335447940652806240328892
> 10.11.6.9       Up     Normal  42.53 KB   50.00%  48765642161291719582097000948298276028
> 10.11.6.26      Up     Normal  42.71 KB   12.50%  70033290093850373548557913912783789244
> 10.11.6.11      Up     Normal  42.67 KB   12.50%  91300938026409027515018826877269302460
> 10.11.6.10      Up     Normal  42.64 KB   25.00%  133836233891526335447940652806240328892
> 
> Replication strategy is whatever the default is; I'm not sure how to set it.
> The consistency level in the client is set to one.
> 
> Thanks,
> Brad
> 
> On Tue, Apr 26, 2011 at 2:25 PM, Jonathan Colby
>  wrote:
>> What replication strategy did you use?  how does the ring look?  were the 
>> newly added nodes bootstrapped? is 1 or more nodes listed as a seed?
>> 
>> Lots of questions.  but maybe you could post your cassandra.yaml here and we 
>> can take a look at it.
>> 
>> The output of nodetool ring would also be good.
>> 
>> Jon
>> 
>> On Apr 26, 2011, at 7:27 PM, Brad Willard wrote:
>> 
>>> I'm trying to setup a cassandra cluster with 0.7.4 on 4 nodes. I
>>> initially did a single server test that went beautifully with a test
>>> that inserted 16 million rows with no issues. However when I tried to
>>> create a 4 node cluster I've been seeing weird behavior. I seem to be
>>> able to run my same test without errors, however when I use the
>>> cassandra-cli to look at the data, it appears as though nothing has
>>> been inserted. I verified the server I've connected to, verified the
>>> correct keyspace and column family. I've already used the nodetool to
>>> verify all the other servers are listed in the ring.
>>> 
>>> I followed these instructions for the setup:
>>> http://wiki.apache.org/cassandra/MultinodeCluster
>>> 
>>> So how can I verify my cluster is working correctly?  Any help would
>>> be amazing as I'm evaluating this for my company.
>>> 
>>> Thanks,
>>> Brad
>> 
>> 
>

Re: practice failure recovery

2011-04-26 Thread aaron morton

In 0.7.X the cli waits for the schema to agree before returning, you should 
see...

Waiting for schema agreement...
... schemas agree across the cluster

Or if things fail
The schema has not settled in %d seconds; further migrations are ill-advised 
until it does.%nVersions are %s%n

WRT the error, first guess is that something in the schema has changed and it's 
upsetting the log replay. Given all the crazy, I'd go with the nuclear option. 

Aaron
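
For scripts that drop or create keyspaces in a loop, the same "wait for schema agreement" idea can be sketched as a polling loop. This is an illustration only: fetch_versions is a hypothetical callable that returns a mapping of schema version to the hosts reporting it (whatever your client exposes for that), and the timeout and poll interval are arbitrary.

```python
import time
from typing import Callable, Dict, List


def wait_for_schema_agreement(
    fetch_versions: Callable[[], Dict[str, List[str]]],
    timeout_s: float = 10.0,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every host reports the same schema version.

    Returns True once exactly one version is reported, or False if the
    versions still differ when the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        versions = fetch_versions()
        if len(versions) == 1:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Fake cluster: disagrees on the first poll, agrees on the second.
    answers = iter([
        {"v1": ["10.0.0.1"], "v2": ["10.0.0.2"]},
        {"v2": ["10.0.0.1", "10.0.0.2"]},
    ])
    print(wait_for_schema_agreement(lambda: next(answers), poll_s=0.0))
```

A script would call this after each schema change and stop issuing further migrations if it returns False, which is the behaviour the CLI message above is warning about.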
 
On 27 Apr 2011, at 07:11, William Oberman wrote:

> In my test cluster I managed to jam up a cassandra server.  I figure the easy 
> & failsafe solution is to just boot a replacement node, but I thought I'd try 
> a minute to either figure out what I did, or try to figure out how to 
> properly recover it before I lose my current state.
> 
> The symptom = on startup I get an exception:
> ERROR 11:58:34,567 Exception encountered during startup.
> java.lang.IndexOutOfBoundsException: 6
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
> at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
> at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
> at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
> at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
> at java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:685)
> at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:864)
> at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1893)
> at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:216)
> at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:130)
> at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
> at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
> at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:253)
> at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:156)
> at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:173)
> at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
> at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
> 
> Where things went wrong = I had been doing various testing and unit testing, 
> as this is my "proof of concept" cluster.  The unit tests in particular work 
> by cloning a keyspace as "keyspace_UUID" (to get a blank slate).  Because of 
> various bugs in my code and configuration, this left a fair amount of crud 
> keyspaces by the time I got everything to pass.  So, I wrote a script to drop 
> all of the test keyspaces (the script had worked on a single node 
> environment, which was my first step before the cluster).  I think the CLI 
> doesn't wait for schema propagation, so the script confused the node I was 
> talking to, as after it ran the schema UUIDs of that node vs. the rest of the 
> cluster didn't agree ("describe cluster" in the CLI).  And, it wasn't fixing 
> itself.  "nodetool loadbalance" said it would do a decommission/bootstrap, 
> which I thought might give the bad node a kick in the pants, so I tried it.  
> Afterwards, I ran "nodetool ring" against all nodes and the problem node 
> claimed all was "UP", but everything else listed the problem node as "?" and 
> everything else as UP (sadly, I either didn't check or can't remember what 
> "nodetool ring" said before loadbalance).  So, I shut down the problem node.  
> But, when I tried to restart it, I got the error you see above.
> 
> Not sure what was the worst/dumbest thing I did, but it's definitely unhappy 
> now!

Re: advice for EC2 deployment

2011-04-26 Thread aaron morton

One difference between Cassandra and MySQL replication may be when the network 
IO happens. Was the MySQL replication synchronous on transaction commit ?  I 
was only aware that it had async replication, which means the client is not 
exposed to the network latency. In cassandra the network latency is exposed to 
the client as it needs to wait for the CL number of nodes to respond. 

If you use the PropertyFileSnitch with the NetworkTopologyStrategy you can 
manually assign machines to racks / dc's based on IP. 
See the conf/cassandra-topology.properties file. There is also an Ec2Snitch which 
(from the code) 
/**
 * A snitch that assumes an EC2 region is a DC and an EC2 availability_zone
 *  is a rack. This information is available in the config for the node.

Recent discussion on DC aware CL levels 
http://www.mail-archive.com/user@cassandra.apache.org/msg11414.html

Hope that helps.
Aaron
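
To illustrate the "assign machines to racks / dc's based on IP" idea, here is a sketch that parses a topology file of the form IP=DC:RACK with an optional default entry, which is the layout I understand the properties-file snitch to use. The parser is my own illustration, and the addresses, DC names and rack names are invented.

```python
from typing import Dict, Optional, Tuple

SAMPLE_TOPOLOGY = """
# Hypothetical layout: each AZ treated as its own DC
10.0.1.10=az-a:rack1
10.0.1.11=az-a:rack1
10.0.2.10=az-b:rack1
default=az-a:rack1
"""


def parse_topology(text: str) -> Dict[str, Tuple[str, str]]:
    """Map each host (or 'default') to a (datacenter, rack) pair."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host, _, location = line.partition("=")
        datacenter, _, rack = location.partition(":")
        mapping[host.strip()] = (datacenter.strip(), rack.strip())
    return mapping


def locate(mapping: Dict[str, Tuple[str, str]], ip: str) -> Optional[Tuple[str, str]]:
    """Look up an IP, falling back to the 'default' entry if there is one."""
    return mapping.get(ip, mapping.get("default"))


if __name__ == "__main__":
    topology = parse_topology(SAMPLE_TOPOLOGY)
    print(locate(topology, "10.0.2.10"))
    print(locate(topology, "10.0.9.99"))
```

Whether each AZ is labelled as a DC or as a rack is then just a matter of what goes on the right-hand side of each line.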
 

On 27 Apr 2011, at 01:18, William Oberman wrote:

> Thanks Aaron!
> 
> Unless no one on this list uses EC2, there were a few minor troubles end of 
> last week through the weekend which taught me a lot about obscure failure 
> modes in various applications I use :-)  My original post was trying to be 
> more redundant than fast, which has been my overall goal from even before 
> moving to Cassandra (my downtime from the EC2 madness was minimal, and due to 
> only having one single point of failure == the amazon load balancer).  My 
> secondary goal was trying to make moving to a second region easier, but if 
> that is causing problems I can drop the idea.
> 
> I might be downplaying the cost of inter-AZ communication, but I've lived 
> with that for quite some time, for example my current setup of MySQL in 
> Master-Master replication is split over zones, and my webservers live in yet 
> different zones.  Maybe Cassandra is "chattier" than I'm used to?  (again, 
> I'm fairly new to cassandra)
> 
> Based on that article, the discussion, and the recent EC2 issues, it sounds 
> like it would be better to start with:
> -6 nodes split in two AZs 3/3
> -Configure replication to do 2 in one AZ and one in the other 
> (NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this happen 
> naturally?)
> -What does LOCAL_QUORUM do in this case?  Is there a "rack quorum"?  Or does 
> the natural latencies of AZs make LOCAL_QUORUM behave like a rack quorum?
> 
> will
> 
> On Tue, Apr 26, 2011 at 1:14 AM, aaron morton  wrote:
> For background see this article:
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
> 
> And this recent discussion 
> http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html
> 
> Issues that may be a concern:
> - lots of cross AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait cross 
> AZ . Also consider it during maintenance tasks, how much of a pain is it 
> going to be to have latency between every node.   
> - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC to 
> handle a single node failure when working at Quorum reduces the utility of 
> the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if you 
> lose one node from the replica set you will not be able to use local QUORUM 
> for keys in that range. Or consider a failure mode where the west is 
> disconnected from the east.
> 
> Could you start simple with 3 replicas in one AZ in us-east and 3 replicas in 
> an AZ+Region ?  Then work through some failure scenarios.  
> 
> Hope that helps. 
> Aaron
>   
> 
> On 22 Apr 2011, at 03:28, William Oberman wrote:
> 
>> Hi,
>> 
>> My service is not yet ready to be fully multi-DC, due to how some of my 
>> legacy MySQL stuff works.  But, I wanted to get cassandra going ASAP and 
>> work towards multi-DC.  I have two main cassandra use cases: one where I can 
>> handle eventual consistency (and all of the writes/reads are currently ONE), 
>> and one where I can't (writes/reads are currently QUORUM).  My test cluster 
>> is currently 4 smalls all in us-east with RF=3 (more to prove I can do 
>> clustering than to have an exact production replica).  All of my unit 
>> tests, and "load tests" (again, not to prove true max load, but more to 
>> tease out concurrency issues) are passing now.
>> 
>> For production, I was thinking of doing:
>> -4 cassandra larges in us-east (where I am now), one in each AZ
>> -1 cassandra large in us-west (where I have nothing)
>> For now, my data can fit into a single large's 2 disk ephemeral using RAID0, 
>> and I was then thinking of doing a RF=3 with us-east=2 and us-west=1.  If I 
>> do eventual consistency at ONE, and consistency at LOCAL_QUORUM, I was 
>> hoping:
>> -eventual consistency ops would be really fast
>> -consistent ops would be pretty fast (what does LOCAL_QUORUM do in this 
>> case?  return after 1 or 2 us-east nodes ack?)
>> -us-west would contain a complete copy of my data, so it's a good eventually 
>> consistent "close to real time" backup  (assuming it can keep up ove

Re: 0.8 losing nodes?

2011-04-26 Thread Brandon Williams

On Mon, Apr 25, 2011 at 12:21 PM, Jonathan Ellis  wrote:
> I bet the problem is with the other tasks on the executor that Gossip
> heartbeat runs on.
>
> I see at least two that could cause blocking: hint cleanup
> post-delivery and flush-expired-memtables, both of which call
> forceFlush which will block if the flush queue + threads are full.
>
> We've run into this before (CASSANDRA-2253); we should move Gossip
> back to its own dedicated executor or it will keep happening whenever
> someone accidentally puts something on the "shared" executor that can
> block.
>
> Created https://issues.apache.org/jira/browse/CASSANDRA-2554 to fix
> this.  Thanks for tracking down the problem!

This is good to have too, but isn't the problem: I broke it in the
gossiper refactoring.

https://issues.apache.org/jira/browse/CASSANDRA-2565

-Brandon
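
The shared-executor hazard Jonathan describes in the quoted text (worth fixing, even though it turned out not to be the cause here) can be shown with a toy model. This is Python's concurrent.futures, not Cassandra's executors; the function names are invented, and the single-worker pool stands in for a "shared" executor whose only thread is stuck in a blocking flush.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def heartbeat_completes(dedicated_executor: bool, timeout_s: float = 0.5) -> bool:
    """Return True if a heartbeat task finishes while another task is blocked."""
    release = threading.Event()
    started = threading.Event()
    shared = ThreadPoolExecutor(max_workers=1)
    gossip = ThreadPoolExecutor(max_workers=1) if dedicated_executor else shared

    def blocking_flush() -> None:
        started.set()
        release.wait()  # stands in for a flush stuck behind a full queue

    try:
        shared.submit(blocking_flush)
        started.wait()  # make sure the only shared worker is occupied
        beat = gossip.submit(lambda: "heartbeat")
        try:
            beat.result(timeout=timeout_s)
            return True
        except FutureTimeout:
            return False
    finally:
        release.set()  # unblock the flush so the pools can shut down
        shared.shutdown(wait=True)
        if gossip is not shared:
            gossip.shutdown(wait=True)


if __name__ == "__main__":
    print("shared executor:   ", heartbeat_completes(dedicated_executor=False))
    print("dedicated executor:", heartbeat_completes(dedicated_executor=True))
```

On the shared pool the heartbeat sits in the queue behind the blocked task and misses its deadline; on its own pool it runs straight away.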

Re: Cluster Installation Verification

2011-04-26 Thread Brad Willard

The setup is 10.11.6.9 as the seed and the other three nodes
bootstrapped. I attached two cassandra.yaml files, the config of the
seed, and the config of one of the cluster nodes.

Ring output
/opt/cassandra/apache-cassandra-0.7.4# ./bin/nodetool -h 10.11.6.9 ring
Address         Status State   Load       Owns    Token
                                                  133836233891526335447940652806240328892
10.11.6.9       Up     Normal  42.53 KB   50.00%  48765642161291719582097000948298276028
10.11.6.26      Up     Normal  42.71 KB   12.50%  70033290093850373548557913912783789244
10.11.6.11      Up     Normal  42.67 KB   12.50%  91300938026409027515018826877269302460
10.11.6.10      Up     Normal  42.64 KB   25.00%  133836233891526335447940652806240328892

Replication strategy is whatever the default is; I'm not sure how to set it.
The consistency level in the client is set to one.

Thanks,
Brad

On Tue, Apr 26, 2011 at 2:25 PM, Jonathan Colby
 wrote:
> What replication strategy did you use?  how does the ring look?  were the 
> newly added nodes bootstrapped? is 1 or more nodes listed as a seed?
>
> Lots of questions.  but maybe you could post your cassandra.yaml here and we 
> can take a look at it.
>
> The output of nodetool ring would also be good.
>
> Jon
>
> On Apr 26, 2011, at 7:27 PM, Brad Willard wrote:
>
>> I'm trying to setup a cassandra cluster with 0.7.4 on 4 nodes. I
>> initially did a single server test that went beautifully with a test
>> that inserted 16 million rows with no issues. However when I tried to
>> create a 4 node cluster I've been seeing weird behavior. I seem to be
>> able to run my same test without errors, however when I use the
>> cassandra-cli to look at the data, it appears as though nothing has
>> been inserted. I verified the server I've connected to, verified the
>> correct keyspace and column family. I've already used the nodetool to
>> verify all the other servers are listed in the ring.
>>
>> I followed these instructions for the setup:
>> http://wiki.apache.org/cassandra/MultinodeCluster
>>
>> So how can I verify my cluster is working correctly?  Any help would
>> be amazing as I'm evaluating this for my company.
>>
>> Thanks,
>> Brad
>
>


cassandra.nonseed.yaml
Description: Binary data


cassandra.seed.yaml
Description: Binary data

practice failure recovery

2011-04-26 Thread William Oberman

In my test cluster I managed to jam up a cassandra server.  I figure the easy
& failsafe solution is to just boot a replacement node, but I thought I'd
try a minute to either figure out what I did, or try to figure out how to
properly recover it before I lose my current state.

The symptom = on startup I get an exception:
ERROR 11:58:34,567 Exception encountered during startup.
java.lang.IndexOutOfBoundsException: 6
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:121)
at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(ConcurrentSkipListMap.java:606)
at java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:685)
at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:864)
at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1893)
at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:216)
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:130)
at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:120)
at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:380)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:253)
at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:156)
at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:173)
at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)

Where things went wrong = I had been doing various testing and unit testing,
as this is my "proof of concept" cluster.  The unit tests in particular work
by cloning a keyspace as "keyspace_UUID" (to get a blank slate).  Because of
various bugs in my code and configuration, this left a fair amount of crud
keyspaces by the time I got everything to pass.  So, I wrote a script to
drop all of the test keyspaces (the script had worked on a single node
environment, which was my first step before the cluster).  I think the CLI
doesn't wait for schema propagation, so the script confused the node I was
talking to, as after it ran the schema UUIDs of that node vs. the rest of
the cluster didn't agree ("describe cluster" in the CLI).  And, it wasn't
fixing itself.  "nodetool loadbalance" said it would do a
decommission/bootstrap, which I thought might give the bad node a kick in
the pants, so I tried it.  Afterwards, I ran "nodetool ring" against all
nodes and the problem node claimed all was "UP", but everything else listed
the problem node as "?" and everything else as UP (sadly, I either didn't
check or can't remember what "nodetool ring" said before loadbalance).  So,
I shut down the problem node.  But, when I tried to restart it, I got the
error you see above.

Not sure what was the worst/dumbest thing I did, but it's definitely unhappy
now!
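
One possible reading of the trace above, and it is only a guess: commit log replay is sorting column names with a time-UUID comparator, and one of the replayed names is shorter than a UUID, so reading its timestamp bytes runs off the end of the buffer. The sketch below uses Python's uuid module rather than Cassandra's TimeUUIDType, and the function names are mine; it just shows why a short name cannot be compared this way.

```python
import uuid


def timeuuid_timestamp(name: bytes) -> int:
    """Extract the 60-bit timestamp from a 16-byte time-based UUID."""
    if len(name) != 16:
        # A shorter buffer has no timestamp bytes to read at all.
        raise ValueError(f"expected 16 bytes for a time UUID, got {len(name)}")
    return uuid.UUID(bytes=name).time


def compare_timeuuid_names(first: bytes, second: bytes) -> int:
    """Return -1, 0 or 1 by comparing the embedded timestamps."""
    a, b = timeuuid_timestamp(first), timeuuid_timestamp(second)
    return (a > b) - (a < b)


if __name__ == "__main__":
    older = uuid.uuid1().bytes
    newer = uuid.uuid1().bytes
    print(compare_timeuuid_names(older, newer))
    try:
        compare_timeuuid_names(older, b"name")  # not a UUID at all
    except ValueError as error:
        print("cannot compare:", error)
```

If that reading is right, it would fit with a column family whose comparator no longer matches the data sitting in the commit log, but the trace alone does not prove it.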

Re: decommissioning a wrong node

2011-04-26 Thread Tomas Vondra

Dne 26.4.2011 09:21, aaron morton napsal(a):
> There is the fabled
> 
> java.rmi.server.hostname 
> 
> http://blog.reactive.org/2011/02/connecting-to-cassandra-jmx-via-ssh.html
> http://download.oracle.com/javase/1.4.2/docs/guide/rmi/javarmiproperties.html
> 
> Not sure if anyone has got it working correctly. 

Thanks, at least I know I'm not the only one who noticed this issue.

Tomas

Re: Cluster Installation Verification

2011-04-26 Thread Jonathan Colby

What replication strategy did you use?  how does the ring look?  were the newly 
added nodes bootstrapped? is 1 or more nodes listed as a seed?

Lots of questions.  but maybe you could post your cassandra.yaml here and we 
can take a look at it.

The output of nodetool ring would also be good.

Jon 

On Apr 26, 2011, at 7:27 PM, Brad Willard wrote:

> I'm trying to setup a cassandra cluster with 0.7.4 on 4 nodes. I
> initially did a single server test that went beautifully with a test
> that inserted 16 million rows with no issues. However when I tried to
> create a 4 node cluster I've been seeing weird behavior. I seem to be
> able to run my same test without errors, however when I use the
> cassandra-cli to look at the data, it appears as though nothing has
> been inserted. I verified the server I've connected to, verified the
> correct keyspace and column family. I've already used the nodetool to
> verify all the other servers are listed in the ring.
> 
> I followed these instructions for the setup:
> http://wiki.apache.org/cassandra/MultinodeCluster
> 
> So how can I verify my cluster is working correctly?  Any help would
> be amazing as I'm evaluating this for my company.
> 
> Thanks,
> Brad

Re: IP address resolution in MultiDC setup (EC2)/VIP

2011-04-26 Thread Milind Parikh

At the risk of repeating the previous conclusions:

(a) This configuration obviates the need for a patch that I had posted
earlier. This is a good thing.
(b) The reported latency(@Sasha) is less than ordinary latencies in EC2. The
reasons behind this are not well understood. However I wouldn't look a gift
horse in the mouth.
(c) This configuration provides cloud provider independence for those
interested in such things; although YMMV in context of (b).
(d) This configuration can be run instead of the security configurations in
C0.8 for certain use-cases for secure communications.

Regards
Milind

On Tue, Apr 26, 2011 at 1:20 PM, Sasha Dolgy  wrote:

> Ok, on each node, I have configured the listen address for cassandra
> as the VIP interface (tunXXX). This allows other cassandra instances
> to connect ONLY through the VPN network. The listen address is not
> configured for the eth0 interface (EC2).
>
> rpc_address is set to 0.0.0.0 so that it can listen on all interfaces.
> if it's left blank, it will default to the value of the listen
> configuration ... which would
> mean all appserver -> cassandra traffic would be routed through the
> VPN connection (not what I want).
>
> When looking at netstat, I see the following on a node:
>
> tcp 0 0 0.0.0.0:9160 0.0.0.0:* LISTEN
> tcp 0 0 172.16.1.7:7000 0.0.0.0:* LISTEN
>
> 9160 allows clients to connect to the environment to GET/PUT data
> while the VPN interface is for node to node, secured, communication.
>
> As you see, I'm not referencing the EC2 IP anywhere in the
> configuration.  This allows me to leverage rackspace, amazon or any
> other services provider ... so long as my vpn tunnels are configured
> appropriate for each endpoint / environment.
>
> -sd
>
>
> On Tue, Apr 26, 2011 at 3:55 PM, pankaj soni 
> wrote:
> > Hi,
> > I have a question regarding Vyatta or any providing VIP in general. While
> > routing through gateway do we bind it to ec2 nodes private IP or public
> IP?
> > Also, in general could you explain how VIP might help for I am new
> towards
> > this side of field.
> >
> > thanks
>

Cluster Installation Verification

2011-04-26 Thread Brad Willard

I'm trying to set up a cassandra cluster with 0.7.4 on 4 nodes. I
initially did a single server test that went beautifully with a test
that inserted 16 million rows with no issues. However when I tried to
create a 4 node cluster I've been seeing weird behavior. I seem to be
able to run my same test without errors, however when I use the
cassandra-cli to look at the data, it appears as though nothing has
been inserted. I verified the server I've connected to, verified the
correct keyspace and column family. I've already used the nodetool to
verify all the other servers are listed in the ring.

I followed these instructions for the setup:
http://wiki.apache.org/cassandra/MultinodeCluster

So how can I verify my cluster is working correctly?  Any help would
be amazing as I'm evaluating this for my company.

Thanks,
Brad

Re: IP address resolution in MultiDC setup (EC2)/VIP

2011-04-26 Thread Sasha Dolgy

Ok, on each node, I have configured the listen address for cassandra
as the VIP interface (tunXXX). This allows other cassandra instances
to connect ONLY through the VPN network. The listen address is not
configured for the eth0 interface (EC2).

rpc_address is set to 0.0.0.0 so that it can listen on all interfaces.
If it's left blank, it will default to the value of the listen
configuration ... which would mean all appserver -> cassandra traffic
would be routed through the VPN connection (not what I want).

When looking at netstat, I see the following on a node:

tcp 0 0 0.0.0.0:9160 0.0.0.0:* LISTEN
tcp 0 0 172.16.1.7:7000 0.0.0.0:* LISTEN

9160 allows clients to connect to the environment to GET/PUT data
while the VPN interface is for node to node, secured, communication.

As you see, I'm not referencing the EC2 IP anywhere in the
configuration.  This allows me to leverage rackspace, amazon or any
other services provider ... so long as my vpn tunnels are configured
appropriately for each endpoint / environment.
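
As a rough illustration of the split described above, the following Python sketch classifies netstat LISTEN lines by whether a port is bound to the wildcard address or to one specific interface. The helper name `classify_listeners` is hypothetical, and the sample input is just the two lines quoted above; real netstat output varies between platforms, so treat the column positions as an assumption.

```python
def classify_listeners(netstat_lines):
    """Map each listening port to 'all interfaces' or its specific bind address."""
    result = {}
    for line in netstat_lines:
        fields = line.split()
        # Assumed layout: proto recv-q send-q local-address foreign-address state
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        host, _, port = fields[3].rpartition(":")
        result[int(port)] = "all interfaces" if host == "0.0.0.0" else host
    return result


# The two lines from the netstat output shown above.
sample = [
    "tcp 0 0 0.0.0.0:9160 0.0.0.0:* LISTEN",
    "tcp 0 0 172.16.1.7:7000 0.0.0.0:* LISTEN",
]
for port, bind in sorted(classify_listeners(sample).items()):
    print(port, bind)
```

With this setup one would expect the client port (9160) to show up as bound to all interfaces and the storage port (7000) to show up only on the VPN address.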

-sd

On Tue, Apr 26, 2011 at 3:55 PM, pankaj soni  wrote:
> Hi,
> I have a question regarding Vyatta or any providing VIP in general. While
> routing through gateway do we bind it to ec2 nodes private IP or public IP?
> Also, in general could you explain how VIP might help for I am new towards
> this side of field.
>
> thanks

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Jonathan Ellis

https://issues.apache.org/jira/browse/CASSANDRA-2549 is open to fix this

On Tue, Apr 26, 2011 at 9:41 AM, Pierre-Yves Ritschard
 wrote:
>
> On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
>> I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.
>>
>
> Hi,
>
> First of all thanks for this release, here are a few annoyances I
> spotted while trying it out the published debian packages:
>
> The cassandra-env.sh is ran by /bin/sh and uses constructs not available
> in plain sh. this is fixed in 76fa3204 by Jonathan Ellis
>
> Starting the daemon provided in the Debian package will constantly fail
> due to thrift exception classes not being found. This stems from the
> fact that the generated thrift classes are not included in the jar
> available in the Debian package, comparing the 0.7.3 package and the
> 0.8.0-beta1 reveals the following difference:
>
> On 0.7.3:
> #unzip -l /usr/share/cassandra/apache-cassandra.jar | grep
> InvalidRequestException.class
>     1426  2011-03-11 18:23
> org/apache/cassandra/avro/InvalidRequestException.class
>     8752  2011-03-11 18:23
> org/apache/cassandra/thrift/InvalidRequestException.class
>
> On 0.8.0-beta1:
> # unzip -l /usr/share/cassandra/apache-cassandra.jar | grep
> InvalidRequestException.class
>
> I hope this helps.
>
> - pyr
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Pierre-Yves Ritschard

On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
> I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.
> 

Hi,

First of all, thanks for this release. Here are a few annoyances I
spotted while trying out the published Debian packages:

The cassandra-env.sh is run by /bin/sh and uses constructs not available
in plain sh. This is fixed in 76fa3204 by Jonathan Ellis.

Starting the daemon provided in the Debian package will constantly fail
due to thrift exception classes not being found. This stems from the
fact that the generated thrift classes are not included in the jar
available in the Debian package. Comparing the 0.7.3 package and the
0.8.0-beta1 package reveals the following difference:

On 0.7.3:
# unzip -l /usr/share/cassandra/apache-cassandra.jar | grep InvalidRequestException.class
     1426  2011-03-11 18:23   org/apache/cassandra/avro/InvalidRequestException.class
     8752  2011-03-11 18:23   org/apache/cassandra/thrift/InvalidRequestException.class

On 0.8.0-beta1:
# unzip -l /usr/share/cassandra/apache-cassandra.jar | grep InvalidRequestException.class
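
The same check can be scripted when comparing several package versions. This is only a sketch using Python's standard zipfile module; the function name `find_entries` is made up for illustration, and the jar path in the usage comment is the one from the commands above.

```python
import sys
import zipfile


def find_entries(jar_path, suffix):
    """Return the archive entries whose names end with the given suffix."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist() if name.endswith(suffix))


if __name__ == "__main__":
    # Example invocation (script name is arbitrary):
    #   python find_entries.py /usr/share/cassandra/apache-cassandra.jar \
    #       InvalidRequestException.class
    if len(sys.argv) == 3:
        for entry in find_entries(sys.argv[1], sys.argv[2]):
            print(entry)
```

An empty result for a jar corresponds to the empty grep output shown for 0.8.0-beta1.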

I hope this helps.

- pyr

Re: IP address resolution in MultiDC setup

2011-04-26 Thread pankaj soni

Hi,

I have a question regarding Vyatta, or any product providing a VIP in general.
When routing through a gateway, do we bind it to the EC2 node's private IP or
its public IP?

Also, could you explain in general how a VIP might help? I am new to
this side of the field.


thanks

On Mon, Apr 25, 2011 at 9:47 PM, Sasha Dolgy  wrote:

> honest opinion?  smoke and mirrors.  i really have no idea.  i was
> surprised to see the latency drop when we started using the VIP's we
> assigned routing through our ec2 vyatta gateways.  it makes it nice
> because it unties you from being 100% stuck on amazon.  you can design
> your environment for cassandra with local nodes in an office if you
> wanted ... it also solved the security problems i was coming across in
> that before cassandra 0.8, intra-node communication IS NOT encrypted
> or secured
>
> anyway .. the biggest thing for me was to ensure we are not tied to
> one provider.  this was the best for my business case ... also allowed
> us to not be harmed by the
> https://twitter.com/#!/search/amazonpocalypse ...
>
> -sd
>
>
> On Mon, Apr 25, 2011 at 6:11 PM, Milind Parikh 
> wrote:
> > @Sasha
> > Very interesting that you find a big difference in latency between nodes.
> > Any hypothesis on what is going on in internal aws routing that makes it
> > inefficient?
> > Milind
>

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Stephen Connolly

I will be calling the release vote on dev when I get a chance.

In the meantime, the staged artifacts are at
https://repository.apache.org/content/repositories/orgapachecassandra-114/

On 26 April 2011 13:27, Stephen Connolly
 wrote:
> beta versions will be available from releases repo.
>
> You can help validate the poms when I call the release vote.
>
> On 26 April 2011 13:15, Mck  wrote:
>> On Tue, 2011-04-26 at 12:53 +0100, Stephen Connolly wrote:
>>> (or did you want 20million unneeded deps for the
>>> client jars?)
>>
>> Yes that's a good reason :-)
>> If there anything i can help with?
>>
>> Will beta versions be available under releases repository?
>>
>>
>> ~mck
>>
>>
>

Re: advice for EC2 deployment

2011-04-26 Thread William Oberman

Thanks Aaron!

Unless no one on this list uses EC2, there were a few minor troubles end of
last week through the weekend which taught me a lot about obscure failure
modes in various applications I use :-)  My original post was trying to be
more redundant than fast, which has been my overall goal from even before
moving to Cassandra (my downtime from the EC2 madness was minimal, and due
to only having one single point of failure == the amazon load balancer).  My
secondary goal was trying to make moving to a second region easier, but if
that is causing problems I can drop the idea.

I might be downplaying the cost of inter-AZ communication, but I've lived
with that for quite some time, for example my current setup of MySQL in
Master-Master replication is split over zones, and my webservers live in yet
different zones.  Maybe Cassandra is "chattier" than I'm used to?  (again,
I'm fairly new to cassandra)

Based on that article, the discussion, and the recent EC2 issues, it sounds
like it would be better to start with:
-6 nodes split in two AZs 3/3
-Configure replication to do 2 in one AZ and one in the other
(NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this
happen naturally?)
-What does LOCAL_QUORUM do in this case?  Is there a "rack quorum"?  Or does
the natural latencies of AZs make LOCAL_QUORUM behave like a rack quorum?
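
For reference, the quorum arithmetic behind these questions can be sketched as below. It assumes the usual majority rule (quorum = floor(RF / 2) + 1) and that LOCAL_QUORUM applies it to the replica count of the local data center only; the data center names and RF values are illustrative, and the sketch says nothing about rack (AZ) placement.

```python
def quorum(replicas):
    """Quorum size: a strict majority of the replicas."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be down while a quorum is still reachable."""
    return replicas - quorum(replicas)


# Hypothetical per-data-center replica counts, for illustration only.
for name, rf in [("us-east", 3), ("us-west", 2)]:
    print(name, "RF =", rf, "quorum =", quorum(rf),
          "can lose", tolerated_failures(rf))
```

Under these assumptions a data center with 3 replicas can lose one of them and still serve LOCAL_QUORUM, while one with 2 replicas cannot lose any.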

will

On Tue, Apr 26, 2011 at 1:14 AM, aaron morton wrote:

> For background see this article:
>
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>
>
> And
> this recent discussion
> http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html
>
> Issues
> that may be a concern:
> - lots of cross AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait
> cross AZ . Also consider it during maintenance tasks, how much of a pain is
> it going to be to have latency between every node.
> - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC
> to handle a single node failure when working at Quorum reduces the utility
> of the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if
> you lose one node from the replica set you will not be able to use local
> QUORUM for keys in that range. Or consider a failure mode where the west is
> disconnected from the east.
>
> Could you start simple with 3 replicas in one AZ in us-east and 3 replicas
> in an AZ+Region ?  Then work through some failure scenarios.
>
> Hope that helps.
> Aaron
>
>
> On 22 Apr 2011, at 03:28, William Oberman wrote:
>
> Hi,
>
> My service is not yet ready to be fully multi-DC, due to how some of my
> legacy MySQL stuff works.  But, I wanted to get cassandra going ASAP and
> work towards multi-DC.  I have two main cassandra use cases: one where I can
> handle eventual consistency (and all of the writes/reads are currently ONE),
> and one where I can't (writes/reads are currently QUORUM).  My test cluster
> is currently 4 smalls all in us-east with RF=3 (more to prove I can
> clustering, than to have an exact production replica).  All of my unit
> tests, and "load tests" (again, not to prove true max load, but to more to
> tease out concurrency issues) are passing now.
>
> For production, I was thinking of doing:
> -4 cassandra larges in us-east (where I am now), one in each AZ
> -1 cassandra large in us-west (where I have nothing)
> For now, my data can fit into a single large's 2 disk ephemeral using
> RAID0, and I was then thinking of doing a RF=3 with us-east=2 and
> us-west=1.  If I do eventual consistency at ONE, and consistency at
> LOCAL_QUORUM, I was hoping:
> -eventual consistency ops would be really fast
> -consistent ops would be pretty fast (what does LOCAL_QUORUM do in this
> case?  return after 1 or 2 us-east nodes ack?)
> -us-west would contain a complete copy of my data, so it's a good
> eventually consistent "close to real time" backup  (assuming it can keep up
> over long periods of time, but I think it should)
> -eventually, when I'm ready to roll out in us-west I'll be able to change
> the replication settings and that server in us-west could help seed new
> cassandra instances faster than the ones in us-east
>
> Or am I missing something really fundamental about how cassandra works
> making this a terrible plan?  I should have plenty of time to get my
> multi-DC working before the instance in us-west fills up (but even then, I
> should be able to add instances over there to stall fairly trivially,
> right?).
>
> Thanks!
>
> will
>
>
>


-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Stephen Connolly

beta versions will be available from releases repo.

You can help validate the poms when I call the release vote.

On 26 April 2011 13:15, Mck  wrote:
> On Tue, 2011-04-26 at 12:53 +0100, Stephen Connolly wrote:
>> (or did you want 20million unneeded deps for the
>> client jars?)
>
> Yes that's a good reason :-)
> If there anything i can help with?
>
> Will beta versions be available under releases repository?
>
>
> ~mck
>
>

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Mck

On Tue, 2011-04-26 at 12:53 +0100, Stephen Connolly wrote:
> (or did you want 20million unneeded deps for the
> client jars?) 

Yes, that's a good reason :-)
Is there anything I can help with?

Will beta versions be available under releases repository?


~mck

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Stephen Connolly

On 26 April 2011 10:37, Mck  wrote:
> On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
>> I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.
>
>
> *Truly Awesome!*
>  CQL rocks in so many ways.
>
>
> Is 0.8.0-beta1 available in apache's maven repository?
>  And if not, why not?

Because I'm still validating that I have the poms minimized for the
multiple artifacts (or did you want 20million unneeded deps for the
client jars?)

>
> ~mck
>
>
>
>

Re: IP address resolution in MultiDC setup

2011-04-26 Thread Milind Parikh

You can't route traffic over private IPs across data centers. This is the
point of the patch.

/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/

On Apr 26, 2011 6:59 AM, "pankaj soni"  wrote:

one last doubt is pending after reading your document:

1. when deploying cassandra across multiple dcs using your patch, is it
possible to have internal network of nodes in each data center talking over
private ip? then I assume the node with public ip will act as coordinator.
But if it goes down the link between data centers will be down?

could you clear this one.

thnks
pankaj

On Mon, Apr 25, 2011 at 7:00 PM, pankaj soni 
wrote:
>
> scrap the last ...

RE: 0.7.4 no longer installable?

2011-04-26 Thread Gert van der Spoel

Alternatively you could get the deb file at:
http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/

and go for dpkg -i cassandra_0.7.4_all.deb

CU,

Gert



> -----Original Message-----
> From: Watanabe Maki [mailto:watanabe.m...@gmail.com]
> Sent: Tuesday, 26 April 2011 13:46
> To: user@cassandra.apache.org
> Cc: user@cassandra.apache.org
> Subject: Re: 0.7.4 no longer installable?
> 
> How about downloading the binary kit manually?
> http://cassandra.apache.org/
> 
> From iPhone
> 
> 
> On 2011/04/26, at 18:55, Luke Biddell  wrote:
> 
> > Chaps,
> >
> > We're using 0.7.4 here and aren't ready to go to 0.8 just yet. If I do
> apt-get install cassandra=0.7.4 on a clean machine it appears to be
> unavailable. Am I doing something wrong?
> >
> > Thanks
> >
> > Luke

Re: IP address resolution in MultiDC setup

2011-04-26 Thread pankaj soni

one last doubt is pending after reading your document:

1. when deploying cassandra across multiple dcs using your patch, is it
possible to have internal network of nodes in each data center talking over
private ip? then I assume the node with public ip will act as coordinator.
But if it goes down the link between data centers will be down?

could you clear this one.

thnks
pankaj

On Mon, Apr 25, 2011 at 7:00 PM, pankaj soni wrote:

> scrap the last mail, just finished reading Amazon ec2 resource policy.
>
> @milind when deploying cassandra across multiple dcs using your patch, is
> it possible to have internal network of nodes in each data center talking
> over private ip?
> then I assume the node with public ip will act as co-ordinator. If it goes
> down the link between data centers will be down?
>
> Thanks
> pankaj
>
>
> On Mon, Apr 25, 2011 at 6:09 PM, pankaj soni wrote:
>
>> Just read your paper on this. Must say helped a great deal.
>>
>> 1 more query does amazon by default award both external and internal IP
>> address for each node? or we have to explicitly buy the external IP's?
>>
>> I am looking into overlay n/w's.
>>
>>
>> On Mon, Apr 25, 2011 at 5:20 PM, Milind Parikh wrote:
>>
>>> I stand corrected... I show how cassandra can be deployed in multiple dcs
>>> through a simple patch; using public ips. In your scenario with an overlay
>>> n/w, you will not require this patch.
>>>
>>> /***
>>> sent from my android...please pardon occasional typos as I respond @ the
>>> speed of thought
>>> /
>>>
>>> On Apr 25, 2011 7:43 AM, "Milind Parikh"  wrote:
>>>
>>> I have authored exactly this paper... please search this ml. Please be
>>> aware about ec2's internal network as you design your deployment. Ec2 also
>>> does not support multicast, which is a pain, but not insurmountable.
>>>
>>>
>>>
>>> /***
>>> sent from my android...please pardon occasional typos as I respond @ the
>>> ...
>>>
>>>
>>> >
>>> > On Apr 25, 2011 7:31 AM, "pankaj soni" 
>>> wrote:
>>> >
>>> > We are expecting t...
>>>
>>> pankaj
>>>
>>>
>>> >
>>> >
>>> >
>>> > On Mon, Apr 25, 2011 at 4:55 PM, Milind Parikh 
>>> wrote:
>>> > >
>>> > It will be thro...
>>>
>>>
>>
>

Re: 0.7.4 no longer installable?

2011-04-26 Thread Watanabe Maki

How about downloading the binary kit manually?
http://cassandra.apache.org/

From iPhone

On 2011/04/26, at 18:55, Luke Biddell  wrote:

> Chaps,
> 
> We're using 0.7.4 here and aren't ready to go to 0.8 just yet. If I do 
> apt-get install cassandra=0.7.4 on a clean machine it appears to be 
> unavailable. Am I doing something wrong?
> 
> Thanks
> 
> Luke

0.7.4 no longer installable?

2011-04-26 Thread Luke Biddell

Chaps,

We're using 0.7.4 here and aren't ready to go to 0.8 just yet. If I do
apt-get install cassandra=0.7.4 on a clean machine it appears to be
unavailable. Am I doing something wrong?

Thanks

Luke

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-26 Thread Mck

On Fri, 2011-04-22 at 16:49 -0500, Eric Evans wrote:
> I am pleased to announce the release of Apache Cassandra 0.8.0 beta1.


*Truly Awesome!*  
  CQL rocks in so many ways. 


Is 0.8.0-beta1 available in apache's maven repository?
  And if not, why not? 

~mck

AW: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-26 Thread Johannes Hoerle

I am not really deep into 0.6.x configuration, but what is the listening address 
in your storage conf?
localhost/127.0.0.1, or maybe the internal/public IP of the host? I think you 
need to connect to the one defined there.

johannes

From: Sasha Dolgy [mailto:sdo...@gmail.com]
Sent: Tuesday, 26 April 2011 10:39
To: user@cassandra.apache.org
Subject: Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;


When cassandra is not running:  netstat -an | grep 8080

anything showing up?
On Apr 26, 2011 10:04 AM, "Ali Ahsan" <ali.ah...@panasiangroup.com> wrote:
> Thanks for quick reply but i am not using ec2,Its a physical machine.
> On 04/26/2011 12:59 PM, Sasha Dolgy wrote:
>> on ec2, we had to change the default jmx port ... but that's with
>> 0.7.x. instead of 8080 we opted for 9090 and that solved our
>> problems.
>> -sd
>>
>> On Tue, Apr 26, 2011 at 9:57 AM, Ali Ahsan <ali.ah...@panasiangroup.com> wrote:
>>> I have added in cassandra.in.sh 
>>> -Djava.rmi.server.hostname=myipaddress but
>>> still getting same error any ideas ?
>>>
>>> On 04/17/2011 02:24 AM, Tyler Hobbs wrote:
>>>
>>> http://wiki.apache.org/cassandra/JmxGotchas
>>>
>>> On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan <ali.ah...@panasiangroup.com>
>>> wrote:
 Any one have solution for this problem ?


 On 04/13/2011 05:20 PM, Ali Ahsan wrote:
> I am not running any firewall this physical machine not EC2,I can telnet
> to port 8080
>
>
> telnet 127.0.0.1 8080
> Trying 127.0.0.1...
> Connected to localhost.localdomain (127.0.0.1).
> Escape character is '^]'.
>
>
>

 --
 S.Ali Ahsan

 Senior System Engineer

 e-Business (Pvt) Ltd

 49-C Jail Road, Lahore, P.O. Box 676
 Lahore 54000, Pakistan

 Tel: +92 (0)42 3758 7140 Ext. 128

 Mobile: +92 (0)345 831 8769

 Fax: +92 (0)42 3758 0027

 Email: ali.ah...@panasiangroup.com



 www.ebusiness-pg.com

 www.panasiangroup.com

 Confidentiality: This e-mail and any attachments may be confidential
 and/or privileged. If you are not a named recipient, please notify the
 sender immediately and do not disclose the contents to another person
 use it for any purpose or store or copy the information in any medium.
 Internet communications cannot be guaranteed to be timely, secure, error
 or virus-free. We do not accept liability for any errors or omissions.

>>>
>>>
>>> --
>>> Tyler Hobbs
>>> Software Engineer, DataStax
>>> Maintainer of the pycassa Cassandra Python client library
>>>
>>>
>>>
>>
>>
>
>
>

Re: multithreaded compaction

2011-04-26 Thread Terje Marthinussen

To be honest, this started after feeding data to cassandra for a while with
compaction disabled (sort of a test case).

When I enabled it... boom... a spectacular process with 2000% CPU usage
(please note... there is compression in cassandra in this system).

This system actually has SSDs, so when throttled a bit, the I/O is really
not a problem, but I doubt that an HDD-based system would have managed to
keep up.

I agree, this is hopefully something that does not normally happen, but then
again, some protection against Murphy's law is always good.

Thanks!
Terje

On Tue, Apr 26, 2011 at 4:35 PM, Sylvain Lebresne wrote:

> On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen
>  wrote:
> > Hi,
> > I was testing the multithreaded compactions and with 2x6 cores (24 with
> HT)
> > it does seem a bit crazy with 24 compactions running concurrently.
> > It is probably not very good in terms of random I/O.
>
> It does seem a bit overkill. However, I'm slightly curious how you
> ended up with 24 parallel compactions; more precisely, how did you end
> up with enough sstables to trigger 24 compactions? Was that done on
> purpose for testing's sake, or did you see that in a real situation?
>
> I'm asking because in a 'real' situation, given that compactions are
> triggered only if there is some number of files to compact, and provided
> the cluster is correctly provisioned, I wouldn't expect the number of
> parallel compactions to jump to such numbers (one of the goals of
> multi-threaded compaction was to make sure we never end up accumulating
> lots of un-compacted sstables). Anyway, I get your point, just wondering
> if that was a real situation.
>
> > As such, I think I agree with the argument in 2191 that there should be a
> > config option for this.
> > Probably a default that is dynamic with 1 thread per column family +2 or
> 3
> > thread for parallel compactions outside of that could be good.
> > Any other opinions?
>
> Multi-threaded compaction is optional and compaction throttling is
> supposed to mitigate it. However, I do agree that too many compactions
> may be a bad use of resources because of random IO, even if correctly
> throttled. I think it's missing a configuration option
> "concurrent_compactions", like there is a "concurrent_writes|reads".
> For that, I have created
>  https://issues.apache.org/jira/browse/CASSANDRA-2558
>
> > I guess the compaction thread pool should also show up in tpstats?
>
> Yes it should ... and it will ... eventually :)
>
> Thanks for the feedback.
>
> --
> Sylvain
>

Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-26 Thread Sasha Dolgy

When cassandra is not running:  netstat -an | grep 8080

anything showing up?
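
A scripted equivalent of the telnet/netstat check, as a minimal sketch: it only tests whether something accepts TCP connections on the port, not whether JMX/RMI then works. The function name `port_is_open` is hypothetical; 8080 is the JMX port discussed in this thread.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Run this with cassandra stopped: a True result suggests another
    # process is already holding the port.
    print("127.0.0.1:8080 open:", port_is_open("127.0.0.1", 8080))
```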
On Apr 26, 2011 10:04 AM, "Ali Ahsan"  wrote:
> Thanks for quick reply but i am not using ec2,Its a physical machine.
> On 04/26/2011 12:59 PM, Sasha Dolgy wrote:
>> on ec2, we had to change the default jmx port ... but that's with
>> 0.7.x. instead of 8080 we opted for 9090 and that solved our
>> problems.
>> -sd
>>
>> On Tue, Apr 26, 2011 at 9:57 AM, Ali Ahsan
wrote:
>>> I have added in cassandra.in.sh -Djava.rmi.server.hostname=myipaddress
but
>>> still getting same error any ideas ?
>>>
>>> On 04/17/2011 02:24 AM, Tyler Hobbs wrote:
>>>
>>> http://wiki.apache.org/cassandra/JmxGotchas
>>>
>>> On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan
>>> wrote:
 Any one have solution for this problem ?


 On 04/13/2011 05:20 PM, Ali Ahsan wrote:
> I am not running any firewall this physical machine not EC2,I can
telnet
> to port 8080
>
>
> telnet 127.0.0.1 8080
> Trying 127.0.0.1...
> Connected to localhost.localdomain (127.0.0.1).
> Escape character is '^]'.
>
>
>


>>>
>>>
>>> --
>>> Tyler Hobbs
>>> Software Engineer, DataStax
>>> Maintainer of the pycassa Cassandra Python client library
>>>
>>>
>>>
>>
>>
>
>
>

Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-26 Thread Ali Ahsan


Thanks for the quick reply, but I am not using EC2; it's a physical machine.
On 04/26/2011 12:59 PM, Sasha Dolgy wrote:

on ec2, we had to change the default jmx port ... but that's with
0.7.x.  instead of 8080 we opted for 9090 and that solved our
problems.
-sd

On Tue, Apr 26, 2011 at 9:57 AM, Ali Ahsan  wrote:

I have added in cassandra.in.sh -Djava.rmi.server.hostname=myipaddress but
still getting same error any ideas ?

On 04/17/2011 02:24 AM, Tyler Hobbs wrote:

http://wiki.apache.org/cassandra/JmxGotchas

On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan
wrote:

Any one have solution for this problem ?


On 04/13/2011 05:20 PM, Ali Ahsan wrote:

I am not running any firewall this physical machine not EC2,I can telnet
to port 8080


  telnet 127.0.0.1 8080
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.









--
Tyler Hobbs
Software Engineer, DataStax
Maintainer of the pycassa Cassandra Python client library









--
S.Ali Ahsan

Senior System Engineer

e-Business (Pvt) Ltd

49-C Jail Road, Lahore, P.O. Box 676
Lahore 54000, Pakistan

Tel: +92 (0)42 3758 7140 Ext. 128

Mobile: +92 (0)345 831 8769

Fax: +92 (0)42 3758 0027

Email: ali.ah...@panasiangroup.com



www.ebusiness-pg.com

www.panasiangroup.com

Confidentiality: This e-mail and any attachments may be confidential
and/or privileged. If you are not a named recipient, please notify the
sender immediately and do not disclose the contents to another person
use it for any purpose or store or copy the information in any medium.
Internet communications cannot be guaranteed to be timely, secure, error
or virus-free. We do not accept liability for any errors or omissions.

Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-26 Thread Sasha Dolgy

on ec2, we had to change the default jmx port ... but that's with
0.7.x.  instead of 8080 we opted for 9090 and that solved our
problems.
-sd

On Tue, Apr 26, 2011 at 9:57 AM, Ali Ahsan  wrote:
> I have added in cassandra.in.sh -Djava.rmi.server.hostname=myipaddress but
> still getting same error any ideas ?
>
> On 04/17/2011 02:24 AM, Tyler Hobbs wrote:
>
> http://wiki.apache.org/cassandra/JmxGotchas
>
> On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan 
> wrote:
>>
>> Any one have solution for this problem ?
>>
>>
>> On 04/13/2011 05:20 PM, Ali Ahsan wrote:
>>>
>>> I am not running any firewall this physical machine not EC2,I can telnet
>>> to port 8080
>>>
>>>
>>>  telnet 127.0.0.1 8080
>>> Trying 127.0.0.1...
>>> Connected to localhost.localdomain (127.0.0.1).
>>> Escape character is '^]'.
>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
> Tyler Hobbs
> Software Engineer, DataStax
> Maintainer of the pycassa Cassandra Python client library
>
>
>



-- 
Sasha Dolgy
sasha.do...@gmail.com

Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-26 Thread Ali Ahsan

I have added -Djava.rmi.server.hostname=myipaddress in cassandra.in.sh
but I am still getting the same error. Any ideas?
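
The telnet check quoted further down in this thread can also be scripted. The following is only a minimal Python sketch, not part of Cassandra: the function name port_is_open is hypothetical, and 127.0.0.1:8080 is simply the address and port used in the telnet test. As far as I understand, a successful connection here only shows that the JMX registry port is listening; it does not show that the host name advertised through java.rmi.server.hostname is reachable.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper: roughly what `telnet host port` tells you,
    and nothing more.
    """
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Same target as the telnet test in this thread.
    print(port_is_open("127.0.0.1", 8080))
```

If this returns True while nodetool still reports "Connection refused", the problem is probably in the address the RMI layer hands back rather than in the port itself.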


On 04/17/2011 02:24 AM, Tyler Hobbs wrote:

http://wiki.apache.org/cassandra/JmxGotchas

On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan wrote:


Any one have solution for this problem ?



On 04/13/2011 05:20 PM, Ali Ahsan wrote:

I am not running any firewall this physical machine not EC2,I
can telnet to port 8080


 telnet 127.0.0.1 8080
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.









--
Tyler Hobbs
Software Engineer, DataStax
Maintainer of the pycassa Cassandra Python client library





--
S.Ali Ahsan


Re: multithreaded compaction

2011-04-26 Thread Sylvain Lebresne

FYI, I've also created
 https://issues.apache.org/jira/browse/CASSANDRA-2559
as another approach to the problem.

--
Sylvain

On Tue, Apr 26, 2011 at 9:35 AM, Sylvain Lebresne  wrote:
> On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen
>  wrote:
>> Hi,
>> I was testing the multithreaded compactions and with 2x6 cores (24 with HT)
>> it does seem a bit crazy with 24 compactions running concurrently.
>> It is probably not very good in terms of random I/O.
>
> It does seem a bit overkill. However, I'm slightly curious how you ended
> up with 24 parallel compactions; more precisely, how did you end up with
> enough sstables to trigger 24 compactions? Was that done on purpose for
> testing's sake, or did you see that in a real situation?
>
> I'm asking because in a 'real' situation, given that compactions are
> triggered only if there is some number of files to compact, and provided
> the cluster is correctly provisioned, I wouldn't expect the number of
> parallel compactions to jump to such numbers (one of the goals of
> multi-threaded compaction was to make sure we never end up accumulating
> lots of un-compacted sstables). Anyway, I get your point, just wondering
> if that was a real situation.
>
>> As such, I think I agree with the argument in 2191 that there should be a
>> config option for this.
>> Probably a default that is dynamic with 1 thread per column family +2 or 3
>> thread for parallel compactions outside of that could be good.
>> Any other opinions?
>
> Multi-threaded compaction is optional and compaction throttling is
> supposed to mitigate it. However, I do agree that too many compactions
> may be a bad use of resources because of random IO, even if correctly
> throttled. I think it's missing a configuration option
> "concurrent_compactions", like there is a "concurrent_writes|reads".
> For that, I have created
>  https://issues.apache.org/jira/browse/CASSANDRA-2558
>
>> I guess the compaction thread pool should also show up in tpstats?
>
> Yes it should ... and it will ... eventually :)
>
> Thanks for the feedback.
>
> --
> Sylvain
>

Re: multithreaded compaction

2011-04-26 Thread Sylvain Lebresne

On Tue, Apr 26, 2011 at 9:01 AM, Terje Marthinussen
 wrote:
> Hi,
> I was testing the multithreaded compactions and with 2x6 cores (24 with HT)
> it does seem a bit crazy with 24 compactions running concurrently.
> It is probably not very good in terms of random I/O.

It does seem a bit overkill. However, I'm slightly curious how you ended
up with 24 parallel compactions; more precisely, how did you end up with
enough sstables to trigger 24 compactions? Was that done on purpose for
testing's sake, or did you see that in a real situation?

I'm asking because in a 'real' situation, given that compactions are
triggered only if there is some number of files to compact, and provided
the cluster is correctly provisioned, I wouldn't expect the number of
parallel compactions to jump to such numbers (one of the goals of
multi-threaded compaction was to make sure we never end up accumulating
lots of un-compacted sstables). Anyway, I get your point, just wondering
if that was a real situation.

> As such, I think I agree with the argument in 2191 that there should be a
> config option for this.
> Probably a default that is dynamic with 1 thread per column family +2 or 3
> thread for parallel compactions outside of that could be good.
> Any other opinions?

Multi-threaded compaction is optional and compaction throttling is
supposed to mitigate it. However, I do agree that too many compactions
may be a bad use of resources because of random IO, even if correctly
throttled. I think it's missing a configuration option
"concurrent_compactions", like there is a "concurrent_writes|reads".
For that, I have created
 https://issues.apache.org/jira/browse/CASSANDRA-2558
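
To illustrate what such a cap would do, here is a rough Python sketch (not Cassandra code). It assumes a hypothetical setting named concurrent_compactions, as suggested above, and uses a sleep as a stand-in for a compaction; it runs the tasks on a bounded thread pool and reports the highest number seen running at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_compactions(num_tasks, concurrent_compactions):
    """Run num_tasks dummy compactions and return the peak parallelism.

    Hypothetical illustration: the pool size plays the role of the
    proposed concurrent_compactions option.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def compact(_task_id):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for the actual compaction work
        with lock:
            running -= 1

    # The executor never runs more than max_workers tasks at a time.
    with ThreadPoolExecutor(max_workers=concurrent_compactions) as pool:
        list(pool.map(compact, range(num_tasks)))
    return peak


if __name__ == "__main__":
    # 24 pending compactions, but at most 4 run concurrently.
    print(run_compactions(24, 4))
```

However many compactions are pending, the peak never exceeds the configured limit; the remaining ones simply wait in the queue.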

> I guess the compaction thread pool should also show up in tpstats?

Yes it should ... and it will ... eventually :)

Thanks for the feedback.

--
Sylvain

Re: new node can't find seed node

2011-04-26 Thread Sasha Dolgy

server2 should be pointing to server1 once server1 is online and its
logs show it is accepting Thrift connections. In what you've pasted
below, server2 is pointing at server2, not server1.
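
A quick way to sanity-check this kind of setup is to see whether all nodes share at least one seed. The Python sketch below is only an illustration under that assumption: common_seeds is a hypothetical helper, not a Cassandra tool, and the dictionaries just restate the seeds column from the table below.

```python
def common_seeds(seeds_by_node):
    """Return the set of seeds that every node lists.

    Hypothetical helper: an empty result means the nodes have no
    shared seed through which to discover each other.
    """
    seed_sets = [set(seeds) for seeds in seeds_by_node.values()]
    if not seed_sets:
        return set()
    return set.intersection(*seed_sets)


if __name__ == "__main__":
    # As pasted: each node only lists itself as a seed.
    pasted = {"server1": ["server1"], "server2": ["server2"]}
    print(common_seeds(pasted))

    # Suggested change: server2 points at server1.
    fixed = {"server1": ["server1"], "server2": ["server1"]}
    print(common_seeds(fixed))
```

With the pasted configuration the result is empty; once server2 lists server1 as its seed, server1 is the shared seed.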

-sd

On Tue, Apr 26, 2011 at 9:19 AM, Udit Khandelwal  wrote:
> Boris Spasojevic  epfl.ch> writes:
>
>
> i have 2 machines , one windows and other linux. i am facing this issue. Could
> you please tell me how you solved it.
> Machine 1   OS listen_address thrift_address  seeds      JMX_PORT
>  server1    Xp  server1        server1         server1    8080
>  server2    linux server2      server2         server2    8082
>
>
> thanks in advance
> udit
>
>



-- 
Sasha Dolgy
sasha.do...@gmail.com

Re: new node can't find seed node

2011-04-26 Thread Udit Khandelwal

Boris Spasojevic  epfl.ch> writes:


I have 2 machines, one Windows and the other Linux, and I am facing this
issue. Could you please tell me how you solved it?
Machine  OS     listen_address  thrift_address  seeds    JMX_PORT
server1  Xp     server1         server1         server1  8080
server2  linux  server2         server2         server2  8082


thanks in advance 
udit

Re: decommissioning a wrong node

2011-04-26 Thread aaron morton

There is the fabled

java.rmi.server.hostname 

http://blog.reactive.org/2011/02/connecting-to-cassandra-jmx-via-ssh.html
http://download.oracle.com/javase/1.4.2/docs/guide/rmi/javarmiproperties.html

Not sure if anyone has got it working correctly. 

Aaron


On 25 Apr 2011, at 03:51, Edward Capriolo wrote:

> On Sun, Apr 24, 2011 at 6:07 AM, Peter Schuller
>  wrote:
>>> Is there a way to bind the JMX to a specified IP only? It seems there's
>>> just 'com.sun.management.jmxremote.port' and no way to specify a host.
>> 
>> I don't think so, or at least past googling indicated several people
>> wanting to do this but not finding answers. It's extremely annoying;
>> e.g. the common case of wanting to listen on 127.0.0.1 but no public
>> interfaces...
>> 
>> --
>> / Peter Schuller
>> 
> 
> It cannot be done. There is a Sun bug ID on this somewhere. The good
> news is that even if JMX binds to an IP you do not want it to, the
> redirect by hostname and dynamic port choosing in the JMX protocol
> will probably cause it not to work: "security by complexity".

multithreaded compaction

2011-04-26 Thread Terje Marthinussen

Hi,

I was testing the multithreaded compactions and with 2x6 cores (24 with HT)
it does seem a bit crazy with 24 compactions running concurrently.
It is probably not very good in terms of random I/O.

As such, I think I agree with the argument in 2191 that there should be a
config option for this.
Probably a default that is dynamic, with 1 thread per column family plus 2
or 3 threads for parallel compactions outside of that, could be good.
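
To make that proposed default concrete, here is a small Python sketch of one possible reading of it. The function name and the extra_threads parameter are hypothetical; nothing like this exists in Cassandra as far as this thread shows.

```python
def default_compaction_threads(num_column_families, extra_threads=2):
    """One reading of the proposal: 1 thread per column family plus
    a few extra threads (2 or 3) for additional parallel compactions.

    Hypothetical helper for illustration only.
    """
    if num_column_families < 0 or extra_threads < 0:
        raise ValueError("counts must be non-negative")
    return num_column_families + extra_threads


if __name__ == "__main__":
    # With 5 column families and 2 extra threads the default is 7,
    # independent of how many cores the machine has.
    print(default_compaction_threads(5))
```

The point of the heuristic is that the limit follows the schema rather than the core count, so a 24-thread machine with a handful of column families would not start 24 compactions.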

Any other opinions?

I guess the compaction thread pool should also show up in tpstats?

Regards,
Terje
