Re: How to insert a row with a TimeUUIDType column in C++
http://php.net/manual/en/function.pack.php 2010/5/31 刘大伟 liudawei...@gmail.com: How can I get a 16-byte TimeUUID? string(36) 4698cc00-6d2f-11df-8c7f-9f342400a648 TException: UUIDs must be exactly 16 bytes Error: On Fri, Apr 23, 2010 at 5:59 PM, Olivier Rosello orose...@corp.free.fr wrote: Here is my test code:

ColumnPath new_col;
new_col.__isset.column = true; /* this is required! */
new_col.column_family.assign("Incoming");
new_col.column.assign("1968ec4a-2a73-11df-9aca-00012e27a270");
client.insert("MyKeyspace", "somekey", new_col, "Random Value", time(NULL), ONE);

I didn't find anything in the C++ Cassandra/Thrift API for specifying the 16 TimeUUID bytes as the column name. The ColumnPath type has only a string field for the column name. With a string like the one this example shows, the TimeUUID is a 36-character string, and this code throws an exception: UUIDs must be exactly 16 bytes. I didn't find a function like client.insert_timeuuid_column which would convert the column name to a uint8_t[16]... or anything else that could help me. Cheers, Olivier -- Olivier -- Striving with persistence, david.liu -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
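The pack.php link above is the PHP answer. For reference, a minimal Java sketch of the same conversion, parsing the 36-character hex form into the raw 16 bytes Thrift expects; the class and method names here are illustrative, not part of any Cassandra API:

import java.nio.ByteBuffer;
import java.util.UUID;

public class TimeUuidBytes {
    // Parse "4698cc00-6d2f-11df-8c7f-9f342400a648" into the raw 16 bytes
    // a TimeUUIDType column name must contain on the wire.
    public static byte[] toBytes(String uuidString) {
        UUID uuid = UUID.fromString(uuidString);
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(toBytes("4698cc00-6d2f-11df-8c7f-9f342400a648").length); // 16
    }
}

In the C++ client the same 16 bytes, however produced, would be assigned to new_col.column in place of the 36-character string.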
Re: Error during startup
Created https://issues.apache.org/jira/browse/CASSANDRA-1146 On Tue, Jun 1, 2010 at 12:46 AM, David Boxenhorn da...@lookin2.com wrote: 0.6.2 On Mon, May 31, 2010 at 9:50 PM, Jonathan Ellis jbel...@gmail.com wrote: What version of Cassandra was this? On Sun, May 30, 2010 at 8:49 AM, David Boxenhorn da...@lookin2.com wrote: I deleted the system/LocationInfo files, and now everything works. Yay! (...what happened?) On Sun, May 30, 2010 at 4:18 PM, David Boxenhorn da...@lookin2.com wrote: I'm getting an "Expected both token and generation columns; found ColumnFamily" error during startup. Can anyone tell me what it is? Details below.

Starting Cassandra Server
Listening for transport dt_socket at address:
INFO 16:14:33,459 Auto DiskAccessMode determined to be standard
INFO 16:14:33,615 Sampling index for C:\var\lib\cassandra\data\system\LocationInfo-1-Data.db
INFO 16:14:33,631 Removing orphan C:\var\lib\cassandra\data\Lookin2\Users-tmp-27-Index.db
INFO 16:14:33,631 Sampling index for C:\var\lib\cassandra\data\Lookin2\Users-19-Data.db
INFO 16:14:33,662 Sampling index for C:\var\lib\cassandra\data\Lookin2\Users-18-Data.db
INFO 16:14:33,818 Sampling index for C:\var\lib\cassandra\data\Lookin2\Users-20-Data.db
INFO 16:14:33,850 Sampling index for C:\var\lib\cassandra\data\Lookin2\Users-21-Data.db
INFO 16:14:33,865 Sampling index for C:\var\lib\cassandra\data\Lookin2\Users-22-Data.db
INFO 16:14:33,881 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestIdx-580-Data.db
INFO 16:14:33,896 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestIdx-672-Data.db
INFO 16:14:33,912 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestIdx-681-Data.db
INFO 16:14:33,912 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestIdx-691-Data.db
INFO 16:14:33,928 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestIdx-696-Data.db
INFO 16:14:33,943 Sampling index for C:\var\lib\cassandra\data\Lookin2\Attractions-17-Data.db
INFO 16:14:34,006 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestTrendsetterIdx-5-Data.db
INFO 16:14:34,006 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestTrendsetterIdx-6-Data.db
INFO 16:14:34,021 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-29-Data.db
INFO 16:14:34,350 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-51-Data.db
INFO 16:14:34,693 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-72-Data.db
INFO 16:14:35,021 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-77-Data.db
INFO 16:14:35,225 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-78-Data.db
INFO 16:14:35,350 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-79-Data.db
INFO 16:14:35,459 Sampling index for C:\var\lib\cassandra\data\Lookin2\GeoSiteInterestPeerGroupIdx-80-Data.db
INFO 16:14:35,459 Sampling index for C:\var\lib\cassandra\data\Lookin2\Taxonomy-1-Data.db
INFO 16:14:35,475 Sampling index for C:\var\lib\cassandra\data\Lookin2\Taxonomy-2-Data.db
INFO 16:14:35,475 Sampling index for C:\var\lib\cassandra\data\Lookin2\Content-30-Data.db
INFO 16:14:35,631 Sampling index for C:\var\lib\cassandra\data\Lookin2\Content-35-Data.db
INFO 16:14:35,771 Sampling index for C:\var\lib\cassandra\data\Lookin2\Content-40-Data.db
INFO 16:14:35,959 Compacting [org.apache.cassandra.io.SSTableReader(path='C:\var\lib\cassandra\data\Lookin2\Users-19-Data.db'), org.apache.cassandra.io.SSTableReader(path='C:\var\lib\cassandra\data\Lookin2\Users-20-Data.db'), org.apache.cassandra.io.SSTableReader(path='C:\var\lib\cassandra\data\Lookin2\Users-21-Data.db'), org.apache.cassandra.io.SSTableReader(path='C:\var\lib\cassandra\data\Lookin2\Users-22-Data.db')]
ERROR 16:14:35,975 Exception encountered during startup.
java.lang.RuntimeException: Expected both token and generation columns; found ColumnFamily(LocationInfo [Generation:false:4...@4,])
        at org.apache.cassandra.db.SystemTable.initMetadata(SystemTable.java:159)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:305)
        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
Exception encountered during startup.

-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Can't get data after building cluster
To elaborate: If you manage to screw things up to where it thinks a node has data, but it does not (adding a node without bootstrap would do this, for instance, which is probably what you did), at most, data in the token range assigned to that node will be affected. On Tue, Jun 1, 2010 at 12:45 AM, David Boxenhorn da...@lookin2.com wrote: You say no, but that is exactly what I just observed. Can I have some more explanation? To recap: I added a server to my cluster. It had some junk in the system/LocationInfo files from previous, unsuccessful attempts to add the server to the cluster. (They were unsuccessful because I hadn't opened the port on that computer.) When I finally succeeded in adding the 2nd server, the 1st server started returning null when I tried to get data using the CLI. I stopped the 2nd server, deleted the files in system, restarted, and everything worked. I'm afraid that this, or some similar scenario, will do the same after I go live. How can I protect myself? On Mon, May 31, 2010 at 10:10 PM, Jonathan Ellis jbel...@gmail.com wrote: No. On Mon, May 31, 2010 at 10:47 AM, David Boxenhorn da...@lookin2.com wrote: So this means that I can take my entire cluster offline if I make a mistake adding a new server??? Yikes! On Mon, May 31, 2010 at 6:41 PM, David Boxenhorn da...@lookin2.com wrote: OK. Got it working. I had some data in the 2nd server from previous failed attempts at hooking up to the cluster. When I deleted that data and tried again, it said "bootstrapping" and my 1st server started working again. On Mon, May 31, 2010 at 4:50 PM, David Boxenhorn da...@lookin2.com wrote: I am trying to get a cluster up and working for the first time. I got one server up and running, with lots of data on it, which I can see with the CLI. I added my 2nd server, and they seem to recognize each other. Now I can't see my data with the CLI. I do a get and it returns null. The data files seem to be intact. What happened??? How can I fix it? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: searching keys of the form substring*
Thanks Vineet for replying, but I am not able to understand how we can use variable substitution here. On Mon, May 31, 2010 at 4:42 PM, vd vineetdan...@gmail.com wrote: Hi Sagar You can use variable substitution. ___ Vineet Daniel ___ Let your email find you On Mon, May 31, 2010 at 3:44 PM, Sagar Agrawal sna...@gmail.com wrote: Hi folks, I want to fetch all those records from my column family such that the key starts with a specified string... e.g. suppose I have a CF keyed on full names (first name + last name) of persons; now I want to fetch all those records whose first name is 'John'. Right now, I am using OPP and KeyRange in the following way:

KeyRange keyRange = new KeyRange();
keyRange.setStart_key("John");
keyRange.setEnd_key("Joho");

but this is sort of hard-coding. Can anyone suggest a better way to achieve this? I would be really grateful... thank you.
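For the prefix query above, the end key can be derived instead of hard-coded: keep the prefix and bump its last character, giving the smallest key greater than everything sharing the prefix. A minimal Java sketch (the helper name is mine, not from any library):

// "John" -> "Joho": under OPP, keys in [prefix, endKeyFor(prefix)) share the prefix
static String endKeyFor(String prefix) {
    int last = prefix.length() - 1;
    return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
}

With it, the range becomes keyRange.setStart_key("John"); keyRange.setEnd_key(endKeyFor("John"));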
writing speed test
Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Shuai Yuan 袁帅 Supertool Corp. 北京学之途网络科技有限公司 13810436859 yuan-sh...@yuan-shuai.info
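Since the thread is about write throughput, for reference this is what batching looks like at the Thrift layer, one round trip for many columns instead of one insert per column; a minimal Java sketch against the 0.6-era API (keyspace, column family, row key, and counts are made-up example values):

import java.util.*;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class BatchWriter {
    public static void main(String[] args) throws Exception {
        TSocket socket = new TSocket("localhost", 9160);
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));
        socket.open();

        long ts = System.currentTimeMillis();
        List<Mutation> mutations = new ArrayList<Mutation>();
        for (int i = 0; i < 100; i++) {
            ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
            cosc.setColumn(new Column(("col" + i).getBytes(), "value".getBytes(), ts));
            Mutation m = new Mutation();
            m.setColumn_or_supercolumn(cosc);
            mutations.add(m);
        }
        // 100 columns for one row in a single call
        Map<String, List<Mutation>> byCf = Collections.singletonMap("Standard1", mutations);
        Map<String, Map<String, List<Mutation>>> byKey = Collections.singletonMap("row1", byCf);
        client.batch_mutate("Keyspace1", byKey, ConsistencyLevel.ONE);
        socket.close();
    }
}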
Re: nodetool cleanup isn't cleaning up?
ok, let me try and translate your answer ;) Are you saying that the data that was left on the node is non-primary replicas of rows from the time before the move? So this implies that when a node moves in the ring, it will affect distribution of:
- new keys
- old keys' primary node
but will not affect distribution of old keys' non-primary replicas. If so, I still don't understand something... I would expect even the non-primary replicas of keys to be moved, since if they don't move, how would they be found? I mean, upon reads the serving node should not care about whether the row is new or old; it should have a consistent and global mapping of tokens. So I guess this ruins my theory... What did you mean then? Is this deletion of non-primary replicated data? How does the replication factor affect the load on the moved host then? On Tue, Jun 1, 2010 at 1:19 AM, Jonathan Ellis jbel...@gmail.com wrote: well, there you are then. On Mon, May 31, 2010 at 2:34 PM, Ran Tavory ran...@gmail.com wrote: yes, replication factor = 2 On Mon, May 31, 2010 at 10:07 PM, Jonathan Ellis jbel...@gmail.com wrote: you have replication factor 1 ? On Mon, May 31, 2010 at 7:23 AM, Ran Tavory ran...@gmail.com wrote: I hope I understand nodetool cleanup correctly - it should clean up all data that does not (currently) belong to this node. If so, I think it might not be working correctly. Look at nodes 192.168.252.124 and 192.168.252.99 below:

192.168.252.99   Up  279.35 MB  3544607988759775661076818827414252202    |--|
192.168.252.124  Up  167.23 MB  56713727820156410577229101238628035242   |  ^
192.168.252.125  Up  82.91 MB   85070591730234615865843651857942052863   v  |
192.168.254.57   Up  366.6 MB   113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  88.44 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  88.45 MB   170141183460469231731687303715884105727  |--|

I wanted 124 to take all the load from 99. So I issued a move command. $ nodetool -h cass99 -p 9004 move 56713727820156410577229101238628035243 This command tells 99 to take the space b/w (56713727820156410577229101238628035242, 56713727820156410577229101238628035243], which is basically just one item in the token space, almost nothing... I wanted it to be very slim (just playing around). So, next I get this:

192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |--|
192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |  ^
192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v  |
192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |--|

The tokens are correct, but it seems that 99 still has a lot of data. Why? OK, that might be b/c it didn't delete its moved data. So next I issued a nodetool cleanup, which should have taken care of that. Only that it didn't; node 99 still has 352 MB of data. Why? So, you know what, I waited for 1h. Still no good, data wasn't cleaned up. I restarted the server. Still, data wasn't cleaned up... I issued a cleanup again... still no good... what's up with this node? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
access a multinode cluster
If you have a multinode cluster, which node should you connect to to fetch data? Is there a master node in the cluster which accepts data requests and dispatches them? Or is every node in the cluster completely the same? If all nodes are the same in a cluster, should the client connect to a random node to reduce Cassandra's load? -- Location:
Re: access a multinode cluster
On 2010-06-01 at 15:00 +0800, huajun qi wrote:
> If you have a multinode cluster, which node should you connect to to fetch data?
Any one.
> Is there a master node in the cluster which accepts data requests and dispatches them? Or is every node in the cluster completely the same?
No master. All the same.
> If all nodes are the same in a cluster, should the client connect to a random node to reduce Cassandra's load?
I think so. But I guess if you're sure where the data is, you can connect to the target machine directly. -- Location: Kevin Yuan
Re: access a multinode cluster
Thank you!
Re: Administration Memory for Noobs. (GC for ConcurrentMarkSweep ?)
xavier manach xav at tekio.org writes: Hi. I'm looking for information on basic memory tuning in Cassandra. My situation: I started to test large imports of data into Cassandra 0.6.1. My first import worked fine: 100 million rows in 2 hours (around 14,000 inserted rows per second). My second one is slower with the same script in another column family: around 500 inserted rows per second... I don't understand why I have a lot of GC for ConcurrentMarkSweep. [GC for ConcurrentMarkSweep: 3437 ms, 104971488 reclaimed leaving 986519328 used; max is 1211170816.] (The max didn't move; what is this value 1211170816?) I think the GC happens when the inserts are slow. Do the inserts stop while the GC runs? My machine has 66M of RAM, and the java process only uses around 1.8%. How can I optimise the use of memory? Is there a guideline for best performance? Thanks. You may run out of memory. Cassandra stores some information about those 100M rows you just inserted in RAM. By default Cassandra is configured to take up to 1GB of RAM. You can configure more memory for Cassandra by editing bin/cassandra.in.sh. Look there for -Xmx1G and change it to your taste.
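For reference, a sketch of the change Oleg describes, assuming the stock JVM_OPTS layout of bin/cassandra.in.sh in the 0.6 line (4G is an arbitrary example value):

# bin/cassandra.in.sh: raise the heap ceiling from the 1GB default
JVM_OPTS="$JVM_OPTS -Xms1G -Xmx4G"

Note that the memtable thresholds and caches configured in storage-conf.xml are allocated out of this same heap, so the ceiling has to accommodate them.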
Re: searching keys of the form substring*
As I told you on the IRC channel, don't go for shortcuts... learn Java first. ___ Vineet Daniel ___ Let your email find you On Tue, Jun 1, 2010 at 11:47 AM, Sagar Agrawal sna...@gmail.com wrote: Thanks Vineet for replying, but I am not able to understand how we can use variable substitution here. On Mon, May 31, 2010 at 4:42 PM, vd vineetdan...@gmail.com wrote: Hi Sagar You can use variable substitution. ___ Vineet Daniel ___ Let your email find you On Mon, May 31, 2010 at 3:44 PM, Sagar Agrawal sna...@gmail.com wrote: Hi folks, I want to fetch all those records from my column family such that the key starts with a specified string... e.g. suppose I have a CF keyed on full names (first name + last name) of persons; now I want to fetch all those records whose first name is 'John'. Right now, I am using OPP and KeyRange in the following way: KeyRange keyRange = new KeyRange(); keyRange.setStart_key("John"); keyRange.setEnd_key("Joho"); but this is sort of hard-coding. Can anyone suggest a better way to achieve this? I would be really grateful... thank you.
Re: Administration Memory for Noobs. (GC for ConcurrentMarkSweep ?)
Perfect :) I'll test it. I hadn't opened this file before; I thought the configuration was only in the conf folder. I am not a Java specialist, so I will look up the meaning of the JVM parameters. For now, I'm reading this page to understand the other JVM options: http://java.sun.com/performance/reference/whitepapers/tuning.html Thanks Oleg. 2010/6/1 Oleg Anastasjev olega...@gmail.com xavier manach xav at tekio.org writes: Hi. I'm looking for information on basic memory tuning in Cassandra. My situation: I started to test large imports of data into Cassandra 0.6.1. My first import worked fine: 100 million rows in 2 hours (around 14,000 inserted rows per second). My second one is slower with the same script in another column family: around 500 inserted rows per second... I don't understand why I have a lot of GC for ConcurrentMarkSweep. [GC for ConcurrentMarkSweep: 3437 ms, 104971488 reclaimed leaving 986519328 used; max is 1211170816.] (The max didn't move; what is this value 1211170816?) I think the GC happens when the inserts are slow. Do the inserts stop while the GC runs? My machine has 66M of RAM, and the java process only uses around 1.8%. How can I optimise the use of memory? Is there a guideline for best performance? Thanks. You may run out of memory. Cassandra stores some information about those 100M rows you just inserted in RAM. By default Cassandra is configured to take up to 1GB of RAM. You can configure more memory for Cassandra by editing bin/cassandra.in.sh. Look there for -Xmx1G and change it to your taste.
question about class SlicePredicate
Hi all, I don't quite understand the usage of 'class SlicePredicate' when trying to retrieve a ranged slice. How should it be initialized? Thanks! -- Kevin Yuan www.yuan-shuai.info
Re: Algorithm for distributing key of Cassandra
On Mon, May 31, 2010 at 8:50 PM, Jonathan Ellis jbel...@gmail.com wrote: Doesn't ring a bell. Maybe if you included the link to which you refer? I guess this is the related post http://spyced.blogspot.com/2009/05/consistent-hashing-vs-order-preserving.html though I believe the original poster misphrased or misread (the hack in question was assigning multiple tokens to nodes for load balancing, which Cassandra does not do). The two links in the second paragraph are broken; I remember this because I had been curious to read them too :)
Re: question about class SlicePredicate
It needs a SliceRange. For example:

SliceRange range = new SliceRange();
range.setStart("".getBytes());
range.setFinish("".getBytes());
range.setReversed(true);
range.setCount(20);
SlicePredicate sp = new SlicePredicate();
sp.setSlice_range(range);
client.get_slice(KEYSPACE, KEY, columnParent, sp, ConsistencyLevel.ONE);

2010/6/1 Shuai Yuan yuansh...@supertool.net.cn Hi all, I don't quite understand the usage of 'class SlicePredicate' when trying to retrieve a ranged slice. How should it be initialized? Thanks! -- Kevin Yuan www.yuan-shuai.info
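If you want specific named columns rather than a contiguous range, the predicate's other field is column_names; a short fragment in the same style as the example above (the column names are made up):

import java.util.Arrays;

SlicePredicate byName = new SlicePredicate();
// fetch exactly these two columns instead of a slice range
byName.setColumn_names(Arrays.asList("name".getBytes(), "email".getBytes()));

A SlicePredicate should have exactly one of slice_range or column_names set.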
Re: question about class SlicePredicate
Does it work whatever the chosen partitioner, or only for OrderPreservingPartitioner? On Tuesday, June 1, 2010, Eric Yu suc...@gmail.com wrote: It needs a SliceRange. For example: SliceRange range = new SliceRange(); range.setStart("".getBytes()); range.setFinish("".getBytes()); range.setReversed(true); range.setCount(20); SlicePredicate sp = new SlicePredicate(); sp.setSlice_range(range); client.get_slice(KEYSPACE, KEY, columnParent, sp, ConsistencyLevel.ONE); 2010/6/1 Shuai Yuan yuansh...@supertool.net.cn Hi all, I don't quite understand the usage of 'class SlicePredicate' when trying to retrieve a ranged slice. How should it be initialized? Thanks! -- Kevin Yuan www.yuan-shuai.info -- Olivier Mallassi OCTO Technology 50, Avenue des Champs-Elysées 75008 Paris Mobile: (33) 6 28 70 26 61 Tél: (33) 1 58 56 10 00 Fax: (33) 1 58 56 10 01 http://www.octo.com Octo Talks! http://blog.octo.com
Skipping corrupted rows when doing compaction
Hi, Is there a way to skip corrupted rows when doing compaction? We are currently deploying 2 nodes with ReplicationFactor=2, but one node reports lots of exceptions like java.io.UTFDataFormatException: malformed input around byte 72. My guess is that some of the data in the SSTable is corrupted, but not all, because I can still read data out of the related CF except for some keys. It's OK for us to throw away a small portion of the data to get the nodes working normally. If there is no such way to skip corrupted rows, can I just clean all the data in the corrupted node and then add it back to the cluster? Will it automatically migrate data from the other node? Thanks. Ivan
Which kind of applications are Cassandra fit for?
Hi, ALL I found that most applications on Cassandra are web applications, such as storing friend information or Digg information, and they get good performance. Many companies or groups want to move their applications to Cassandra, so which kinds of applications is Cassandra fit for? Thanks a lot! Yingjie
Re: nodetool cleanup isn't cleaning up?
I'm saying that .99 is getting a copy of all the data for which .124 is the primary. (If you are using RackUnawareStrategy. If you are using RackAware it is some other node.) On Tue, Jun 1, 2010 at 1:25 AM, Ran Tavory ran...@gmail.com wrote: ok, let me try and translate your answer ;) Are you saying that the data that was left on the node is non-primary replicas of rows from the time before the move? So this implies that when a node moves in the ring, it will affect distribution of:
- new keys
- old keys' primary node
but will not affect distribution of old keys' non-primary replicas. If so, I still don't understand something... I would expect even the non-primary replicas of keys to be moved, since if they don't move, how would they be found? I mean, upon reads the serving node should not care about whether the row is new or old; it should have a consistent and global mapping of tokens. So I guess this ruins my theory... What did you mean then? Is this deletion of non-primary replicated data? How does the replication factor affect the load on the moved host then? On Tue, Jun 1, 2010 at 1:19 AM, Jonathan Ellis jbel...@gmail.com wrote: well, there you are then. On Mon, May 31, 2010 at 2:34 PM, Ran Tavory ran...@gmail.com wrote: yes, replication factor = 2 On Mon, May 31, 2010 at 10:07 PM, Jonathan Ellis jbel...@gmail.com wrote: you have replication factor 1 ? On Mon, May 31, 2010 at 7:23 AM, Ran Tavory ran...@gmail.com wrote: I hope I understand nodetool cleanup correctly - it should clean up all data that does not (currently) belong to this node. If so, I think it might not be working correctly. Look at nodes 192.168.252.124 and 192.168.252.99 below:

192.168.252.99   Up  279.35 MB  3544607988759775661076818827414252202    |--|
192.168.252.124  Up  167.23 MB  56713727820156410577229101238628035242   |  ^
192.168.252.125  Up  82.91 MB   85070591730234615865843651857942052863   v  |
192.168.254.57   Up  366.6 MB   113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  88.44 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  88.45 MB   170141183460469231731687303715884105727  |--|

I wanted 124 to take all the load from 99. So I issued a move command. $ nodetool -h cass99 -p 9004 move 56713727820156410577229101238628035243 This command tells 99 to take the space b/w (56713727820156410577229101238628035242, 56713727820156410577229101238628035243], which is basically just one item in the token space, almost nothing... I wanted it to be very slim (just playing around). So, next I get this:

192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |--|
192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |  ^
192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v  |
192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |--|

The tokens are correct, but it seems that 99 still has a lot of data. Why? OK, that might be b/c it didn't delete its moved data. So next I issued a nodetool cleanup, which should have taken care of that. Only that it didn't; node 99 still has 352 MB of data. Why? So, you know what, I waited for 1h. Still no good, data wasn't cleaned up. I restarted the server. Still, data wasn't cleaned up... I issued a cleanup again... still no good... what's up with this node?
-- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: access a multinode cluster
http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to On Tue, Jun 1, 2010 at 2:00 AM, huajun qi qih...@gmail.com wrote: If you have a multinode cluster, which node should you connect to to fetch data? Is there a master node in the cluster which accepts data requests and dispatches them? Or is every node in the cluster completely the same? If all nodes are the same in a cluster, should the client connect to a random node to reduce Cassandra's load? -- Location: -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
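A minimal Java/Thrift sketch of the pattern the FAQ describes, picking an arbitrary node as coordinator (the host list is a made-up example; random and round-robin choices both work):

import java.util.Random;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class AnyNodeClient {
    private static final String[] HOSTS = { "10.0.0.1", "10.0.0.2", "10.0.0.3" };

    // Any node can coordinate any request, so pick one at random
    // to spread client load across the ring.
    public static Cassandra.Client connect() throws Exception {
        String host = HOSTS[new Random().nextInt(HOSTS.length)];
        TSocket socket = new TSocket(host, 9160);
        socket.open();
        return new Cassandra.Client(new TBinaryProtocol(socket));
    }
}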
Re: Skipping corrupted rows when doing compaction
If you're on a version earlier than 0.6.1, you might be running into https://issues.apache.org/jira/browse/CASSANDRA-866. Upgrading will fix it; you don't need to reload data. It's also worth trying 0.6.2 and DiskAccessMode=standard, in case you've found another similar bug. On Tue, Jun 1, 2010 at 7:37 AM, hive13 Wong hiv...@gmail.com wrote: Hi, Is there a way to skip corrupted rows when doing compaction? We are currently deploying 2 nodes with ReplicationFactor=2, but one node reports lots of exceptions like java.io.UTFDataFormatException: malformed input around byte 72. My guess is that some of the data in the SSTable is corrupted, but not all, because I can still read data out of the related CF except for some keys. It's OK for us to throw away a small portion of the data to get the nodes working normally. If there is no such way to skip corrupted rows, can I just clean all the data in the corrupted node and then add it back to the cluster? Will it automatically migrate data from the other node? Thanks. Ivan -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
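For reference, a sketch of where that setting lives in the 0.6-era storage-conf.xml (the value shown is the one suggested above):

<!-- storage-conf.xml: force standard I/O instead of auto (mmap) -->
<DiskAccessMode>standard</DiskAccessMode>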
Re: Which kind of applications are Cassandra fit for?
Applications which require large storage and fast retrieval. On Tue, Jun 1, 2010 at 6:13 PM, 史英杰 shiyingjie1...@gmail.com wrote: Hi, ALL I found that most applications on Cassandra are web applications, such as storing friend information or Digg information, and they get good performance. Many companies or groups want to move their applications to Cassandra, so which kinds of applications is Cassandra fit for? Thanks a lot! Yingjie
Re: Which kind of applications are Cassandra fit for?
Thanks, but would you please describe it in more detail? Most applications require fast retrieval. 2010/6/1 sharanabasava raddi shivub...@gmail.com Applications which require large storage and fast retrieval. On Tue, Jun 1, 2010 at 6:13 PM, 史英杰 shiyingjie1...@gmail.com wrote: Hi, ALL I found that most applications on Cassandra are web applications, such as storing friend information or Digg information, and they get good performance. Many companies or groups want to move their applications to Cassandra, so which kinds of applications is Cassandra fit for? Thanks a lot! Yingjie
Re: Skipping corrupted rows when doing compaction
Thanks, Jonathan. I'm using 0.6.1. Another thing is that I get lots of zero-sized tmp files in the data directory. When I restarted Cassandra those tmp files were deleted, then new empty tmp files were generated gradually, while there are still lots of UTFDataFormatExceptions in system.log. Will using 0.6.2 and DiskAccessMode=standard skip corrupted rows? On Tue, Jun 1, 2010 at 9:08 PM, Jonathan Ellis jbel...@gmail.com wrote: If you're on a version earlier than 0.6.1, you might be running into https://issues.apache.org/jira/browse/CASSANDRA-866. Upgrading will fix it; you don't need to reload data. It's also worth trying 0.6.2 and DiskAccessMode=standard, in case you've found another similar bug. On Tue, Jun 1, 2010 at 7:37 AM, hive13 Wong hiv...@gmail.com wrote: Hi, Is there a way to skip corrupted rows when doing compaction? We are currently deploying 2 nodes with ReplicationFactor=2, but one node reports lots of exceptions like java.io.UTFDataFormatException: malformed input around byte 72. My guess is that some of the data in the SSTable is corrupted, but not all, because I can still read data out of the related CF except for some keys. It's OK for us to throw away a small portion of the data to get the nodes working normally. If there is no such way to skip corrupted rows, can I just clean all the data in the corrupted node and then add it back to the cluster? Will it automatically migrate data from the other node? Thanks. Ivan -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Monitoring compaction
Are stats exposed over JMX for compaction? I'm trying to see when a node is in compaction, and guess when it will complete. tpstats doesn't show anything but the process is using lots of CPU time... I was wondering if there's a better view on compaction besides looking backwards in the system.log for a compaction start message without a corresponding completion message. Ian
Re: Monitoring compaction
Hi Ian, On Tue, Jun 1, 2010 at 9:27 AM, Ian Soboroff isobor...@gmail.com wrote: Are stats exposed over JMX for compaction? You can view them via the org.apache.cassandra.db:type=CompactionManager MBean. The PendingTasks attribute might suit you best. Cheers, Dylan.
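A quick Java sketch of reading that attribute programmatically over JMX; the host is a placeholder, and 8080 is the JMX port shipped in 0.6-era configs (check cassandra.in.sh for yours):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CompactionStats {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://cassandra-host:8080/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
        MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
        ObjectName cm = new ObjectName("org.apache.cassandra.db:type=CompactionManager");
        // PendingTasks > 0 means compactions are queued or running
        System.out.println("Pending compactions: " + mbs.getAttribute(cm, "PendingTasks"));
        jmxc.close();
    }
}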
Re: Monitoring compaction
Thanks. Are folks open to exposing this via nodetool? I've been trying to figure out a decent way to aggregate and expose all this information that is easier than nodetool and less noisy than Nagios... suggestions appreciated. (My cluster only exposes a master node and everything else is private, so running a pile of jconsoles is not even possible...) Ian On Tue, Jun 1, 2010 at 12:33 PM, Dylan Egan / WildfireApp.com dylan.e...@wildfireapp.com wrote: Hi Ian, On Tue, Jun 1, 2010 at 9:27 AM, Ian Soboroff isobor...@gmail.com wrote: Are stats exposed over JMX for compaction? You can view them via the org.apache.cassandra.db:type=CompactionManager MBean. The PendingTasks attribute might suit you best. Cheers, Dylan.
Re: Monitoring compaction
Regarding compaction thresholds... the BMT example says to set the threshold to 0 during an import. Is this advisable during any bulk import (say, using batch mutations or just lots and lots of Thrift inserts)? Also, when I asked "are folks open to..." I meant that I'm happy to code a patch if anyone's interested. Ian On Tue, Jun 1, 2010 at 12:41 PM, Ian Soboroff isobor...@gmail.com wrote: Thanks. Are folks open to exposing this via nodetool? I've been trying to figure out a decent way to aggregate and expose all this information that is easier than nodetool and less noisy than Nagios... suggestions appreciated. (My cluster only exposes a master node and everything else is private, so running a pile of jconsoles is not even possible...) Ian On Tue, Jun 1, 2010 at 12:33 PM, Dylan Egan / WildfireApp.com dylan.e...@wildfireapp.com wrote: Hi Ian, On Tue, Jun 1, 2010 at 9:27 AM, Ian Soboroff isobor...@gmail.com wrote: Are stats exposed over JMX for compaction? You can view them via the org.apache.cassandra.db:type=CompactionManager MBean. The PendingTasks attribute might suit you best. Cheers, Dylan.
Re: Monitoring compaction
Hi Ian, On Tue, Jun 1, 2010 at 9:41 AM, Ian Soboroff isobor...@gmail.com wrote: Thanks. Are folks open to exposing this via nodetool? I've been trying to figure out a decent way to aggregate and expose all this information that is easier than nodetool and less noisy than nagios... suggestions appreciated. You may be interested in the munin plugins written by James Golick and Jonathan Ellis at http://github.com/jamesgolick/cassandra-munin-plugins Cheers, Dylan.
Re: [ANN] Cassandra Tutorial @ OSCON
On Mon, 2010-05-24 at 17:04 -0500, Eric Evans wrote: For those interested in Cassandra training, I'll be giving a 3-hour tutorial[1] at OSCON this year entitled Hands-on Cassandra. [1]: http://www.oscon.com/oscon2010/public/schedule/detail/14283 The tutorial will cover setup, configuration, and management of clusters, and will include some Python code exercises using Twissandra[2]. [2]: http://github.com/ericflo/twissandra Use discount code os10fos when signing up to get 20% off. Just a reminder. Early-bird pricing for OSCON ends tomorrow, after that the price goes up $250 (the discount code above is still good for 20% though). -- Eric Evans eev...@rackspace.com
Re: nodetool cleanup isn't cleaning up?
I'm using RackAwareStrategy. But it still doesn't make sense, I think... let's see what I missed... According to http://wiki.apache.org/cassandra/Operations - RackAwareStrategy: replica 2 is placed in the first node along the ring that belongs in *another* data center than the first; the remaining N-2 replicas, if any, are placed on the first nodes along the ring in the *same* rack as the first

192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |--|
192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |  ^
192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v  |
192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |--|

Alright, so I made a mistake and didn't use the alternate-datacenter suggestion on the page, so the first node of every DC is overloaded with replicas. However, the current situation still doesn't make sense to me. .252.124 will be overloaded b/c it has the first token in the 252 DC. .254.57 will also be overloaded since it has the first token in the .254 DC. But for which node does 252.99 hold replicas? It's not the first in the DC, and its token is just one more than its predecessor's (which is in the same DC). On Tue, Jun 1, 2010 at 4:00 PM, Jonathan Ellis jbel...@gmail.com wrote: I'm saying that .99 is getting a copy of all the data for which .124 is the primary. (If you are using RackUnawareStrategy. If you are using RackAware it is some other node.) On Tue, Jun 1, 2010 at 1:25 AM, Ran Tavory ran...@gmail.com wrote: ok, let me try and translate your answer ;) Are you saying that the data that was left on the node is non-primary replicas of rows from the time before the move? So this implies that when a node moves in the ring, it will affect distribution of:
- new keys
- old keys' primary node
but will not affect distribution of old keys' non-primary replicas. If so, I still don't understand something... I would expect even the non-primary replicas of keys to be moved, since if they don't move, how would they be found? I mean, upon reads the serving node should not care about whether the row is new or old; it should have a consistent and global mapping of tokens. So I guess this ruins my theory... What did you mean then? Is this deletion of non-primary replicated data? How does the replication factor affect the load on the moved host then? On Tue, Jun 1, 2010 at 1:19 AM, Jonathan Ellis jbel...@gmail.com wrote: well, there you are then. On Mon, May 31, 2010 at 2:34 PM, Ran Tavory ran...@gmail.com wrote: yes, replication factor = 2 On Mon, May 31, 2010 at 10:07 PM, Jonathan Ellis jbel...@gmail.com wrote: you have replication factor 1 ? On Mon, May 31, 2010 at 7:23 AM, Ran Tavory ran...@gmail.com wrote: I hope I understand nodetool cleanup correctly - it should clean up all data that does not (currently) belong to this node. If so, I think it might not be working correctly. Look at nodes 192.168.252.124 and 192.168.252.99 below:

192.168.252.99   Up  279.35 MB  3544607988759775661076818827414252202    |--|
192.168.252.124  Up  167.23 MB  56713727820156410577229101238628035242   |  ^
192.168.252.125  Up  82.91 MB   85070591730234615865843651857942052863   v  |
192.168.254.57   Up  366.6 MB   113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  88.44 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  88.45 MB   170141183460469231731687303715884105727  |--|

I wanted 124 to take all the load from 99. So I issued a move command. $ nodetool -h cass99 -p 9004 move 56713727820156410577229101238628035243 This command tells 99 to take the space b/w (56713727820156410577229101238628035242, 56713727820156410577229101238628035243], which is basically just one item in the token space, almost nothing... I wanted it to be very slim (just playing around). So, next I get this:

192.168.252.124  Up  803.33 MB  56713727820156410577229101238628035242   |--|
192.168.252.99   Up  352.85 MB  56713727820156410577229101238628035243   |  ^
192.168.252.125  Up  134.24 MB  85070591730234615865843651857942052863   v  |
192.168.254.57   Up  676.41 MB  113427455640312821154458202477256070485  |  ^
192.168.254.58   Up  99.74 MB   141784319550391026443072753096570088106  v  |
192.168.254.59   Up  99.94 MB   170141183460469231731687303715884105727  |--|

The tokens are correct, but it seems that 99 still has a lot of data. Why? OK, that might be
Re: Can't get data after building cluster
Depending on the key, the request would have been proxied to the first or second node. The CLI uses a consistency level of ONE, meaning that only a single node's data would have been considered when you get(). Also, the responsible nodes for a given key are mapped accordingly at request time, and proxy requests are made internally on your behalf. This allows R + W > N to hold, where N is the replication factor. It chooses the subset of active nodes responsible for a key in a deterministic way. See http://www.slideshare.net/benjaminblack/introduction-to-cassandra-replication-and-consistency for more information. On Tue, Jun 1, 2010 at 1:43 AM, David Boxenhorn da...@lookin2.com wrote: I don't think it can be the case that "at most, data in the token range assigned to that node will be affected" - the new node had no knowledge of any of our data. Any fake data that it might have had through some error on my part could not have been within the range of real data. I had 4.25 G of data on the 1st server, and as far as I could tell I couldn't access any of it. On Tue, Jun 1, 2010 at 9:10 AM, Jonathan Ellis jbel...@gmail.com wrote: To elaborate: If you manage to screw things up to where it thinks a node has data, but it does not (adding a node without bootstrap would do this, for instance, which is probably what you did), at most, data in the token range assigned to that node will be affected. On Tue, Jun 1, 2010 at 12:45 AM, David Boxenhorn da...@lookin2.com wrote: You say no, but that is exactly what I just observed. Can I have some more explanation? To recap: I added a server to my cluster. It had some junk in the system/LocationInfo files from previous, unsuccessful attempts to add the server to the cluster. (They were unsuccessful because I hadn't opened the port on that computer.) When I finally succeeded in adding the 2nd server, the 1st server started returning null when I tried to get data using the CLI. I stopped the 2nd server, deleted the files in system, restarted, and everything worked. I'm afraid that this, or some similar scenario, will do the same after I go live. How can I protect myself? On Mon, May 31, 2010 at 10:10 PM, Jonathan Ellis jbel...@gmail.com wrote: No. On Mon, May 31, 2010 at 10:47 AM, David Boxenhorn da...@lookin2.com wrote: So this means that I can take my entire cluster offline if I make a mistake adding a new server??? Yikes! On Mon, May 31, 2010 at 6:41 PM, David Boxenhorn da...@lookin2.com wrote: OK. Got it working. I had some data in the 2nd server from previous failed attempts at hooking up to the cluster. When I deleted that data and tried again, it said "bootstrapping" and my 1st server started working again. On Mon, May 31, 2010 at 4:50 PM, David Boxenhorn da...@lookin2.com wrote: I am trying to get a cluster up and working for the first time. I got one server up and running, with lots of data on it, which I can see with the CLI. I added my 2nd server, and they seem to recognize each other. Now I can't see my data with the CLI. I do a get and it returns null. The data files seem to be intact. What happened??? How can I fix it? -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
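A quick worked instance of that inequality for this thread's setup: the replication factor gives N = 2, and the CLI reads at ONE, so R = 1. If the writes also went in at ONE (an assumption for illustration):

R + W = 1 + 1 = 2, which is not greater than N = 2

so a read may be served entirely by the replica a write has not reached yet, which matches the nulls observed above. Using QUORUM or ALL for reads or writes here makes R + W = 3 > 2, guaranteeing the read set overlaps the write set.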
Re: writing speed test
Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info
Handling disk-full scenarios
My nodes have 5 disks and are using them separately as data disks. The usage on the disks is not uniform, and one is nearly full. Is there some way to manually balance the files across the disks? Pretty much anything done via nodetool incurs an anticompaction, which obviously fails. system/ is not the problem; it's in my data's keyspace. Ian
Re: Which kind of applications are Cassandra fit for?
There is no easy answer to this. The requirements vary widely even within a particular type of application. If you have a list of specific requirements for a given application, it is easier to say whether it is a good fit. If you need a schema marshaling system, then you will have to build it into your application somewhere. Some client libraries support this type of interface. Otherwise, Cassandra doesn't make you pay for the kitchen sink if you don't need it enough to let it take up space and time in your application. The storage layout of Cassandra mimics lists, sets, and maps, as used by programmers everywhere. Cassandra is responsible for getting the data to and from those in-memory structures. Because there is little conceptual baggage between the in-storage representation and the in-memory representation, this is easier to optimize for the general case. There are a few necessary optimizations for dealing with the underlying storage medium, but the core concepts are generic. There are lots of bells and whistles, but they tend to fall in the happy zone between need-to-have, and want-to-have. Because Cassandra provides a generic service for data storage (in sets, lists, maps, and combinations of these), it serves as a good building block for close-to-the-metal designs, or as a layer to build more strongly-typed or schema-constrained systems on top of. I know this didn't answer your question, but maybe it got you in the ballpark. Jonathan On Tue, Jun 1, 2010 at 7:43 AM, 史英杰 shiyingjie1...@gmail.com wrote: Hi,ALL I found that most applications on Cassandra are for web applications, such as store friiend information or digg information, and they get good performance, many companies or groups want to move their applications to Cassandra, so which kind of applications are Cassandra fit for? Thanks a lot! Yingjie
Re: Which kind of applications are Cassandra fit for?
On 01.06.2010 15:32, sharanabasava raddi wrote: 1. Performance data of network storage elements which may be required for performance tuning. 2. Data dictionaries. 3. Satellite communications. 4. General search applications. etc. Below are performance statistics compared to traditional databases.

MySQL Comparison
• MySQL, 50 GB data: writes average ~300 ms; reads average ~350 ms
• Cassandra, 50 GB data: writes average 0.12 ms; reads average 15 ms

I've seen this in some Cassandra presentations, but there are no details on schema, FKs, hardware, etc. 300 ms for a single write in MySQL is a lot. I'd treat these statistics as marketing/urban legend.
Is there any way to detect when a node is down so I can failover more effectively?
Hi all, I'm using the Hector framework to interact with Cassandra, and in trying to handle failover more effectively I found it a bit complicated to fetch all Cassandra nodes that are up and running. My goal is to keep an up-to-date list of active/up Cassandra servers to provide to Hector every time I need to execute against the db. I've seen this Thrift method: get_string_property("token map"), but it returns the nodes in the ring whether or not a node is down. Any advice? -- Patricio.-
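One possible approach, sketched in Java against the 0.6-era Thrift API: treat a node as up only if opening a connection and making a cheap call, such as the get_string_property("token map") mentioned above, both succeed. The port, host list, and timeout are assumptions:

import java.util.ArrayList;
import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class LivenessChecker {
    // Return the subset of candidate hosts that answer a cheap Thrift call.
    public static List<String> upHosts(List<String> candidates) {
        List<String> up = new ArrayList<String>();
        for (String host : candidates) {
            TSocket socket = new TSocket(host, 9160, 2000); // 2s timeout
            try {
                socket.open();
                new Cassandra.Client(new TBinaryProtocol(socket))
                        .get_string_property("token map");
                up.add(host);
            } catch (Exception e) {
                // refused / timed out: treat the node as down
            } finally {
                socket.close();
            }
        }
        return up;
    }
}

Polling this periodically keeps the active list current; it cannot distinguish a partitioned node from a dead one, but for client-side failover that distinction rarely matters.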
Re: writing speed test
On 2010-06-01 at 15:00 -0500, Jonathan Shook wrote: Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? Actually, the problem is the utilization of resources (for a single machine):

CPU: 700% / 1600% (16 cores)
MEM: almost 100% (16GB)
Swap: almost 0%
Disk IO (write): 20~30MB / 200MB (7.2k RAID5, benchmarked previously)
NET: up to 100Mbps / 950Mbps (1Gbps, tuned and benchmarked previously)

So the speed of generating load, about 15MB/s as reported before, seems quite slow to me. I assume the system should get at least about 50MB/s of disk IO speed. MEM? I don't think it plays a major role in this writing game. What's the bottleneck of the system? P.S. About consistency level: I've tried ONE/DCQUORUM and found ONE is about 10-15% faster. However, that's not a promising result either. Thanks! Kevin 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info -- Kevin Yuan www.yuan-shuai.info
Re: writing speed test
MEM: almost 100% (16GB) - maybe this is the bottleneck. Writing involves the Memtable and SSTables in memory. On Jun 2, 2010 at 9:48 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: On 2010-06-01 at 15:00 -0500, Jonathan Shook wrote: Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? Actually, the problem is the utilization of resources (for a single machine):

CPU: 700% / 1600% (16 cores)
MEM: almost 100% (16GB)
Swap: almost 0%
Disk IO (write): 20~30MB / 200MB (7.2k RAID5, benchmarked previously)
NET: up to 100Mbps / 950Mbps (1Gbps, tuned and benchmarked previously)

So the speed of generating load, about 15MB/s as reported before, seems quite slow to me. I assume the system should get at least about 50MB/s of disk IO speed. MEM? I don't think it plays a major role in this writing game. What's the bottleneck of the system? P.S. About consistency level: I've tried ONE/DCQUORUM and found ONE is about 10-15% faster. However, that's not a promising result either. Thanks! Kevin 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info -- Kevin Yuan www.yuan-shuai.info
Re: writing speed test
Thanks lwl. Then is there any way of tuning this? A faster flush to disk, or something else? Cheers, Kevin On 2010-06-02 at 09:57 +0800, lwl wrote: MEM: almost 100% (16GB) - maybe this is the bottleneck. Writing involves the Memtable and SSTables in memory. On Jun 2, 2010 at 9:48 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: On 2010-06-01 at 15:00 -0500, Jonathan Shook wrote: Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? Actually, the problem is the utilization of resources (for a single machine):

CPU: 700% / 1600% (16 cores)
MEM: almost 100% (16GB)
Swap: almost 0%
Disk IO (write): 20~30MB / 200MB (7.2k RAID5, benchmarked previously)
NET: up to 100Mbps / 950Mbps (1Gbps, tuned and benchmarked previously)

So the speed of generating load, about 15MB/s as reported before, seems quite slow to me. I assume the system should get at least about 50MB/s of disk IO speed. MEM? I don't think it plays a major role in this writing game. What's the bottleneck of the system? P.S. About consistency level: I've tried ONE/DCQUORUM and found ONE is about 10-15% faster. However, that's not a promising result either. Thanks! Kevin 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info -- Kevin Yuan www.yuan-shuai.info -- Shuai Yuan 袁帅 Supertool Corp. 北京学之途网络科技有限公司 www.yuan-shuai.info
Re: writing speed test
Is the MEM at almost 100% on all 4 servers? On Jun 2, 2010 at 10:12 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: Thanks lwl. Then is there any way of tuning this? A faster flush to disk, or something else? Cheers, Kevin On 2010-06-02 at 09:57 +0800, lwl wrote: MEM: almost 100% (16GB) - maybe this is the bottleneck. Writing involves the Memtable and SSTables in memory. On Jun 2, 2010 at 9:48 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: On 2010-06-01 at 15:00 -0500, Jonathan Shook wrote: Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? Actually, the problem is the utilization of resources (for a single machine):

CPU: 700% / 1600% (16 cores)
MEM: almost 100% (16GB)
Swap: almost 0%
Disk IO (write): 20~30MB / 200MB (7.2k RAID5, benchmarked previously)
NET: up to 100Mbps / 950Mbps (1Gbps, tuned and benchmarked previously)

So the speed of generating load, about 15MB/s as reported before, seems quite slow to me. I assume the system should get at least about 50MB/s of disk IO speed. MEM? I don't think it plays a major role in this writing game. What's the bottleneck of the system? P.S. About consistency level: I've tried ONE/DCQUORUM and found ONE is about 10-15% faster. However, that's not a promising result either. Thanks! Kevin 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info -- Kevin Yuan www.yuan-shuai.info -- Shuai Yuan 袁帅 Supertool Corp. 北京学之途网络科技有限公司 www.yuan-shuai.info
Re: writing speed test
On 2010-06-02 at 10:37 +0800, lwl wrote: Is the MEM at almost 100% on all 4 servers? Yes On Jun 2, 2010 at 10:12 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: Thanks lwl. Then is there any way of tuning this? A faster flush to disk, or something else? Cheers, Kevin On 2010-06-02 at 09:57 +0800, lwl wrote: MEM: almost 100% (16GB) - maybe this is the bottleneck. Writing involves the Memtable and SSTables in memory. On Jun 2, 2010 at 9:48 AM, Shuai Yuan yuansh...@supertool.net.cn wrote: On 2010-06-01 at 15:00 -0500, Jonathan Shook wrote: Also, what do you mean specifically by 'slow'? Which measurements are you looking at? What are your baseline constraints for your test system? Actually, the problem is the utilization of resources (for a single machine):

CPU: 700% / 1600% (16 cores)
MEM: almost 100% (16GB)
Swap: almost 0%
Disk IO (write): 20~30MB / 200MB (7.2k RAID5, benchmarked previously)
NET: up to 100Mbps / 950Mbps (1Gbps, tuned and benchmarked previously)

So the speed of generating load, about 15MB/s as reported before, seems quite slow to me. I assume the system should get at least about 50MB/s of disk IO speed. MEM? I don't think it plays a major role in this writing game. What's the bottleneck of the system? P.S. About consistency level: I've tried ONE/DCQUORUM and found ONE is about 10-15% faster. However, that's not a promising result either. Thanks! Kevin 2010/6/1 史英杰 shiyingjie1...@gmail.com: Hi, It would be better if we knew which consistency level you chose, and what the schema of the test data is. On Jun 1, 2010 at 4:48 PM, Shuai Yuan yuansh...@supertool.net.cn wrote: Hi all, I'm testing the writing speed of Cassandra with 4 servers. I'm confused by the behavior of Cassandra.
---env---
load-data app written in C++, using libcassandra (w/ modified batch insert)
20 writing threads in 2 processes running on 2 servers
---optimization---
1. turn log level to INFO
2. JVM has 8G heap
3. 32 concurrent reads / 128 writes in storage-conf.xml, other caches enlarged as well
---result---
1 - monitoring by `date; nodetool -h host ring`
I add all load together and measure the writing speed by (load_difference / time_difference), and I get about 15MB/s for the whole cluster.
2 - monitoring by `iostat -m 10`
I can watch the disk IO from the system level and get about 10MB/s - 65MB/s for a single machine. Very big variance over time.
3 - monitoring by `iptraf -g`
In this way I watch the communication between servers and get about 10MB/s for a single machine.
---opinion---
So, have you checked the writing speed of Cassandra? I feel it's quite slow currently. Could anyone confirm this is the normal writing speed of Cassandra, or please provide some way of improving it? -- Kevin Yuan www.yuan-shuai.info --
Read operation with CL.ALL, not yet supported?
Hi, I'm testing several read operations (get, get_slice, get_count, etc.) with various ConsistencyLevels and noticed that ConsistencyLevel.ALL is not yet supported in most read ops (other than get_range_slice). I've looked at the code in StorageProxy#readProtocol and it seems to be able to handle CL.ALL, but in thrift.CassandraServer#readColumnFamily there is code that just throws an exception when consistency_level == ALL. Is there any reason that CL.ALL is not yet supported? Yuki Morishita t:yukim (http://twitter.com/yukim)