Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
My nodes all have themselves in their list of seeds - always did - and
everything works. (You may ask why I did this. I don't know, I must have
copied it from an example somewhere.)

On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory ran...@gmail.com wrote:

 I was able to make the node join the ring, but I'm confused.
 What I did is: when first adding the node, it was not in its own seeds
 list. AFAIK this is how it's supposed to be. So it was able to
 transfer all data to itself from the other nodes, but then it stayed in the
 bootstrapping state.
 So what I did (and I don't know why it works) is add this node to the
 seeds list in its own storage-conf.xml file, then restart the server, and
 then I finally see it in the ring...
 If I had added the node to its own seeds list when first joining it,
 it would not join the ring, but done in two phases it did work.
 So it's either my misunderstanding or a bug...


 On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory ran...@gmail.com wrote:

 The new node does not see itself as part of the ring; it sees all others
 but itself, so from that perspective the view is consistent.
 The only problem is that the node never finishes bootstrapping. It stays in
 this state for hours (it's been 20 hours now...)


 $ bin/nodetool -p 9004 -h localhost streams
 Mode: Bootstrapping
 Not sending any streams.
 Not receiving any streams.


 On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall n...@riptano.com wrote:

 Does the new node have itself in the list of seeds per chance? This
 could cause some issues if so.

 On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory ran...@gmail.com wrote:
  I'm still at a loss. I haven't been able to resolve this. I tried
  adding another node at a different location on the ring, but this node
  too remains stuck in the bootstrapping state for many hours, without
  any of the other nodes being busy with anti-compaction or anything
  else. I don't know what's keeping it from finishing the bootstrap: no
  CPU, no I/O, and the files were already streamed, so what is it waiting for?
  I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
  be anything addressing a similar issue, so I figured there was no point
  in upgrading. But let me know if you think there is.
  Or any other advice...
 
  On Tuesday, January 4, 2011, Ran Tavory ran...@gmail.com wrote:
  Thanks Jake, but unfortunately the streams directory is empty, so I
 don't think that any of the nodes is anti-compacting data right now or has
 been in the past 5 hours. It seems that all the data was already transferred
 to the joining host, but the joining node, after having received the data,
 would still remain in bootstrapping mode and not join the cluster. I'm not
 sure that *all* data was transferred (perhaps other nodes need to transfer
 more data) but nothing is actually happening, so I assume it has all been moved.
  Perhaps it's a configuration error on my part. Should I use
 AutoBootstrap=true? Anything else I should look out for in the
 configuration file or elsewhere?
 
 
  On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani jak...@gmail.com
 wrote:
 
  In 0.6, locate the node doing anti-compaction and look in the
 streams subdirectory in the keyspace data dir to monitor the
 anti-compaction progress (it puts new SSTables for bootstrapping node in
 there)
 
 
  On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory ran...@gmail.com wrote:
 
 
  Running nodetool decommission didn't help. Actually the node refused
 to decommission itself (b/c it wasn't part of the ring). So I simply stopped
 the process, deleted all the data directories and started it again. It
 worked in the sense of the node bootstrapped again but as before, after it
 had finished moving the data nothing happened for a long time (I'm still
 waiting, but nothing seems to be happening).
 
 
 
 
  Any hints on how to analyze a stuck bootstrapping node? Thanks.
  On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory ran...@gmail.com wrote:
  Thanks Shimi. So indeed anticompaction was run on one of the other
 nodes from the same DC, but to my understanding it has already ended - a few
 hours ago...
 
 
 
  I see plenty of log messages such as [1], which ended a couple of hours
 ago, and I've seen the new node streaming and accepting the data from the
 node which performed the anticompaction; so far that looked normal, so it
 seemed that the data was in its right place. But now the new node seems
 sort of stuck. None of the other nodes is anticompacting right now or has
 been anticompacting since then.
 
 
 
 
  The new node's CPU is close to zero, and its iostats are almost zero, so I
 can't find another bottleneck that would keep it hanging.
  On IRC someone suggested I retry joining this node,
 e.g. decommission and rejoin it. I'll try it now...
 
 
 
 
 
 
  [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721
 CompactionManager.java (line 338) AntiCompacting
 

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Well, your ring issues don't make sense to me; the seed list should be the same
across the cluster.
I'm just thinking of other things to try. Non-bootstrapped nodes should join
the ring instantly, but reads will fail if you aren't using quorum.


On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory ran...@gmail.com wrote:

 I haven't tried repair.  Should I?
 On Jan 5, 2011 3:48 PM, Jake Luciani jak...@gmail.com wrote:
  Have you tried not bootstrapping but setting the token and manually
 calling
  repair?
 
  On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory ran...@gmail.com wrote:
 
  My conclusion is lame: I tried this on several hosts and saw the same
  behavior. The only way I was able to join new nodes was to first start them
  when they are *not in* their own seeds list and, after they
  finish transferring the data, restart them with themselves *in* their
  own seeds list. After doing that, the node would join the ring.
  This is either my misunderstanding or a bug, but the only place I found it
  documented stated that the new node should not be in its own seeds list.
  Version 0.6.6.
 

The size of the data, I must be doing smth wrong....

2011-01-05 Thread nicolas lattuada

Hi,

I have some data size issues.

I am storing super columns with the following content:

{a=1, b=2, c=3...n=14}

I am storing it 300,000 times, and I have a data size on disk of about 283MB.

On the other side I have a MySQL table which stores a bunch of data; the schema
follows:
6 varchars +100
5 ints +6

I put about 1,300,000 records in it and end up with 150MB of data and 57MB of
index.

So I think I am certainly doing something wrong...

The other thing is that when I run flush and then compact, the size of my data
increases, so I imagine something is copied during compaction.
So is there a way to remove the unused data? (cleanup doesn't seem to do the
job.)

Any help to reduce the size of the data would be greatly appreciated!
Greetings

  

Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Jonathan Ellis
On Wed, Jan 5, 2011 at 3:37 AM, Narendra Sharma
narendra.sha...@gmail.com wrote:
 What I am looking for is:
 1. Some way to send requests for keys whose token fall between 0-25 to B and
 never to C even though C will have the data due to it being replica of B.
 2. Only when B is down or not reachable, the request should go to C.
 3. Once the requests start going to C, they should continue unless C is down
 and in which case the requests should then go to B.

 My understanding is that SimpleSnitch should fit here, except for
 enforcing #3 above.

Right, with the caveat that you'll probably want to set the dynamic
snitch badness threshold to allow switching to B even if C merely gets
overloaded rather than completely down.  The alternative is disabling
the dynamic snitch entirely.
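
For reference, the knobs Jonathan mentions live in cassandra.yaml in 0.7. A
sketch follows; the exact key names and defaults are assumptions worth
verifying against the yaml shipped with your build:

  # cassandra.yaml (0.7) - sketch only; verify key names against your build
  dynamic_snitch: true
  # How much worse the pinned replica's score must get before the dynamic
  # snitch routes reads away from it; 0.0 means always pick the best-scored
  # replica, higher values keep requests pinned longer.
  dynamic_snitch_badness_threshold: 0.2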

 will SimpleSnitch come into
 picture if the request from client reaches node C directly?

Yes.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Jonathan Ellis
It's normal for Cassandra to use more disk space than MySQL.  It's
part of what we trade for not having to rewrite every row when you add
a new column.

SSTables that are obsoleted by a compaction are deleted
asynchronously when the JVM performs a GC.
http://wiki.apache.org/cassandra/MemtableSSTable



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 9:52 AM, Jonathan Ellis jbel...@gmail.com wrote:
 It's normal for Cassandra to use more disk space than MySQL.  It's
 part of what we trade for not having to rewrite every row when you add
 a new column.

 SSTables that are obsoleted by a compaction are deleted
 asynchronously when the JVM performs a GC.
 http://wiki.apache.org/cassandra/MemtableSSTable


Unlike datastores that are delimited or have fixed column sizes, Cassandra
stores each row as a sorted map of columns, and each column is a
tuple of {columnname, columnvalue, timestamp}. The data is also not stored
as tersely as it is inside MySQL.
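
A hedged back-of-envelope illustration of that overhead (the exact on-disk
layout varies by version): each column carries its name, its value, an 8-byte
timestamp, and a few bytes of length/flag framing, so even a one-byte name and
value can cost on the order of 20-30 bytes. For the case above, 300,000 super
columns x 14 subcolumns is roughly 4.2 million columns; at ~25 bytes each that
is already ~100MB before counting row keys, super column names, indexes, and
any obsolete SSTables not yet compacted away - the same ballpark as the 283MB
observed.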


Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
In storage-conf I see this comment [1], from which I understand that the
recommended way to bootstrap a new node is to set AutoBootstrap=true and
leave the node out of its own seeds list.
Moreover, I did try to set AutoBootstrap=true with the node in its own
seeds list, but it would not bootstrap. I don't recall the exact message, but
it was something like "I found myself in the seeds list, therefore I'm not
going to bootstrap even though AutoBootstrap is true."

[1]
  <!--
   ~ Turn on to make new [non-seed] nodes automatically migrate the right data
   ~ to themselves.  (If no InitialToken is specified, they will pick one
   ~ such that they will get half the range of the most-loaded node.)
   ~ If a node starts up without bootstrapping, it will mark itself bootstrapped
   ~ so that you can't subsequently accidently bootstrap a node with
   ~ data on it.  (You can reset this by wiping your data and commitlog
   ~ directories.)
   ~
   ~ Off by default so that new clusters and upgraders from 0.4 don't
   ~ bootstrap immediately.  You should turn this on when you start adding
   ~ new nodes to a cluster that already has data on it.  (If you are upgrading
   ~ from 0.4, start your cluster with it off once before changing it to true.
   ~ Otherwise, no data will be lost but you will incur a lot of unnecessary
   ~ I/O before your cluster starts up.)
  -->
  <AutoBootstrap>false</AutoBootstrap>
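
For completeness, a minimal sketch of the related 0.6 storage-conf.xml
elements, with a hypothetical seed address, matching the two-phase workaround
described in this thread:

  <!-- storage-conf.xml sketch (0.6); addresses are hypothetical -->
  <Seeds>
      <!-- an existing node; per the workaround above, the new node's own IP
           is added here only after it has finished streaming its data -->
      <Seed>10.0.0.1</Seed>
  </Seeds>
  <AutoBootstrap>true</AutoBootstrap>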

On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn da...@lookin2.com wrote:

 If the seed list should be the same across the cluster, that means that nodes
 *should* have themselves as a seed. If that doesn't work for Ran, then that
 is the first problem, no?



Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
https://issues.apache.org/jira/browse/CASSANDRA-1676

You have to use at least 0.6.7.



Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
@Thibaut, wrong email? Or how is "Avoid dropping messages off the client
request path" (CASSANDRA-1676) related to the bootstrap questions I had?

On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz thibaut.br...@trendiction.com
 wrote:

 https://issues.apache.org/jira/browse/CASSANDRA-1676

 you have to use at least 0.6.7




Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
Had the same problem a while ago. Upgrading solved it (I don't know
whether you have to redeploy your cluster, though).

http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html


On Wed, Jan 5, 2011 at 4:29 PM, Ran Tavory ran...@gmail.com wrote:

 @Thibaut, wrong email? Or how is "Avoid dropping messages off the client
 request path" (CASSANDRA-1676) related to the bootstrap questions I had?



Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
OK, thanks, so I see we had the same problem (I too had multiple keyspaces,
not that I know why that matters to the problem at hand), and I see that by
upgrading to 0.6.7 you solved your problem (I didn't try it; I had a different
workaround). But frankly, I don't understand how
https://issues.apache.org/jira/browse/CASSANDRA-1676 would relate to the
stuck bootstrap problem. (I'm not saying that it isn't related, I'd just like
to understand why...)


On Wed, Jan 5, 2011 at 5:42 PM, Thibaut Britz thibaut.br...@trendiction.com
 wrote:

 Had the same problem a while ago. Upgrading solved it (I don't know
 whether you have to redeploy your cluster, though).

 http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html




Re: Bootstrapping taking long

2011-01-05 Thread Jonathan Ellis
1676 says "Avoid dropping messages off the client request path."
Bootstrap messages are off the client request path. So, if some of
the nodes involved were loaded enough that they were dropping messages
older than RPC_TIMEOUT to cope, it could lose part of the bootstrap
communication permanently.

On Wed, Jan 5, 2011 at 10:01 AM, Ran Tavory ran...@gmail.com wrote:
 OK, thanks, so I see we had the same problem (I too had multiple keyspaces,
 not that I know why that matters to the problem at hand), and I see that by
 upgrading to 0.6.7 you solved your problem (I didn't try it; I had a different
 workaround). But frankly, I don't understand how
 https://issues.apache.org/jira/browse/CASSANDRA-1676 would relate to the
 stuck bootstrap problem. (I'm not saying that it isn't related, I'd just like
 to understand why...)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Nate McCall
Our original intention in discussing this feature was to have
back-and-forth conversion from timestamps (we were modelling similar
functionality in Pycassa). Its lack of inclusion may have just been
an oversight. We will add this in Hector trunk shortly - thanks for
the complete code sample.



On Tue, Jan 4, 2011 at 10:06 PM, Roshan Dawrani roshandawr...@gmail.com wrote:
 Ok, found the solution - finally! - by applying the opposite of what
 createTime() does in TimeUUIDUtils. Ideally I would have preferred this
 solution to come from the Hector API, so I didn't have to be tied to the
 private createTime() implementation.

 
 import java.util.UUID;
 import me.prettyprint.cassandra.utils.TimeUUIDUtils;

 public class TryHector {
     public static void main(String[] args) throws Exception {
         // Offset between the UUID epoch (1582-10-15) and the Unix epoch,
         // in 100ns units.
         final long NUM_100NS_INTERVALS_SINCE_UUID_EPOCH =
                 0x01b21dd213814000L;

         UUID u1 = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
         final long t1 = u1.timestamp();

         // Convert the 100ns-resolution UUID timestamp back to millis
         // since the Unix epoch, which is what getTimeUUID() expects.
         long tmp = (t1 - NUM_100NS_INTERVALS_SINCE_UUID_EPOCH) / 10000;

         UUID u2 = TimeUUIDUtils.getTimeUUID(tmp);
         long t2 = u2.timestamp();

         System.out.println(u2.equals(u1));
         System.out.println(t2 == t1);
     }
 }
  


 On Wed, Jan 5, 2011 at 8:15 AM, Roshan Dawrani roshandawr...@gmail.com
 wrote:

 If I use com.eaio.uuid.UUID directly, then I am able to do what I need
 (attached a Java program for the same), but unfortunately I need to deal
 with java.util.UUID in my application and I don't have its equivalent
 com.eaio.uuid.UUID at the point where I need the timestamp value.

 Any suggestion on how I can achieve the equivalent using Hector library's
 TimeUUIDUtils?

 On Wed, Jan 5, 2011 at 7:21 AM, Roshan Dawrani roshandawr...@gmail.com
 wrote:

 Hi Victor / Patricio,

  I have been using the Hector library's TimeUUIDUtils. I also just looked at
  TimeUUIDUtilsTest but didn't find anything similar being tested there.

 Here is what I am trying and it's not working - I am creating a Time
 UUID, extracting its timestamp value and with that I create another Time
 UUID and I am expecting both time UUIDs to have the same timestamp() value -
 am I doing / expecting something wrong here?:

 ===
 import java.util.UUID;
 import me.prettyprint.cassandra.utils.TimeUUIDUtils;

 public class TryHector {
     public static void main(String[] args) throws Exception {
         UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
         long timestamp1 = someUUID.timestamp();

         UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
         long timestamp2 = otherUUID.timestamp();

         System.out.println(timestamp1);
         System.out.println(timestamp2);
     }
 }
 ===

  I have to create the timestamp() equivalent of my time UUIDs so I can
 send it to my UI client, for which it will be simpler to compare long
 timestamps than UUIDs. Then, for the long timestamp chosen by the
 client, I need to re-create the equivalent time UUID and go and filter the
 data from the Cassandra database.

 --
 Roshan
 Blog: http://roshandawrani.wordpress.com/
 Twitter: @roshandawrani
 Skype: roshandawrani

 On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon
 victor.kabde...@gmail.com wrote:

 Hi Roshan,
 Sorry I misunderstood your problem.It is weird that it doesn't work, it
 works for me...
 As Patricio pointed out use hector standard way of creating TimeUUID
 and tell us if it still doesn't work.
 Maybe you can paste here some of the code you use to query your columns
 too.

 Victor K.
 http://www.voxnucleus.fr

 2011/1/4 Patricio Echagüe patric...@gmail.com

 In Hector framework, take a look at TimeUUIDUtils.java

 You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
 TimeUUIDUtils.getTimeUUID(ClockResolution clock)

 and later on, TimeUUIDUtils.getTimeFromUUID(..) or just
 UUID.timestamp();

 There are some example in TimeUUIDUtilsTest.java

 Let me know if it helps.



 On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani
 roshandawr...@gmail.com wrote:

 Hello Victor,

  It is actually not that I need the 2 UUIDs to be exactly the same - they
  need to be the same timestamp-wise.

  So, what I need is to extract the timestamp portion from a time UUID
  (say, U1) and then, later in the cycle, use the same long timestamp value to
  re-create a UUID (say, U2) that is equivalent to the previous one in terms
  of its timestamp portion - i.e., I should be able to give this U2 and filter
  the data from a column family, and it should be the same as if I had used
  the original UUID U1.

  Does it make any more sense than before? Any way I can do that?

 rgds,
 Roshan

 On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon
 victor.kabde...@gmail.com wrote:

 Hello Roshan,
 Well it 

Question about replication

2011-01-05 Thread Mayuresh Kulkarni


Hello,

Is it possible to set the replication factor to some kind of ALL setting,
so that all data gets replicated to all nodes, and if a new node is
dynamically added to the cluster, the current nodes replicate their data
to it?


Thanks,
Mayuresh


Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
 The CLI sometimes gets only 100 results (even though there are more) - and
 sometimes gets all the results, even when there are more than 100!

 What is going on here? Is there some logic that says "if there are too many
 results, return 100", even though "too many" can be more than 100?

API calls have a limit since streaming is not supported and you could
potentially have almost arbitrarily large result sets. I believe
cassandra-cli will allow you to set the limit if you look at the
'help' output and look for the word 'limit'.

The way to iterate over large amounts of data is to do paging, with
multiple queries.
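
A rough Java sketch of that paging pattern against the 0.7 Thrift API
(binary keys, set_keyspace); the host, keyspace, and column family names are
placeholders, and error handling is omitted. The idea is to re-issue
get_range_slices with the last key of the previous page as the new start key,
skipping the one overlapping row:

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class PageAllRows {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport =
                new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("MyKeyspace");

        final int PAGE = 100;
        ByteBuffer empty = ByteBuffer.wrap(new byte[0]);

        // all columns of each row, up to 100 per row
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(empty, empty, false, 100));

        ByteBuffer start = empty; // empty start key = beginning of the ring
        while (true) {
            KeyRange range = new KeyRange();
            range.setCount(PAGE + 1); // +1: first row overlaps previous page
            range.setStart_key(start);
            range.setEnd_key(empty);
            List<KeySlice> page = client.get_range_slices(
                    new ColumnParent("MyColumnFamily"), predicate, range,
                    ConsistencyLevel.ONE);
            for (KeySlice row : page) {
                // skip the overlap row on every page after the first
                if (start.remaining() == 0 || !row.key.equals(start)) {
                    System.out.println(row); // process the row here
                }
            }
            if (page.size() <= PAGE) break;          // last page reached
            start = page.get(page.size() - 1).key;   // resume from last key
        }
        transport.close();
    }
}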

-- 
/ Peter Schuller


Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Peter Schuller
 1. Some way to send requests for keys whose token fall between 0-25 to B and
 never to C even though C will have the data due to it being replica of B.

If your data set is large, be mindful of the fact that this will cause
C to be completely cold in terms of caches. I.e., when B does go down,
C will take lots of iops.

-- 
/ Peter Schuller


Re: Question about replication

2011-01-05 Thread Jonathan Ellis
No.

On Wed, Jan 5, 2011 at 10:38 AM, Mayuresh Kulkarni kul...@cs.rpi.edu wrote:

 Hello,

 Is it possible to set the replication factor to some kind of ALL setting
 so that all data gets replicated to all nodes and if a new node is
 dynamically added to the cluster, the current nodes replicate their data to
 it?

 Thanks,
 Mayuresh




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread David Boxenhorn
I know that there's a limit, and I just assumed that the CLI set it to 100,
until I saw more than 100 results.

On Wed, Jan 5, 2011 at 6:56 PM, Peter Schuller
peter.schul...@infidyne.comwrote:

  The CLI sometimes gets only 100 results (even though there are more) -
 and
  sometimes gets all the results, even when there are more than 100!
 
   What is going on here? Is there some logic that says "if there are too
  many results, return 100", even though "too many" can be more than 100?

  API calls have a limit since streaming is not supported and you could
  potentially have almost arbitrarily large result sets. I believe
  cassandra-cli will allow you to set the limit if you look at the
  'help' output and look for the word 'limit'.

 The way to iterate over large amounts of data is to do paging, with
 multiple queries.

 --
 / Peter Schuller



Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
 I know that there's a limit, and I just assumed that the CLI set it to 100,
 until I saw more than 100 results.

Ooh, sorry. Didn't read carefully enough. Not sure why you see that
behavior. Sounds strange; should not be supported at the thrift level
AFAIK.

-- 
/ Peter Schuller


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, just a comment on your solution. The time returned is not a simple
long; it also contains some bits indicating the version.
On the other hand, you are assuming that the same machine is processing your
request and recreating a UUID based on a long you provide. The
clockSeqAndNode id will vary if another machine takes care of the request
(referring to your use case).

Is it possible for you to send the UUID to the view? I think that would be
the correct behavior, as a simple long does not contain enough information to
recreate the original UUID.

Does it make sense?
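
As a small illustration of the pieces involved, java.util.UUID exposes the
timestamp, clock sequence, and node of a version-1 UUID separately, and only
the combination identifies the original UUID. The UUID value below is made up
purely for the example:

import java.util.UUID;

public class UuidParts {
    public static void main(String[] args) {
        // A hypothetical version-1 (time-based) UUID, e.g. as produced
        // by TimeUUIDUtils; the value is fabricated for illustration.
        UUID u = UUID.fromString("e4c0b5b0-18f5-11e0-8cd5-001cc0a9d21b");
        System.out.println(u.version());        // 1 for time-based UUIDs
        System.out.println(u.timestamp());      // 60-bit time, 100ns units since 1582
        System.out.println(u.clockSequence());  // varies across processes/machines
        System.out.println(u.node());           // node id, usually MAC-derived
    }
}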

On Wed, Jan 5, 2011 at 8:36 AM, Nate McCall n...@riptano.com wrote:

 Our original intention in discussing this feature was to have
 back-and-forth conversion from timestamps (we were modelling similar
 functionality in Pycassa). Its lack of inclusion may have just been
 an oversight. We will add this in Hector trunk shortly - thanks for
 the complete code sample.




Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Mubarak Seyed
We are hosting a Cassandra meetup in the Bay Area. Jonathan will give a talk
on Cassandra 0.7.

The link to the meetup page is at
http://www.meetup.com/Cassandra-User-Group-Meeting/

Thanks,
Mubarak


Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I see. Thanks for clarifying, Jonathan.

On Wednesday, January 5, 2011, Jonathan Ellis jbel...@gmail.com wrote:
 1676 says "Avoid dropping messages off the client request path."
 Bootstrap messages are off the client request path. So, if some of
 the nodes involved were loaded enough that they were dropping messages
 older than RPC_TIMEOUT to cope, it could lose part of the bootstrap
 communication permanently.


 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com


-- 
/Ran


pig cassandra contribution

2011-01-05 Thread felix gao
I am having a problem running the cassandra_loadfunc.jar on my build of
Cassandra.
PIG_CLASSPATH=:bin/../build/cassandra_loadfunc.jar::bin/../../..//lib/antlr-3.1.3.jar:bin/../../..//lib/avro-1.2.0-dev.jar:bin/../../..//lib/clhm-production.jar:bin/../../..//lib/commons-cli-1.1.jar:bin/../../..//lib/commons-codec-1.2.jar:bin/../../..//lib/commons-collections-3.2.1.jar:bin/../../..//lib/commons-lang-2.4.jar:bin/../../..//lib/google-collections-1.0.jar:bin/../../..//lib/hadoop-core-0.20.1.jar:bin/../../..//lib/high-scale-lib.jar:bin/../../..//lib/jackson-core-asl-1.4.0.jar:bin/../../..//lib/jackson-mapper-asl-1.4.0.jar:bin/../../..//lib/jline-0.9.94.jar:bin/../../..//lib/json-simple-1.1.jar:bin/../../..//lib/libthrift.jar:bin/../../..//lib/log4j-1.2.14.jar:bin/../../..//lib/slf4j-api-1.5.8.jar:bin/../../..//lib/slf4j-log4j12-1.5.8.jar:bin/../../..//lib/spymemcached-2.4.2.jar:bin/../../..//lib/zapcat-1.2.jar:bin/../../..//build/lib/jars/ant-1.6.5.jar:bin/../../..//build/lib/jars/apache-rat-0.6.jar:bin/../../..//build/lib/jars/apache-rat-core-0.6.jar:bin/../../..//build/lib/jars/apache-rat-tasks-0.6.jar:bin/../../..//build/lib/jars/asm-3.2.jar:bin/../../..//build/lib/jars/avalon-framework-4.1.3.jar:bin/../../..//build/lib/jars/commons-cli-1.1.jar:bin/../../..//build/lib/jars/commons-collections-3.2.jar:bin/../../..//build/lib/jars/commons-lang-2.1.jar:bin/../../..//build/lib/jars/commons-logging-1.1.1.jar:bin/../../..//build/lib/jars/junit-4.6.jar:bin/../../..//build/lib/jars/log4j-1.2.12.jar:bin/../../..//build/lib/jars/logkit-1.0.1.jar:bin/../../..//build/lib/jars/paranamer-ant-2.1.jar:bin/../../..//build/lib/jars/paranamer-generator-2.1.jar:bin/../../..//build/lib/jars/qdox-1.10.jar:bin/../../..//build/lib/jars/servlet-api-2.3.jar:bin/../../..//build/apache-cassandra-0.6.4.jar:bin/../../..//build/ivy-2.1.0.jar:/usr/local/pig-0.7.0/pig.jar

In Grunt I registered the jars again, just in case they were not picked up
by the classpath:
register /usr/local/pig-0.7.0/pig.jar; register
/home/felix/cassandra/lib/libthrift.jar; register
/home/felix/cassandra/contrib/pig/build/cassandra_loadfunc.jar
grunt> rows = LOAD 'cassandra://test.data' USING CassandraStorge();

  2011-01-05 13:50:50,071 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve
CassandraStorge using imports: [org.apache.cassandra.hadoop.pig., ,
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /home/felix/cassandra/contrib/pig/pig_1294257032719.log


the log file contains

Pig Stack Trace
---
ERROR 1070: Could not resolve CassandraStorge using imports:
[org.apache.cassandra.hadoop.pig., , org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]

java.lang.RuntimeException: Cannot instantiate:CassandraStorge
at
org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:455)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5087)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1434)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
at
org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070:
Could not resolve CassandraStorge using imports:
[org.apache.cassandra.hadoop.pig., , org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:440)
at
org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:452)
... 15 more

I am running Hadoop 0.20.2 with Pig 0.7.0, and I have to use Cassandra 0.6.4.

Thanks,

Felix


Re: Reclaim deleted rows space

2011-01-05 Thread shimi
How is minor compaction triggered? Is it triggered only when a new
SSTable is added?

I was wondering if triggering a compaction with minimumCompactionThreshold
set to 1 would be useful. If that is possible, I assume it would compact
files of similar size and remove deleted rows from the rest.

Shimi

On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
peter.schul...@infidyne.com wrote:

  I don't have a problem with disk space. I have a problem with the data
  size.

 [snip]

  Bottom line is that I want to reduce the number of requests that go to
  disk. Since there is enough data that is no longer valid, I can do it by
  reclaiming the space. The only way to do it is by running Major
  compaction. I can wait and let Cassandra do it for me, but then the data
  size will get even bigger and the response time will be worse. I can do it
  manually, but I prefer it to happen in the background with less impact on
  the system.

 Ok - that makes perfect sense then. Sorry for misunderstanding :)

 So essentially, for workloads that are teetering on the edge of cache
 warmness and are subject to significant overwrites or removals, it may
 be beneficial to perform much more aggressive background compaction
 even though it might waste lots of CPU, to keep the in-memory working
 set down.

 There was talk (I think in the compaction redesign ticket) about
 potentially improving the use of bloom filters such that obsolete data
 in sstables could be eliminated from the read set without
 necessitating actual compaction; that might help address cases like
 these too.

 I don't think there's a pre-existing silver bullet in a current
 release; you probably have to live with the need for
 greater-than-theoretically-optimal memory requirements to keep the
 working set in memory.

 --
 / Peter Schuller



Re: Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Jonathan Ellis
Thanks for organizing this, Mubarak!

A little more detail -- I'll explain the new features in Cassandra 0.7
including column time-to-live, columnfamily truncation, and secondary
indexes, as well as some of the features that have been backported to
recent 0.6 releases (aka "Why You Should Upgrade Yesterday"). The focus
will primarily be on how these affect application design, but we'll
also touch on operational considerations.

I'm excited to meet everyone!  I hear there will be pizza, too. :)

On Wed, Jan 5, 2011 at 1:31 PM, Mubarak Seyed biggd...@gmail.com wrote:
 We are hosting a Cassandra meetup in the Bay Area. Jonathan will give a talk
 on Cassandra 0.7.

 The link to the meetup page is at
 http://www.meetup.com/Cassandra-User-Group-Meeting/

 Thanks,
 Mubarak




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Reclaim deleted rows space

2011-01-05 Thread Jonathan Ellis
Pretty sure there's logic in there that says "don't bother compacting
a single sstable".

On Wed, Jan 5, 2011 at 2:26 PM, shimi shim...@gmail.com wrote:
 How is minor compaction triggered? Is it triggered only when a new
 SSTable is added?

 I was wondering if triggering a compaction with minimumCompactionThreshold
 set to 1 would be useful. If that is possible, I assume it would compact
 files of similar size and remove deleted rows from the rest.
 Shimi
 On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:
  [snip]





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: pig cassandra contribution

2011-01-05 Thread felix gao
Ignore the above error; I got past that stage (the ERROR 1070 came from the
misspelled CassandraStorge - the function is CassandraStorage). However, I am
still having problems with it.

grunt> register /home/felix/pig-0.7.0/pig-0.7.1-dev.jar; register
/home/felix/cassandra/lib/libthrift.jar;
grunt> rows = LOAD 'cassandra://test/data' USING CassandraStorage();
grunt> cols = FOREACH rows GENERATE flatten($1);
grunt> colnames = FOREACH cols GENERATE $0;
grunt> limit_colnames = limit colnames 10;
grunt> dump limit_colnames;
2011-01-05 15:44:17,378 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with
processName=JobTracker, sessionId=
2011-01-05 15:44:17,460 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name:
Store(file:/tmp/temp-1545399343/tmp576746049:org.apache.pig.builtin.BinStorage)
- 1-27 Operator Key: 1-27)
2011-01-05 15:44:17,507 [main] INFO
 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2011-01-05 15:44:17,507 [main] INFO
 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1
2011-01-05 15:44:17,533 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:17,539 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:17,539 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-01-05 15:44:21,785 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-01-05 15:44:21,841 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:21,842 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-01-05 15:44:21,846 [Thread-5] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2011-01-05 15:44:22,115 [Thread-5] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:22,133 [Thread-5] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:22,344 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-01-05 15:44:22,348 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2117: Unexpected error when launching map reduce job.
Details at logfile: /home/felix/cassandra/contrib/pig/pig_1294263823129.log


cat pig_1294263823129.log
Pig Stack Trace
---
ERROR 2117: Unexpected error when launching map reduce job.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
open iterator for alias limit_colnames
at org.apache.pig.PigServer.openIterator(PigServer.java:521)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:544)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002:
Unable to store alias limit_colnames
at org.apache.pig.PigServer.store(PigServer.java:577)
at org.apache.pig.PigServer.openIterator(PigServer.java:504)
... 6 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117:
Unexpected error when launching map reduce job.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:209)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:308)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:835)
at org.apache.pig.PigServer.store(PigServer.java:569)
... 7 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured
when launching map reduce job: java.lang.ExceptionInInitializerError
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:510)
at java.lang.Thread.dispatchUncaughtException(Thread.java:1831)




On Wed, Jan 5, 2011 at 12:02 PM, felix gao gre1...@gmail.com wrote:

 I am having problem running the 

Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio,

Thanks for your comment. Replying inline.

2011/1/5 Patricio Echagüe patric...@gmail.com

 Roshan, just a comment on your solution. The time returned is not a simple
 long. It also contains some bits indicating the version.


I don't think so. The version bits from the most significant 64 bits of the
UUID are not used in creating the timestamp() value. It uses only the
time_low, time_mid and time_hi fields of the UUID, not the version, as
documented here:
http://download.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#timestamp%28%29.


When the same timestamp comes back and I call
TimeUUIDUtils.getTimeUUID(tmp), it internally puts the version back in it
and makes it a Time UUID.


 On the other hand, you are assuming that the same machine is processing
 your request and recreating a UUID based on a long you provide. The
 clockseqAndNode id will vary if another machine takes care of the request
 (referring to your use case).


When I recreate my UUID using the timestamp() value, my requirement is not
to arrive at exactly the same UUID from which timestamp() was derived in the
first place. I need a recreated UUID *that should be equivalent in terms of
its time value* - so that filtering the time-sorted columns using this time
UUID works fine. So, if the lower-order 64 bits (clockseq + node) become
different, I don't think it is of any concern, because the UUID comparison
first goes by the most significant 64 bits, i.e. the time value, and that
should settle the time comparison in my use case.
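
A sketch of that recreation, using only the JDK. The most-significant-bits
layout (time_low | time_mid | version | time_hi) is the standard RFC 4122
one; the least significant half is whatever the caller supplies, since only
the time half matters for the comparison described above. The class and
method names are made up for illustration:

    import java.util.UUID;

    public class SameTimeUUID {
        // Rebuild a version-1 UUID whose time fields encode the given
        // timestamp() value (RFC 4122 MSB layout, high to low:
        // time_low (32) | time_mid (16) | version (4) | time_hi (12)).
        static UUID fromTimestamp(long ts, long clockSeqAndNode) {
            long msb = ((ts & 0xFFFFFFFFL) << 32)       // time_low
                     | (((ts >>> 32) & 0xFFFFL) << 16)  // time_mid
                     | 0x1000L                          // version 1
                     | ((ts >>> 48) & 0x0FFFL);         // time_hi
            return new UUID(msb, clockSeqAndNode);
        }

        public static void main(String[] args) {
            UUID orig = UUID.fromString("e4c0b6d0-18e6-11e0-ac64-0800200c9a66");
            UUID sameTime = fromTimestamp(orig.timestamp(),
                                          orig.getLeastSignificantBits());
            // Equivalent in time value, which is all the filtering needs:
            System.out.println(sameTime.timestamp() == orig.timestamp()); // true
        }
    }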


 Is it possible for you to send the UUID to the view? I think that would be
 the correct behavior as a simple long does not contain enough information to
 recreate the original UUID.


In my use case, the non-Java clients will be receiving a number of such
UUIDs and will have to sort them chronologically. I wanted to avoid bit-based
UUID comparison in these clients. The long timestamp() value is perfect for
such ordering of data elements, and I send much less data over the wire.


  Does it make sense?


Nearly everything makes sense to me :-)

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani


Re: Reclaim deleted rows space

2011-01-05 Thread Tyler Hobbs
Although it's not exactly the ability to list specific SSTables, the ability
to only compact specific CFs will be in upcoming releases:

https://issues.apache.org/jira/browse/CASSANDRA-1812

- Tyler

On Wed, Jan 5, 2011 at 7:46 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis jbel...@gmail.com wrote:
  Pretty sure there's logic in there that says "don't bother compacting
  a single sstable".
 
  [snip]
 

 I was wondering if it made sense to have a JMX operation that can
 compact a list of SSTables by file name. This opens it up for power
 users to have more options than compacting an entire keyspace.
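
 A hypothetical sketch of the operation being proposed - the interface and
 method names here are made up for illustration and did not exist in
 Cassandra at the time:

     // Hypothetical JMX interface - illustration only, not a Cassandra API.
     public interface UserDefinedCompactionMBean {
         // Compact exactly the named SSTable data files (e.g.
         // "Keyspace1-Standard1-42-Data.db"), bypassing size-tier grouping.
         void forceCompaction(java.util.List<String> sstableFileNames);
     }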



Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, the first 64 bits do contain the version. The method
UUID.timestamp() indeed takes it out before returning. You are right on that
point. I based my comment on the UUID spec.

What I am not convinced of is that the framework should provide support to
create an almost identical UUID where only the timestamp is the same
(between the two UUIDs).

UUID.equals() and UUID.compareTo() do compare the whole bit set to decide
that two objects are the same. compareTo() does look at the first 64 bits
first, to avoid comparing the rest in case the most significant bits already
show a difference.

But coming to your point, should Hector provide that kind of support or do
you feel that the problem you have is specific to your application?

I feel like a UUID is, as it says, a Unique Identifier, and creating a
sort-of UUID based on a previous timestamp while disregarding the least
significant bits is not the right support for Hector to expose.

Thoughts?

On Wed, Jan 5, 2011 at 6:30 PM, Roshan Dawrani roshandawr...@gmail.com wrote:

 [snip]




-- 
Patricio.-


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio,

Some thoughts inline.

2011/1/6 Patricio Echagüe patric...@gmail.com

 Roshan, the first 64 bits do contain the version. The method
 UUID.timestamp() indeed takes it out before returning. You are right on
 that point. I based my comment on the UUID spec.


I know the most significant 64 bits contain the version, but timestamp()
doesn't, and hence it is OK to use it for chronological ordering. Anyway, we
agree on it now, so this point is settled.



 What I am not convinced of is that the framework should provide support to
 create an almost identical UUID where only the timestamp is the same
 (between the two UUIDs).


Well, I didn't really ask for the framework to provide me such an almost
identical UUID. What I raised was that since Hector is computing UTC time in
100-nanosecond units as

utcTime = msec * 10000 + 0x01B21DD213814000L
(NUM_100NS_INTERVALS_SINCE_UUID_EPOCH), it should, at the minimum, give a
utility function to do the opposite,

msec = (utcTime - 0x01B21DD213814000L) / 10000, so that if someone has to
create an almost identical UUID, where the timestamp is the same, as I
needed, *he shouldn't need to deal with such magic numbers that are linked
to Hector's guts.*

So, I don't mind creating the UUID myself, but I don't want to do, to and
fro, the magic calculations that should be done inside Hector, as they are
an internal design detail of Hector.
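
Spelled out, the two directions look like this (a sketch; the hex constant
is the standard offset between the UUID epoch, 1582-10-15, and the Unix
epoch, in 100ns units, so one millisecond is 10,000 units - the class and
method names are made up):

    public class UuidEpoch {
        // Offset between the UUID epoch (1582-10-15) and the Unix epoch
        // (1970-01-01), in 100ns units.
        static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

        // Millis since Unix epoch -> 100ns units since UUID epoch
        // (the formula quoted above).
        static long toUuidTime(long msec) {
            return msec * 10000 + UUID_EPOCH_OFFSET;
        }

        // The inverse being asked for: 100ns units -> Unix-epoch millis.
        static long toUnixMillis(long utcTime) {
            return (utcTime - UUID_EPOCH_OFFSET) / 10000;
        }

        public static void main(String[] args) {
            long now = System.currentTimeMillis();
            System.out.println(toUnixMillis(toUuidTime(now)) == now); // true
        }
    }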



 UUID.equals() and UUID.compareTo() do compare the whole bit set to decide
 that two objects are the same. compareTo() does look at the first 64 bits
 first, to avoid comparing the rest in case the most significant bits
 already show a difference.


I know it may need to look at all 128 bits eventually - but it first looks
at the first 64 bits (the timestamp) and then the next 64. That's why I
qualified it with "for my use case". It works for me because the data I am
filtering is already within a particular user's data-set - and the
possibility of a user having 2 data points at the same nanosecond value (so
that the clockseq+node bits come into the picture) is functionally nil.



 But coming to your point, should Hector provide that kind of support or do
 you feel that the problem you have is specific to your application?


As covered above, half of my solution should go inside the Hector API, I
feel. The other half - re-creating the same-timestamp UUID and comparing
with it - is specific to my application.



 I feel like a UUID is, as it says, a Unique Identifier, and creating a
 sort-of UUID based on a previous timestamp while disregarding the least
 significant bits is not the right support for Hector to expose.


The support Hector should expose is to keep its magic calculations inside
it, in both directions - to and fro.

Does it make any sense?

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani


Riptano Cassandra trainings in Baltimore and Santa Clara

2011-01-05 Thread Jonathan Ellis
Riptano has two Apache Cassandra training days coming up: Baltimore on
Jan 19 and Santa Clara on Feb 4.

The Baltimore training will be taught by Jake Luciani, author of
Lucandra/Solandra.  The Santa Clara training will be taught by Ben
Coverston, Riptano's director of operations.

These are both full-day, hands-on events covering application design
and operations with the new features in Cassandra 0.7.  For more
details, see http://www.eventbrite.com/org/474011012.

See you there!

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com