Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I was able to make the node join the ring, but I'm confused.
What I did first, when adding the node, was leave it out of its own seeds
list. AFAIK this is how it's supposed to be. It was able to stream all its
data from the other nodes, but then it stayed in the bootstrapping state.
So what I did (and I don't know why it works) was add this node to the seeds
list in its own storage-conf.xml file and restart the server, and then I
finally saw it in the ring...
If I had added the node to its own seeds list when first joining it, it
would not join the ring, but doing it in two phases did work.
So it's either my misunderstanding or a bug...

On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:

> The new node does not see itself as part of the ring, it sees all others
> but itself, so from that perspective the view is consistent.
> The only problem is that the node never finishes bootstrapping. It stays in
> this state for hours (it's been 20 hours now...)
>
>
> $ bin/nodetool -p 9004 -h localhost streams
>> Mode: Bootstrapping
>> Not sending any streams.
>> Not receiving any streams.
>
>
> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:
>
>> Does the new node have itself in the list of seeds per chance? This
>> could cause some issues if so.
>>
>> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
>> > I'm still at a loss. I haven't been able to resolve this. I tried
>> > adding another node at a different location on the ring but this node
>> > too remains stuck in the bootstrapping state for many hours without
>> > any of the other nodes being busy with anti-compaction or anything
>> > else. I don't know what's keeping it from finishing the bootstrap: no
>> > CPU, no I/O, files were already streamed, so what is it waiting for?
>> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
>> > be anything addressing a similar issue so I figured there was no point
>> > in upgrading. But let me know if you think there is.
>> > Or any other advice...
>> >
>> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
>> >> Thanks Jake, but unfortunately the streams directory is empty so I
>> don't think that any of the nodes is anti-compacting data right now or had
>> been in the past 5 hours. It seems that all the data was already transferred
>> to the joining host but the joining node, after having received the data
>> would still remain in bootstrapping mode and not join the cluster. I'm not
>> sure that *all* data was transferred (perhaps other nodes need to transfer
>> more data) but nothing is actually happening so I assume all has been moved.
>> >> Perhaps it's a configuration error on my part. Should I use
>> AutoBootstrap=true? Anything else I should look out for in the
>> configuration file or something else?
>> >>
>> >>
>> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani  wrote:
>> >>
>> >> In 0.6, locate the node doing anti-compaction and look in the "streams"
>> subdirectory in the keyspace data dir to monitor the anti-compaction
>> progress (it puts new SSTables for bootstrapping node in there)
>> >>
>> >>
>> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
>> >>
>> >>
>> >> Running nodetool decommission didn't help. Actually the node refused to
>> decommission itself (b/c it wasn't part of the ring). So I simply stopped
>> the process, deleted all the data directories and started it again. It
>> worked in the sense of the node bootstrapped again but as before, after it
>> had finished moving the data nothing happened for a long time (I'm still
>> waiting, but nothing seems to be happening).
>> >>
>> >>
>> >>
>> >>
>> >> Any hints on how to analyze a "stuck" bootstrapping node?? Thanks
>> >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>> >> Thanks Shimi, so indeed anticompaction was run on one of the other
>> nodes from the same DC but to my understanding it has already ended. A few
>> hours ago...
>> >>
>> >>
>> >>
>> >> I see plenty of log messages such as [1], which ended a couple of hours ago,
>> and I've seen the new node streaming and accepting the data from the node
>> which performed the anticompaction and so far it was normal so it seemed
>> that data is at its right place. But now the new node seems sort of stuck.
>> None of the other nodes is anticompacting right now or had been
>> anticompacting since then.
>> >>
>> >>
>> >>
>> >>
>> >> The new node's CPU is close to zero, its iostats are almost zero so I
>> can't find another bottleneck that would keep it hanging.
>> >> On the IRC someone suggested I'd maybe retry to join this node,
>> e.g. decommission and rejoin it again. I'll try it now...
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721
>> CompactionManager.java (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>> >>
>> >>
>> >>
>> >>
>> >>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683
>> CompactionManager.java 

Re: Cassandra LongType data insertion problem

2011-01-04 Thread Tyler Hobbs
Oops, I made one typo there. It should be:

"my_long = my_long >> 8;"

That is, shift by a byte, not a bit.
- Tyler
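
For reference, a minimal round-trip sketch of the corrected encoding (in Java
rather than the thread's C++; the class and method names are illustrative). It
assumes the target comparator is LongType, which reads the name as 8 bytes with
the most significant byte first:

import java.nio.ByteBuffer;

public class LongColumnName {
    // Pack a 64-bit value into the 8 big-endian bytes a LongType name expects.
    static byte[] encode(long value) {
        byte[] bytes = new byte[8];
        for (int i = 7; i >= 0; i--) {
            bytes[i] = (byte) (value & 0xff); // least-significant byte lands last
            value = value >> 8;               // shift by a byte, per the correction
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] name = encode(12345678L);
        // ByteBuffer reads big-endian by default, so the value round-trips.
        System.out.println(ByteBuffer.wrap(name).getLong()); // prints 12345678
    }
}

Note that the bytes here are written most-significant first; if a loop fills
the low byte first instead, the array must be reversed before sending it.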

On Tue, Jan 4, 2011 at 10:50 PM, Tyler Hobbs  wrote:

> Here's an example:
>
> int64_t my_long = 12345678;
> char chars[8];
> for(int i = 0; i < 8; ++i) {
> chars[i] = my_long & 0xff;
> my_long = my_long >> 1;
> }
>
> std::string str_long(chars, 8);
>
> Column c1;
> c1.name = str_long;
> // etc ...
>
> Basically, Thrift expects a string which is a big-endian binary
> representation of a long. When you create the std::string, you have to
> specify the length of the char[] so that it doesn't terminate the string on
> a 0x00 byte.
>
> The approach is similar for integers and UUIDs.
> - Tyler
>
>
> On Tue, Jan 4, 2011 at 4:32 PM, Jaydeep Chovatia <
> jaydeep.chova...@openwave.com> wrote:
>
>>  Hi,
>>
>>
>>
>> I have configured Cassandra Column Family (standard CF) of LongType. If I
>> try to insert data (using batch_mutate) in this Column Family then it
>> shows me the following error: "A long is exactly 8 bytes". I have tried
>> assigning a column name of 8 bytes, 7 bytes, etc. but it shows the same error.
>>
>>
>>
>> Please find my sample program details:
>>
>> *Platform*: Linux
>>
>> *Language*: C++, Cassandra Thrift interface
>>
>>
>>
>> Column c1;
>>
>> c1.name = "12345678";
>>
>> c1.value = SString(len).AsPtr();
>>
>> c1.timestamp = curTime;
>>
>> columns.push_back(c1);
>>
>>
>>
>> Any help on this would be appreciated.
>>
>>
>>
>> Thank you,
>>
>> Jaydeep
>>
>
>


Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
The new node does not see itself as part of the ring, it sees all others but
itself, so from that perspective the view is consistent.
The only problem is that the node never finishes bootstrapping. It stays in
this state for hours (It's been 20 hours now...)

$ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:

> Does the new node have itself in the list of seeds per chance? This
> could cause some issues if so.
>
> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
> > I'm still at a loss. I haven't been able to resolve this. I tried
> > adding another node at a different location on the ring but this node
> > too remains stuck in the bootstrapping state for many hours without
> > any of the other nodes being busy with anti-compaction or anything
> > else. I don't know what's keeping it from finishing the bootstrap: no
> > CPU, no I/O, files were already streamed, so what is it waiting for?
> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
> > be anything addressing a similar issue so I figured there was no point
> > in upgrading. But let me know if you think there is.
> > Or any other advice...
> >
> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
> >> Thanks Jake, but unfortunately the streams directory is empty so I don't
> think that any of the nodes is anti-compacting data right now or had been in
> the past 5 hours. It seems that all the data was already transferred to the
> joining host but the joining node, after having received the data would
> still remain in bootstrapping mode and not join the cluster. I'm not sure
> that *all* data was transferred (perhaps other nodes need to transfer more
> data) but nothing is actually happening so I assume all has been moved.
> >> Perhaps it's a configuration error on my part. Should I use
> AutoBootstrap=true? Anything else I should look out for in the
> configuration file or something else?
> >>
> >>
> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani  wrote:
> >>
> >> In 0.6, locate the node doing anti-compaction and look in the "streams"
> subdirectory in the keyspace data dir to monitor the anti-compaction
> progress (it puts new SSTables for bootstrapping node in there)
> >>
> >>
> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
> >>
> >>
> >> Running nodetool decommission didn't help. Actually the node refused to
> decommission itself (b/c it wasn't part of the ring). So I simply stopped
> the process, deleted all the data directories and started it again. It
> worked in the sense of the node bootstrapped again but as before, after it
> had finished moving the data nothing happened for a long time (I'm still
> waiting, but nothing seems to be happening).
> >>
> >>
> >>
> >>
> >> Any hints on how to analyze a "stuck" bootstrapping node?? Thanks
> >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
> >> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
> from the same DC but to my understanding it has already ended. A few hours
> ago...
> >>
> >>
> >>
> >> I see plenty of log messages such as [1], which ended a couple of hours ago,
> and I've seen the new node streaming and accepting the data from the node
> which performed the anticompaction and so far it was normal so it seemed
> that data is at its right place. But now the new node seems sort of stuck.
> None of the other nodes is anticompacting right now or had been
> anticompacting since then.
> >>
> >>
> >>
> >>
> >> The new node's CPU is close to zero, its iostats are almost zero so I
> can't find another bottleneck that would keep it hanging.
> >> On the IRC someone suggested I'd maybe retry to join this node,
> e.g. decommission and rejoin it again. I'll try it now...
> >>
> >>
> >>
> >>
> >>
> >>
> >> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721
> CompactionManager.java (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
> >>
> >>
> >>
> >>
> >>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
> >>
> >>
> >>
> >>
> >>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
> >>
> >>
> >

Re: Cassandra LongType data insertion problem

2011-01-04 Thread Tyler Hobbs
Here's an example:

int64_t my_long = 12345678;
char chars[8];
for(int i = 0; i < 8; ++i) {
chars[i] = my_long & 0xff;
my_long = my_long >> 1;
}

std::string str_long(chars, 8);

Column c1;
c1.name = str_long;
// etc ...

Basically, Thrift expects a string which is a big-endian binary
representation of a long. When you create the std::string, you have to
specify the length of the char[] so that it doesn't terminate the string on
a 0x00 byte.

The approach is similar for integers and UUIDs.
- Tyler

On Tue, Jan 4, 2011 at 4:32 PM, Jaydeep Chovatia <
jaydeep.chova...@openwave.com> wrote:

>  Hi,
>
>
>
> I have configured Cassandra Column Family (standard CF) of LongType. If I
> try to insert data (using batch_mutate) in this Column Family then it
> shows me the following error: "A long is exactly 8 bytes". I have tried
> assigning a column name of 8 bytes, 7 bytes, etc. but it shows the same error.
>
>
>
> Please find my sample program details:
>
> *Platform*: Linux
>
> *Language*: C++, Cassandra Thrift interface
>
>
>
> Column c1;
>
> c1.name = "12345678";
>
> c1.value = SString(len).AsPtr();
>
> c1.timestamp = curTime;
>
> columns.push_back(c1);
>
>
>
> Any help on this would be appreciated.
>
>
>
> Thank you,
>
> Jaydeep
>


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Roshan Dawrani
Ok, found the solution - finally! - by applying the opposite of what
createTime() does in TimeUUIDUtils. Ideally I would have preferred for this
solution to come from the Hector API, so I didn't have to be tied to the
private createTime() implementation.


import java.util.UUID;
import me.prettyprint.cassandra.utils.TimeUUIDUtils;

public class TryHector {
    public static void main(String[] args) throws Exception {
        final long NUM_100NS_INTERVALS_SINCE_UUID_EPOCH = 0x01b21dd213814000L;

        UUID u1 = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
        final long t1 = u1.timestamp();

        // 100 ns units since the UUID epoch -> milliseconds since the Unix epoch
        long tmp = (t1 - NUM_100NS_INTERVALS_SINCE_UUID_EPOCH) / 10000;

        UUID u2 = TimeUUIDUtils.getTimeUUID(tmp);
        long t2 = u2.timestamp();

        System.out.println(u2.equals(u1));
        System.out.println(t2 == t1);
    }
}
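
For context, the arithmetic above follows from the version-1 UUID layout: the
timestamp counts 100-nanosecond intervals since 1582-10-15, so Unix milliseconds
are obtained by subtracting the epoch offset and dividing by 10,000. A minimal
sketch with plain java.util.UUID and no Hector (class and method names are
illustrative):

import java.util.UUID;

public class UuidTimeMath {
    // 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    static final long UUID_EPOCH_OFFSET = 0x01b21dd213814000L;

    // Version-1 UUID timestamp (100 ns units) -> Unix milliseconds.
    static long toUnixMillis(UUID uuid) {
        return (uuid.timestamp() - UUID_EPOCH_OFFSET) / 10000;
    }

    // Unix milliseconds -> the 100 ns tick count a matching version-1 UUID carries.
    static long toUuidTicks(long unixMillis) {
        return unixMillis * 10000 + UUID_EPOCH_OFFSET;
    }
}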
 


On Wed, Jan 5, 2011 at 8:15 AM, Roshan Dawrani wrote:

> If I use *com.eaio.uuid.UUID* directly, then I am able to do what I need
> (attached a Java program for the same), but unfortunately I need to deal
> with *java.util.UUID *in my application and I don't have its equivalent
> com.eaio.uuid.UUID at the point where I need the timestamp value.
>
> Any suggestion on how I can achieve the equivalent using Hector library's
> TimeUUIDUtils?
>
>
> On Wed, Jan 5, 2011 at 7:21 AM, Roshan Dawrani wrote:
>
>> Hi Victor / Patricio,
>>
>> I have been using Hector library's TimeUUIDUtils. I also just looked at
>> TimeUUIDUtilsTest also but didn't find anything similar being tested there.
>>
>> Here is what I am trying and it's not working - I am creating a Time UUID,
>> extracting its timestamp value and with that I create another Time UUID and
>> I am expecting both time UUIDs to have the same timestamp() value - am I
>> doing / expecting something wrong here?:
>>
>> ===
>> import java.util.UUID;
>> import me.prettyprint.cassandra.utils.TimeUUIDUtils;
>>
>> public class TryHector {
>> public static void main(String[] args) throws Exception {
>> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>> long timestamp1 = someUUID.timestamp();
>>
>> UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
>> long timestamp2 = otherUUID.timestamp();
>>
>> System.out.println(timestamp1);
>> System.out.println(timestamp2);
>> }
>> }
>> ===
>>
>> I have to create the timestamp() equivalent of my time UUIDs so I can send
>> it to my UI client, for which it will be simpler to compare "long" timestamp
>> than comparing UUIDs. Then for the "long" timestamp chosen by the client, I
>> need to re-create the equivalent time UUID and go and filter the data from
>> Cassandra database.
>>
>>
>> --
>> Roshan
>> Blog: http://roshandawrani.wordpress.com/
>> Twitter: @roshandawrani 
>> Skype: roshandawrani
>>
>> On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon <
>> victor.kabde...@gmail.com> wrote:
>>
>>> Hi Roshan,
>>>
>>> Sorry I misunderstood your problem. It is weird that it doesn't work, it
>>> works for me...
>>> As Patricio pointed out use hector "standard" way of creating TimeUUID
>>> and tell us if it still doesn't work.
>>> Maybe you can paste here some of the code you use to query your columns
>>> too.
>>>
>>> Victor K.
>>> http://www.voxnucleus.fr
>>>
>>> 2011/1/4 Patricio Echagüe 
>>>
>>> In Hector framework, take a look at TimeUUIDUtils.java

 You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
 TimeUUIDUtils.getTimeUUID(ClockResolution clock)

 and later on, TimeUUIDUtils.getTimeFromUUID(..) or just
 UUID.timestamp();

 There are some example in TimeUUIDUtilsTest.java

 Let me know if it helps.




 On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani <
 roshandawr...@gmail.com> wrote:

> Hello Victor,
>
> It is actually not that I need the 2 UUIDs to be exactly the same - they
> need to be the same timestamp-wise.
>
> So, what I need is to extract the timestamp portion from a time UUID
> (say, U1) and then later in the cycle, use the same long timestamp value 
> to
> re-create a UUID (say, U2) that is equivalent of the previous one in terms
> of its timestamp portion - i.e., I should be able to give this U2 and 
> filter
> the data from a column family - and it should be same as if I had used the
> original UUID U1.
>
> Does it make any more sense than before? Any way I can do that?
>
> rgds,
> Roshan
>
>
> On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon <
> victor.kabde...@gmail.com> wrote:
>
>> Hello Roshan,
>>
>> Well it is normal to do not be able to get the exact same UUID from a
>> timest

Re: anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Jake Luciani
Some relevant information here:
https://www.cloudkick.com/blog/2010/mar/02/4_months_with_cassandra/

On Tue, Jan 4, 2011 at 10:09 PM, Dave Viner  wrote:

> Hi Peter,
>
> Thanks.  These are great ideas.  One comment tho.  I'm actually not as
> worried about the "logging into the system" performance and more
> speculating/imagining the querying out of the system.
>
> Most traditional data warehouses have a cube or a star schema or something
> similar.  I'm trying to imagine how one might use Cassandra in situations
> where that sort of design has historically been applied.
>
> But, I want to make sure I understand your suggestion A.
>
> Is it something like this?
>
> "a Column Family with the row key being the Unix time divided by 60x60 and
> a column key of... pretty much anything unique"
> LogCF[hour-day-in-epoch-seconds][timeuuid] = 1
> where 'hour-day-in-epoch-seconds' is something like the first second of the
> given hour of the day, so 01/04/2011 19:00:00 (in epoch
> seconds: 1294167600); 'timeuuid' is a TimeUUID from cassandra, and '1' is
> the value of the entry.
>
> Then "look at the current row every hour to actually compile the numbers,
> and store the count in the same Column Family"
> LogCF[hour-day-in-epoch-seconds][total] = x
> where 'x' is the sum of the number of timeuuid columns in the row?
>
>
> Is that what you're envisioning in Option A?
>
> Thanks
> Dave Viner
>
>
>
> On Tue, Jan 4, 2011 at 6:38 PM, Peter Harrison wrote:
>
>> Okay, here are two ways to handle this; both are quite different from each
>> other.
>>
>>
>> A)
>>
>> This approach does not depend on counters. You simply have a Column Family
>> with the row key being the Unix time divided by 60x60 and a column key of...
>> pretty much anything unique. Then have another process look at the current
>> row every hour to actually compile the numbers, and store the count in the
>> same Column Family. This will solve the first and third use cases, as it is
>> just a matter of looking at the right rows. The second case will require a
>> similar index, but one which includes a country code to be appended to the
>> row key.
>>
>> The downside here is that you are storing lots of data on individual
>> requests and retaining it. If you don't want the detailed data you might add
>> a second process to purge the detail every hour.
>>
>> B)
>>
>> There is a "counter" feature added to the latest versions of Cassandra. I
>> have not used them, but they should be able to be used to achieve the same
>> effect without a second process cleaning up every hour. Also means it is
>> more of a real time system so you can see how many requests in the hour you
>> are currently in.
>>
>>
>>
>> Basically you have to design your approach based on the query you will be
>> doing. Don't get too hung up on traditional data structures and queries as
>> they have little relationship to a Cassandra approach.
>>
>>
>>
>> On Wed, Jan 5, 2011 at 2:34 PM, Dave Viner  wrote:
>>
>>> Does anyone use Cassandra to power an analytics or data warehouse
>>> implementation?
>>>
>>> As a concrete example, one could imagine Cassandra storing data for
>>> something that reports on page-views on a website.  The basic notions might
>>> be simple (url as row-key and columns as timeuuids of viewers).  But, how
>>> would one store things like ip-geolocation to set of pages viewed?  Or
>>> hour-of-day to pages viewed?
>>>
>>> Also, how would one do a query like
>>> - "tell me how many page views occurred between 12/01/2010 and
>>> 12/31/2010"?
>>> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
>>> from the US"?
>>> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
>>> from the US in the 9th hour of the day (in gmt)"?
>>>
>>> Time slicing and dimension slicing seems like it might be very
>>> challenging (especially since the windows of time would not be known in
>>> advance).
>>>
>>> Thanks
>>> Dave Viner
>>>
>>
>>
>


Re: anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Dave Viner
Hi Peter,

Thanks.  These are great ideas.  One comment, though: I'm actually not as
worried about the performance of logging data into the system and more
speculating/imagining about querying data out of the system.

Most traditional data warehouses have a cube or a star schema or something
similar.  I'm trying to imagine how one might use Cassandra in situations
where that sort of design has historically been applied.

But, I want to make sure I understand your suggestion A.

Is it something like this?

"a Column Family with the row key being the Unix time divided by 60x60 and a
column key of... pretty much anything unique"
LogCF[hour-day-in-epoch-seconds][timeuuid] = 1
where 'hour-day-in-epoch-seconds' is something like the first second of the
given hour of the day, so 01/04/2011 19:00:00 (in epoch
seconds: 1294167600); 'timeuuid' is a TimeUUID from cassandra, and '1' is
the value of the entry.

Then "look at the current row every hour to actually compile the numbers,
and store the count in the same Column Family"
LogCF[hour-day-in-epoch-seconds][total] = x
where 'x' is the sum of the number of timeuuid columns in the row?


Is that what you're envisioning in Option A?

Thanks
Dave Viner



On Tue, Jan 4, 2011 at 6:38 PM, Peter Harrison  wrote:

> Okay, here are two ways to handle this; both are quite different from each
> other.
>
>
> A)
>
> This approach does not depend on counters. You simply have a Column Family
> with the row key being the Unix time divided by 60x60 and a column key of...
> pretty much anything unique. Then have another process look at the current
> row every hour to actually compile the numbers, and store the count in the
> same Column Family. This will solve the first and third use cases, as it is
> just a matter of looking at the right rows. The second case will require a
> similar index, but one which includes a country code to be appended to the
> row key.
>
> The downside here is that you are storing lots of data on individual
> requests and retaining it. If you don't want the detailed data you might add
> a second process to purge the detail every hour.
>
> B)
>
> There is a "counter" feature added to the latest versions of Cassandra. I
> have not used them, but they should be able to be used to achieve the same
> effect without a second process cleaning up every hour. Also means it is
> more of a real time system so you can see how many requests in the hour you
> are currently in.
>
>
>
> Basically you have to design your approach based on the query you will be
> doing. Don't get too hung up on traditional data structures and queries as
> they have little relationship to a Cassandra approach.
>
>
>
> On Wed, Jan 5, 2011 at 2:34 PM, Dave Viner  wrote:
>
>> Does anyone use Cassandra to power an analytics or data warehouse
>> implementation?
>>
>> As a concrete example, one could imagine Cassandra storing data for
>> something that reports on page-views on a website.  The basic notions might
>> be simple (url as row-key and columns as timeuuids of viewers).  But, how
>> would one store things like ip-geolocation to set of pages viewed?  Or
>> hour-of-day to pages viewed?
>>
>> Also, how would one do a query like
>> - "tell me how many page views occurred between 12/01/2010 and
>> 12/31/2010"?
>> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
>> from the US"?
>> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
>> from the US in the 9th hour of the day (in gmt)"?
>>
>> Time slicing and dimension slicing seems like it might be very challenging
>> (especially since the windows of time would not be known in advance).
>>
>> Thanks
>> Dave Viner
>>
>
>


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Roshan Dawrani
If I use *com.eaio.uuid.UUID* directly, then I am able to do what I need
(attached a Java program for the same), but unfortunately I need to deal
with *java.util.UUID *in my application and I don't have its equivalent
com.eaio.uuid.UUID at the point where I need the timestamp value.

Any suggestion on how I can achieve the equivalent using Hector library's
TimeUUIDUtils?

On Wed, Jan 5, 2011 at 7:21 AM, Roshan Dawrani wrote:

> Hi Victor / Patricio,
>
> I have been using Hector library's TimeUUIDUtils. I also just looked at
> TimeUUIDUtilsTest also but didn't find anything similar being tested there.
>
> Here is what I am trying and it's not working - I am creating a Time UUID,
> extracting its timestamp value and with that I create another Time UUID and
> I am expecting both time UUIDs to have the same timestamp() value - am I
> doing / expecting something wrong here?:
>
> ===
> import java.util.UUID;
> import me.prettyprint.cassandra.utils.TimeUUIDUtils;
>
> public class TryHector {
> public static void main(String[] args) throws Exception {
> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
> long timestamp1 = someUUID.timestamp();
>
> UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
> long timestamp2 = otherUUID.timestamp();
>
> System.out.println(timestamp1);
> System.out.println(timestamp2);
> }
> }
> ===
>
> I have to create the timestamp() equivalent of my time UUIDs so I can send
> it to my UI client, for which it will be simpler to compare "long" timestamp
> than comparing UUIDs. Then for the "long" timestamp chosen by the client, I
> need to re-create the equivalent time UUID and go and filter the data from
> Cassandra database.
>
>
> --
> Roshan
> Blog: http://roshandawrani.wordpress.com/
> Twitter: @roshandawrani 
> Skype: roshandawrani
>
> On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon  > wrote:
>
>> Hi Roshan,
>>
>> Sorry I misunderstood your problem. It is weird that it doesn't work, it
>> works for me...
>> As Patricio pointed out use hector "standard" way of creating TimeUUID and
>> tell us if it still doesn't work.
>> Maybe you can paste here some of the code you use to query your columns
>> too.
>>
>> Victor K.
>> http://www.voxnucleus.fr
>>
>> 2011/1/4 Patricio Echagüe 
>>
>> In Hector framework, take a look at TimeUUIDUtils.java
>>>
>>> You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
>>> TimeUUIDUtils.getTimeUUID(ClockResolution clock)
>>>
>>> and later on, TimeUUIDUtils.getTimeFromUUID(..) or just UUID.timestamp();
>>>
>>> There are some example in TimeUUIDUtilsTest.java
>>>
>>> Let me know if it helps.
>>>
>>>
>>>
>>>
>>> On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani >> > wrote:
>>>
 Hello Victor,

 It is actually not that I need the 2 UUIDs to be exactly the same - they
 need to be the same timestamp-wise.

 So, what I need is to extract the timestamp portion from a time UUID
 (say, U1) and then later in the cycle, use the same long timestamp value to
 re-create a UUID (say, U2) that is equivalent of the previous one in terms
 of its timestamp portion - i.e., I should be able to give this U2 and 
 filter
 the data from a column family - and it should be same as if I had used the
 original UUID U1.

 Does it make any more sense than before? Any way I can do that?

 rgds,
 Roshan


 On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon <
 victor.kabde...@gmail.com> wrote:

> Hello Roshan,
>
> Well, it is normal not to be able to get the exact same UUID from a
> timestamp; that is its purpose.
> When you create a UUID you have in fact two pieces of information: a random
> 64-bit number and a 64-bit timestamp. You put those together and you have
> your UUID.
> So unless you save your random number, two UUIDs for the same milli (or
> micro) second are different.
>
> Best regards,
> Victor K.
> http://www.voxnucleus.fr
>
> 2011/1/4 Roshan Dawrani 
>
> Hi,
>> I am having a little difficulty converting a time UUID to its
>> timestamp equivalent and back. Can someone please help?
>>
>> Here is what I am trying. Is it not the right way to do it?
>>
>> ===
>> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>>
>> long time = someUUID.timestamp(); /* convert from UUID to a
>> long timestamp */
>> UUID otherUUID = TimeUUIDUtils.getTimeUUID(time); /* do the
>> reverse and get back the UUID from timestamp */
>>
>> System.out.println(someUUID); /* someUUID and otherUUID should
>> be same, but are different */
>> System.out.println(otherUUID);
>> 

Re: anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Peter Harrison
Okay, here are two ways to handle this; both are quite different from each
other.


A)

This approach does not depend on counters. You simply have a Column Family
with the row key being the Unix time divided by 60x60 and a column key of...
pretty much anything unique. Then have another process look at the current
row every hour to actually compile the numbers, and store the count in the
same Column Family. This will solve the first and third use cases, as it is
just a matter of looking at the right rows. The second case will require a
similar index, but one which includes a country code to be appended to the
row key.

The downside here is that you are storing lots of data on individual
requests and retaining it. If you don't want the detailed data you might add
a second process to purge the detail every hour.

B)

There is a "counter" feature added to the latest versions of Cassandra. I
have not used them, but they should be able to be used to achieve the same
effect without a second process cleaning up every hour. Also means it is
more of a real time system so you can see how many requests in the hour you
are currently in.



Basically you have to design your approach based on the query you will be
doing. Don't get too hung up on traditional data structures and queries as
they have little relationship to a Cassandra approach.
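
A minimal sketch of the hour-bucket row key from option A (plain Java, no
client library; the names are illustrative rather than an actual schema):

import java.util.concurrent.TimeUnit;

public class HourBucket {
    // Row key for option A: the hour a timestamp falls into, as epoch seconds.
    static long hourRowKey(long epochMillis) {
        long seconds = TimeUnit.MILLISECONDS.toSeconds(epochMillis);
        return seconds - (seconds % 3600);
    }

    public static void main(String[] args) {
        // One row per hour; each request adds a column keyed by something unique
        // (a TimeUUID, say) with value "1". An hourly job counts the columns and
        // writes the total back into the same row under a "total" column.
        System.out.println("row key = " + hourRowKey(System.currentTimeMillis()));
    }
}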


On Wed, Jan 5, 2011 at 2:34 PM, Dave Viner  wrote:

> Does anyone use Cassandra to power an analytics or data warehouse
> implementation?
>
> As a concrete example, one could imagine Cassandra storing data for
> something that reports on page-views on a website.  The basic notions might
> be simple (url as row-key and columns as timeuuids of viewers).  But, how
> would one store things like ip-geolocation to set of pages viewed?  Or
> hour-of-day to pages viewed?
>
> Also, how would one do a query like
> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010"?
> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
> from the US"?
> - "tell me how many page views occurred between 12/01/2010 and 12/31/2010
> from the US in the 9th hour of the day (in gmt)"?
>
> Time slicing and dimension slicing seems like it might be very challenging
> (especially since the windows of time would not be known in advance).
>
> Thanks
> Dave Viner
>


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Roshan Dawrani
Hi Victor / Patricio,

I have been using the Hector library's TimeUUIDUtils. I also just looked at
TimeUUIDUtilsTest but didn't find anything similar being tested there.

Here is what I am trying and it's not working - I am creating a Time UUID,
extracting its timestamp value and with that I create another Time UUID and
I am expecting both time UUIDs to have the same timestamp() value - am I
doing / expecting something wrong here?:

===
import java.util.UUID;
import me.prettyprint.cassandra.utils.TimeUUIDUtils;

public class TryHector {
    public static void main(String[] args) throws Exception {
        UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
        long timestamp1 = someUUID.timestamp();

        UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
        long timestamp2 = otherUUID.timestamp();

        System.out.println(timestamp1);
        System.out.println(timestamp2);
    }
}
===

I have to create the timestamp() equivalent of my time UUIDs so I can send
it to my UI client, for which it will be simpler to compare "long" timestamps
than UUIDs. Then, for the "long" timestamp chosen by the client, I need to
re-create the equivalent time UUID and filter the data from the Cassandra
database.

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani 
Skype: roshandawrani

On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon
wrote:

> Hi Roshan,
>
> Sorry I misunderstood your problem. It is weird that it doesn't work, it
> works for me...
> As Patricio pointed out use hector "standard" way of creating TimeUUID and
> tell us if it still doesn't work.
> Maybe you can paste here some of the code you use to query your columns
> too.
>
> Victor K.
> http://www.voxnucleus.fr
>
> 2011/1/4 Patricio Echagüe 
>
> In Hector framework, take a look at TimeUUIDUtils.java
>>
>> You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
>> TimeUUIDUtils.getTimeUUID(ClockResolution clock)
>>
>> and later on, TimeUUIDUtils.getTimeFromUUID(..) or just UUID.timestamp();
>>
>> There are some example in TimeUUIDUtilsTest.java
>>
>> Let me know if it helps.
>>
>>
>>
>>
>> On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani 
>> wrote:
>>
>>> Hello Victor,
>>>
>>> It is actually not that I need the 2 UUIDs to be exactly the same - they need
>>> to be the same timestamp-wise.
>>>
>>> So, what I need is to extract the timestamp portion from a time UUID
>>> (say, U1) and then later in the cycle, use the same long timestamp value to
>>> re-create a UUID (say, U2) that is equivalent of the previous one in terms
>>> of its timestamp portion - i.e., I should be able to give this U2 and filter
>>> the data from a column family - and it should be same as if I had used the
>>> original UUID U1.
>>>
>>> Does it make any more sense than before? Any way I can do that?
>>>
>>> rgds,
>>> Roshan
>>>
>>>
>>> On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon <
>>> victor.kabde...@gmail.com> wrote:
>>>
 Hello Roshan,

 Well, it is normal not to be able to get the exact same UUID from a
 timestamp; that is its purpose.
 When you create a UUID you have in fact two pieces of information: a random
 64-bit number and a 64-bit timestamp. You put those together and you have
 your UUID.
 So unless you save your random number, two UUIDs for the same milli (or
 micro) second are different.

 Best regards,
 Victor K.
 http://www.voxnucleus.fr

 2011/1/4 Roshan Dawrani 

 Hi,
> I am having a little difficulty converting a time UUID to its timestamp
> equivalent and back. Can someone please help?
>
> Here is what I am trying. Is it not the right way to do it?
>
> ===
> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>
> long time = someUUID.timestamp(); /* convert from UUID to a
> long timestamp */
> UUID otherUUID = TimeUUIDUtils.getTimeUUID(time); /* do the
> reverse and get back the UUID from timestamp */
>
> System.out.println(someUUID); /* someUUID and otherUUID should
> be same, but are different */
> System.out.println(otherUUID);
> ===
>
> --
> Roshan
> Blog: http://roshandawrani.wordpress.com/
> Twitter: @roshandawrani 
> Skype: roshandawrani
>
>

>>>
>>>
>>> --
>>> Roshan
>>> Blog: http://roshandawrani.wordpress.com/
>>> Twitter: @roshandawrani 
>>> Skype: roshandawrani
>>>
>>>
>>
>>
>> --
>> Patricio.-
>>
>
>


anyone using Cassandra as an analytics/data warehouse?

2011-01-04 Thread Dave Viner
Does anyone use Cassandra to power an analytics or data warehouse
implementation?

As a concrete example, one could imagine Cassandra storing data for
something that reports on page-views on a website.  The basic notions might
be simple (url as row-key and columns as timeuuids of viewers).  But, how
would one store things like ip-geolocation to set of pages viewed?  Or
hour-of-day to pages viewed?

Also, how would one do a query like
- "tell me how many page views occurred between 12/01/2010 and 12/31/2010"?
- "tell me how many page views occurred between 12/01/2010 and 12/31/2010
from the US"?
- "tell me how many page views occurred between 12/01/2010 and 12/31/2010
from the US in the 9th hour of the day (in gmt)"?

Time slicing and dimension slicing seems like it might be very challenging
(especially since the windows of time would not be known in advance).

Thanks
Dave Viner


Re: Deletion via SliceRange

2011-01-04 Thread Jonathan Ellis
It's not on anyone's short list, that I know of.

https://issues.apache.org/jira/browse/CASSANDRA-494

On Tue, Jan 4, 2011 at 5:18 PM, mike dooley  wrote:
> any idea when Deletion via SliceRanges will be supported?
>
>     [java] Caused by: InvalidRequestException(why:Deletion does not yet 
> support SliceRange predicates.)
>     [java]     at 
> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16477)
>
> thanks,
> -mike



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Bootstrapping taking long

2011-01-04 Thread Nate McCall
Does the new node have itself in the list of seeds, by any chance? This
could cause some issues if so.

On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
> I'm still at a loss. I haven't been able to resolve this. I tried
> adding another node at a different location on the ring but this node
> too remains stuck in the bootstrapping state for many hours without
> any of the other nodes being busy with anti-compaction or anything
> else. I don't know what's keeping it from finishing the bootstrap: no
> CPU, no I/O, files were already streamed, so what is it waiting for?
> I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
> be anything addressing a similar issue so I figured there was no point
> in upgrading. But let me know if you think there is.
> Or any other advice...
>
> On Tuesday, January 4, 2011, Ran Tavory  wrote:
>> Thanks Jake, but unfortunately the streams directory is empty so I don't 
>> think that any of the nodes is anti-compacting data right now or had been in 
>> the past 5 hours. It seems that all the data was already transferred to the 
>> joining host but the joining node, after having received the data would 
>> still remain in bootstrapping mode and not join the cluster. I'm not sure 
>> that *all* data was transferred (perhaps other nodes need to transfer more 
>> data) but nothing is actually happening so I assume all has been moved.
>> Perhaps it's a configuration error on my part. Should I use
>> AutoBootstrap=true? Anything else I should look out for in the
>> configuration file or something else?
>>
>>
>> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani  wrote:
>>
>> In 0.6, locate the node doing anti-compaction and look in the "streams" 
>> subdirectory in the keyspace data dir to monitor the anti-compaction 
>> progress (it puts new SSTables for bootstrapping node in there)
>>
>>
>> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
>>
>>
>> Running nodetool decommission didn't help. Actually the node refused to 
>> decommission itself (b/c it wasn't part of the ring). So I simply stopped 
>> the process, deleted all the data directories and started it again. It 
>> worked in the sense of the node bootstrapped again but as before, after it 
>> had finished moving the data nothing happened for a long time (I'm still 
>> waiting, but nothing seems to be happening).
>>
>>
>>
>>
>> Any hints on how to analyze a "stuck" bootstrapping node?? Thanks
>> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>> Thanks Shimi, so indeed anticompaction was run on one of the other nodes 
>> from the same DC but to my understanding it has already ended. A few hours
>> ago...
>>
>>
>>
>> I see plenty of log messages such as [1], which ended a couple of hours ago, and
>> I've seen the new node streaming and accepting the data from the node which 
>> performed the anticompaction and so far it was normal so it seemed that data 
>> is at its right place. But now the new node seems sort of stuck. None of the 
>> other nodes is anticompacting right now or had been anticompacting since 
>> then.
>>
>>
>>
>>
>> The new node's CPU is close to zero, its iostats are almost zero so I can't
>> find another bottleneck that would keep it hanging.
>> On the IRC someone suggested I'd maybe retry to join this node, 
>> e.g. decommission and rejoin it again. I'll try it now...
>>
>>
>>
>>
>>
>>
>> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>
>>
>>
>>
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java 
>> (line 338) AntiCompacting 
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>>
>>
>>
>>
>> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>>
>>
>>
>>
>>
>> In my experience most of the time it takes for a node to join the cluster is 
>> the anticompaction on the other nodes. The streaming part is very fast.
>> Check the other nodes logs to see if there is a

Deletion via SliceRange

2011-01-04 Thread mike dooley
any idea when Deletion via SliceRanges will be supported?

 [java] Caused by: InvalidRequestException(why:Deletion does not yet 
support SliceRange predicates.)
 [java] at 
org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:16477)

thanks,
-mike

Cassandra LongType data insertion problem

2011-01-04 Thread Jaydeep Chovatia
Hi,

I have configured Cassandra Column Family (standard CF) of LongType. If I try 
to insert data (using batch_mutate) in this Column Family then it shows me the
following error: "A long is exactly 8 bytes". I have tried assigning a column
name of 8 bytes, 7 bytes, etc. but it shows the same error.

Please find my sample program details:
Platform: Linux
Language: C++, Cassandra Thrift interface

Column c1;
c1.name = "12345678";
c1.value = SString(len).AsPtr();
c1.timestamp = curTime;
columns.push_back(c1);

Any help on this would be appreciated.

Thank you,
Jaydeep


Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I'm still at a loss. I haven't been able to resolve this. I tried
adding another node at a different location on the ring but this node
too remains stuck in the bootstrapping state for many hours without
any of the other nodes being busy with anti-compaction or anything
else. I don't know what's keeping it from finishing the bootstrap: no
CPU, no I/O, files were already streamed, so what is it waiting for?
I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
be anything addressing a similar issue so I figured there was no point
in upgrading. But let me know if you think there is.
Or any other advice...

On Tuesday, January 4, 2011, Ran Tavory  wrote:
> Thanks Jake, but unfortunately the streams directory is empty so I don't 
> think that any of the nodes is anti-compacting data right now or had been in 
> the past 5 hours. It seems that all the data was already transferred to the 
> joining host but the joining node, after having received the data would still 
> remain in bootstrapping mode and not join the cluster. I'm not sure that 
> *all* data was transferred (perhaps other nodes need to transfer more data) 
> but nothing is actually happening so I assume all has been moved.
> Perhaps it's a configuration error on my part. Should I use
> AutoBootstrap=true? Anything else I should look out for in the configuration
> file or something else?
>
>
> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani  wrote:
>
> In 0.6, locate the node doing anti-compaction and look in the "streams" 
> subdirectory in the keyspace data dir to monitor the anti-compaction progress 
> (it puts new SSTables for bootstrapping node in there)
>
>
> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
>
>
> Running nodetool decommission didn't help. Actually the node refused to 
> decommission itself (b/c it wasn't part of the ring). So I simply stopped the 
> process, deleted all the data directories and started it again. It worked in 
> the sense of the node bootstrapped again but as before, after it had finished 
> moving the data nothing happened for a long time (I'm still waiting, but 
> nothing seems to be happening).
>
>
>
>
> Any hints on how to analyze a "stuck" bootstrapping node?? Thanks
> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
> Thanks Shimi, so indeed anticompaction was run on one of the other nodes from 
> the same DC but to my understanding it has already ended. A few hours ago...
>
>
>
> I see plenty of log messages such as [1], which ended a couple of hours ago, and
> I've seen the new node streaming and accepting the data from the node which 
> performed the anticompaction and so far it was normal so it seemed that data 
> is at its right place. But now the new node seems sort of stuck. None of the 
> other nodes is anticompacting right now or had been anticompacting since then.
>
>
>
>
> The new node's CPU is close to zero, its iostats are almost zero so I can't
> find another bottleneck that would keep it hanging.
> On the IRC someone suggested I'd maybe retry to join this node, 
> e.g. decommission and rejoin it again. I'll try it now...
>
>
>
>
>
>
> [1] INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java 
> (line 338) AntiCompacting 
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>
>
>
>
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java 
> (line 338) AntiCompacting 
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>
>
>
>
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java 
> (line 338) AntiCompacting 
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>
>
>
>
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java 
> (line 338) AntiCompacting 
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>
>
>
>
>
> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>
>
>
>
>
> In my experience most of the time it takes for a node to join the cluster is 
> the anticompaction on the other nodes. The streaming part is very fast.
> Check the other nodes' logs to see if there is any node doing anticompaction. I
> don't remember how much data I had in the cluster when I needed to add/remove 
> nodes. I do remember that it took a few hours.
>
>
>
>
>
>
> The node will join the ring only when it will finish the bootstrap.
> --
> /Ran
>
>

-- 
/Ran


Re: Insert LongType with ruby

2011-01-04 Thread vicent roca daniel
I don't know.
Looking at the table with the CLI I see these results:

Using app.insert(:Numers, 'device1-cpu', {Time.now => i.to_s }) :

=> (column=5300944406187227576, value=3, timestamp=1294175880417061)
=> (column=5300944406181604704, value=2, timestamp=1294175880415584)
=> (column=5300944406071978530, value=1, timestamp=1294175880413584)


Using app.insert(:Numers, 'device1-cpu', {Time.stamp => i.to_s }) :

=> (column=1294176156967820, value=3, timestamp=1294176156967851)
=> (column=1294176156966904, value=2, timestamp=1294176156966949)
=> (column=1294176156957286, value=1, timestamp=1294176156965795)

Which I think makes more sense, since these column names are timestamps.
I'll keep working on this.
Thanks for your help ryan :)



On Tue, Jan 4, 2011 at 10:18 PM, Ryan King  wrote:

> On Tue, Jan 4, 2011 at 12:50 PM, vicent roca daniel 
> wrote:
> > I'm getting more consistent results using Time.stamp instead of Time
> > From:
> https://github.com/fauna/cassandra/blob/master/lib/cassandra/long.rb
>
> Yeah, you were probably overwriting values then.
>
> -ryan
>


Re: Insert LongType with ruby

2011-01-04 Thread Ryan King
On Tue, Jan 4, 2011 at 12:50 PM, vicent roca daniel  wrote:
> I'm getting more consistent results using Time.stamp instead of Time
> From: https://github.com/fauna/cassandra/blob/master/lib/cassandra/long.rb

Yeah, you were probably overwriting values then.

-ryan


Re: Insert LongType with ruby

2011-01-04 Thread vicent roca daniel
I'm getting more consistent results using Time.stamp instead of Time

From: https://github.com/fauna/cassandra/blob/master/lib/cassandra/long.rb

when NilClass, Time
# Time.stamp is 52 bytes, so we have 12 bytes of entropy left over
int = ((bytes || Time).stamp << 12) + rand(2**12)

I'll keep looking at this.
Thanks! :)
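
As an aside, the packing quoted above amounts to the following (a sketch in
Java purely to illustrate the bit layout; micros stands in for Time.stamp, the
microsecond timestamp):

import java.util.concurrent.ThreadLocalRandom;

public class StampLayout {
    // Microsecond timestamp in the high bits plus 12 random low bits, so two
    // column names generated in the same microsecond still differ.
    static long packedName(long micros) {
        return (micros << 12) | ThreadLocalRandom.current().nextInt(1 << 12);
    }
}

That matches the two sets of results earlier in the thread: with Time.now the
stored names are roughly the stamp shifted left by 12 bits plus entropy (values
around 5.3e18), while with Time.stamp the raw microsecond values (around
1.29e15) are stored.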


On Mon, Jan 3, 2011 at 10:24 PM, vicent roca daniel wrote:

> The problem I think I have is that I'm not storing the correct
> value.
> If I do this (for example):
>
> app.insert(:NumData, 'device1-cpu', { Time.now + 1 minute => 10.to_s })
> app.insert(:NumData, 'device1-cpu', { Time.now + 1 minute => 10.to_s })
> app.insert(:NumData, 'device1-cpu', { Time.now + 1 minute => 10.to_s })
> app.insert(:NumData, 'device1-cpu', { Time.now + 1 minute => 10.to_s })
>
> and I do a get query with :start => the first Time.now and :finish => the second
> Time, I should get two columns, but I'm getting none.
> I suspect that the column name is not a valid Time.
>
> Does that make sense?
> I'm really new, so please understand if I did something crazy :)
>
>
> On Mon, Jan 3, 2011 at 10:17 PM, Ryan King  wrote:
>
>> On Mon, Jan 3, 2011 at 1:15 PM, vicent roca daniel 
>> wrote:
>> > hi,
>> > no I'n not getting any exception.
>>
>> Then what problem are you seeing?
>>
>> -ryan
>>
>> > The value gets inserted withou problem.
>> > If I try to convert to string I get:
>> > Cassandra::Comparable::TypeError: Expected "2011-01-03 22:14:40 +0100"
>> to
>> > cast to a Cassandra::Long (invalid bytecount)
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/long.rb:20:in
>> > `initialize'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/0.6/columns.rb:10:in
>> > `new'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/0.6/columns.rb:10:in
>> > `_standard_insert_mutation'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/cassandra.rb:125:in
>> > `block in insert'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/cassandra.rb:125:in
>> > `each'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/cassandra.rb:125:in
>> > `collect'
>> > from
>> >
>> /Users/armandolalala/.rvm/gems/ruby-1.9.2-p0/gems/cassandra-0.9.0/lib/cassandra/cassandra.rb:125:in
>> > `insert'
>> > from (irb):6
>> > from /Users/armandolalala/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in
>> `'
>> >
>> > On Mon, Jan 3, 2011 at 10:06 PM, Ryan King  wrote:
>> >>
>> >> On Mon, Jan 3, 2011 at 12:56 PM, vicent roca daniel 
>> >> wrote:
>> >> > Hi again!
>> >> > code:
>> >> > require 'rubygems'
>> >> > require 'cassandra'
>> >> > app = Cassandra.new('AOM', servers = "127.0.0.1:9160")
>> >> > app.insert(:NumData, 'device1-cpu', { Time.now => 10.to_s })
>> >>
>> >> I'm going to assume you're getting an exception here? I think you need
>> >> to convert the time to a string.
>> >>
>> >> > storage-conf.xml:
>> >> > [ColumnFamily definition not preserved in the archive]
>> >> > Thanks!!
>> >>
>> >> -ryan
>> >
>> >
>>
>
>


Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
> I don't have a problem with disk space. I have a problem with the data
> size.

[snip]

> Bottom line is that I want to reduce the number of requests that goes to
> disk. Since there is enough data that is no longer valid I can do it by
> reclaiming the space. The only way to do it is by running Major compaction.
> I can wait and let Cassandra do it for me but then the data size will get
> even bigger and the response time will be worse. I can do it manually but I
> prefer it to happen in the background with less impact on the system

Ok - that makes perfect sense then. Sorry for misunderstanding :)

So essentially, for workloads that are teetering on the edge of cache
warmness and is subject to significant overwrites or removals, it may
be beneficial to perform much more aggressive background compaction
even though it might waste lots of CPU, to keep the in-memory working
set down.

There was talk (I think in the compaction redesign ticket) about
potentially improving the use of bloom filters such that obsolete data
in sstables could be eliminated from the read set without
necessitating actual compaction; that might help address cases like
these too.

I don't think there's a pre-existing silver bullet in a current
release; you probably have to live with the need for
greater-than-theoretically-optimal memory requirements to keep the
working set in memory.

-- 
/ Peter Schuller


Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Peter Schuller
> That is correct.  In 0.6, an anticompaction was performed and a temporary
> SSTable was written out to disk, then streamed to the recipient.  The way
> this is now done in 0.7 requires no extra disk space on the source node.

Great. So that should at least mean that running out of disk space
should always be solvable in terms of the cluster by adding new nodes in
between other pre-existing nodes. That is provided that the internal
node issues (allowing compaction to take place so space can actually
be freed) are solved in some way. And assuming the compaction redesign
happens at some point, that should be a minor issue because it should
be easy to avoid the possibility of getting to the point of not
fitting a single (size limited) sstable.

-- 
/ Peter Schuller


Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Thanks Jake, but unfortunately the streams directory is empty so I don't
think that any of the nodes is anti-compacting data right now or had been in
the past 5 hours.
It seems that all the data was already transferred to the joining host but
the joining node, after having received the data would still remain in
bootstrapping mode and not join the cluster. I'm not sure that *all* data
was transferred (perhaps other nodes need to transfer more data) but nothing
is actually happening so I assume all has been moved.
Perhaps it's a configuration error on my part. Should I use
AutoBootstrap=true? Anything else I should look out for in the
configuration file or something else?
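
For reference, a minimal sketch of the relevant 0.6 storage-conf.xml settings on
a joining node (host names are placeholders, not taken from this cluster). A node
that lists itself as a seed will typically not auto-bootstrap, so the joining node
is usually left out of its own seed list:

  <AutoBootstrap>true</AutoBootstrap>
  <Seeds>
      <!-- existing cluster members only -->
      <Seed>existing-node-1.example.com</Seed>
      <Seed>existing-node-2.example.com</Seed>
  </Seeds>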


On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani  wrote:

> In 0.6, locate the node doing anti-compaction and look in the "streams"
> subdirectory in the keyspace data dir to monitor the anti-compaction
> progress (it puts new SSTables for bootstrapping node in there)
>
>
> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
>
>> Running nodetool decommission didn't help. Actually the node refused to
>> decommission itself (b/c it wasn't part of the ring). So I simply stopped
>> the process, deleted all the data directories and started it again. It
>> worked in the sense of the node bootstrapped again but as before, after it
>> had finished moving the data nothing happened for a long time (I'm still
>> waiting, but nothing seems to be happening).
>>
>> Any hints how to analyze a "stuck" bootstrapping node??
>> thanks
>>
>> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>>
>>> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
>>> from the same DC but to my understanding it has already ended. A few hours
>>> ago...
>>> I see plenty of log messages such as [1] which ended a couple of hours ago,
>>> and I've seen the new node streaming and accepting the data from the node
>>> which performed the anticompaction and so far it was normal so it seemed
>>> that data is at its right place. But now the new node seems sort of stuck.
>>> None of the other nodes is anticompacting right now or had been
>>> anticompacting since then.
>>> The new node's CPU is close to zero, its iostats are almost zero so I
>>> can't find another bottleneck that would keep it hanging.
>>>
>>> On the IRC someone suggested I'd maybe retry to join this node,
>>> e.g. decommission and rejoin it again. I'll try it now...
>>>
>>>
>>> [1]
>>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
>>> (line 338) AntiCompacting
>>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
>>> (line 338) AntiCompacting
>>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
>>> (line 338) AntiCompacting
>>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
>>> (line 338) AntiCompacting
>>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>>
>>> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>>>
 In my experience most of the time it takes for a node to join the
 cluster is the anticompaction on the other nodes. The streaming part is very
 fast.
 Check the other nodes logs to see if there is any node doing
 anticompaction.
 I don't remember how much data I had in the cluster when I needed to
 add/remove nodes. I do remember that it took a few hours.

 The node will join the ring only when it will finish the bootstrap.

 Shimi


 On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:

> I asked the same question on the IRC but no luck there, everyone's
> asleep ;)...
>
> Using 0.6.6 I'm adding a new node to the cluster.
> It starts out fine but then gets stuck on the bootstrapping state for
> too long. More than an hour and still counting.
>
> $ bin/nodetool -p 9004 -h localhost streams
>> Mode: Bootstrapping
>> Not sending any streams.
>> Not receiving any streams.
>
>
> It seemed to have streamed data from other nodes and indeed the load is
> non-zero but I'm not clear what's keeping it right now from finishing.
>
>> $ bin/nodetoo

Re: Hector version

2011-01-04 Thread Nate McCall
0.6.0-19 still relies on the local lib directory and execution targets
in the pom file for tracking Cassandra versions. This is fixed in
0.6.0 branch, but has not been released although it is stable (we
could probably stand to do a release here and certainly will with
0.6.9 of Cassandra).

As for migrating from 0.6.0-16, this thread provides some high level
details (as well as additional maven explanation):
http://groups.google.com/group/hector-users/browse_thread/thread/ada58caca0174858/e8dd164ff10cc649?lnk=gst&q=release+0.6#

For future reference, hector-specific questions might be better
directed towards hector-us...@googlegroups.com as you will probably
get a more direct reply quicker.


On Tue, Jan 4, 2011 at 12:18 PM, Hugo Zwaal  wrote:
> Hi,
>
> I'm also using Hector on Cassandra 0.6.8, but could not get Hector 0.6.0-19
> to work in Maven. It refers to a nonexistent
> or.apache.cassandra/cassandra/0.6.5 package. I suspect this should be
> org.apache.cassandra/apache-cassandra/0.6.5. I also noticed there exists a
> 0.6.0-20 version. My questions are:
>
> 1) Wouldn't it be better to use 0.6.0-20 over 0.6.0-19, or is the latter one
> preferred (and why)?
> 2) Is there some documentation on migrating from 0.6.0-16? I noticed that
> several things changed in a backward-incompatible way.
>
> Thanks, Hugo.
>
> On 12/31/2010 8:52 AM, Ran Tavory wrote:
>>
>> Use 0.6.0-19
>>
>> On Friday, December 31, 2010, Zhidong She  wrote:
>>>
>>> Hi guys,
>>>
>>> We are trying Cassandra 0.6.8, and could you please kindly tell me which
>>> Hector Java client is suitable for 0.6.8?
>>> The Hector 0.7.0 says it's for Cassandra 0.7.X, and shall we use Hector
>>> 0.6.0?
>>>
>>> Thanks,
>>> Br
>>> Zhidong
>>>
>
>


Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Tyler Hobbs
> Anti-compaction and streaming is done to move data from nodes that
> have it (that are in the replica set). This implies CPU and I/O and
> networking load on the source node, so it does have an impact. See
> http://wiki.apache.org/cassandra/Streaming among others.
>
> (Here's where I'm not sure, someone please confirm/deny) In 0.6, I
> believe this required diskspace on the originating node. In 0.7, I
> *think* the need for disk space on the source node is removed.
>

That is correct.  In 0.6, an anticompaction was performed and a temporary
SSTable was written out to disk, then streamed to the recipient.  The way
this is now done in 0.7 requires no extra disk space on the source node.

- Tyler


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Patricio Echagüe
In Hector framework, take a look at TimeUUIDUtils.java

You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
TimeUUIDUtils.getTimeUUID(ClockResolution clock)

and later on, TimeUUIDUtils.getTimeFromUUID(..) or just UUID.timestamp();

There are some examples in TimeUUIDUtilsTest.java

Let me know if it helps.
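
As a rough sketch of that round trip (the class and method names are the ones
mentioned above; the import path and the assumption that getTimeUUID(long) takes
milliseconds should be checked against your Hector version). The key point is that
java.util.UUID.timestamp() is not milliseconds but 100-nanosecond units counted
from the UUID epoch (1582-10-15), so it has to be converted before being fed back
in:

import java.util.UUID;
// Assumption: Hector's TimeUUIDUtils lives in this package; adjust to your jar.
import me.prettyprint.cassandra.utils.TimeUUIDUtils;

public class TimeUuidRoundTrip {
    // 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01b21dd213814000L;

    public static void main(String[] args) {
        UUID u1 = TimeUUIDUtils.getUniqueTimeUUIDinMillis();

        // UUID.timestamp() returns 100-ns units since 1582, not millis.
        long millis = (u1.timestamp() - UUID_EPOCH_OFFSET) / 10000;

        // Assumption: getTimeUUID(long) interprets its argument as milliseconds.
        // u2's node/clock-sequence part will differ from u1's, but the timestamp
        // portion should be equivalent for time-based slices.
        UUID u2 = TimeUUIDUtils.getTimeUUID(millis);

        System.out.println(u1 + " -> " + millis + " ms -> " + u2);
    }
}

If getTimeFromUUID(..) already returns milliseconds in the version you have, it
can replace the manual offset arithmetic above.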



On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani wrote:

> Hello Victor,
>
> It is actually not that I need the 2 UUIDs to be exactly same - they need
> to be same timestamp wise.
>
> So, what I need is to extract the timestamp portion from a time UUID (say,
> U1) and then later in the cycle, use the same long timestamp value to
> re-create a UUID (say, U2) that is equivalent of the previous one in terms
> of its timestamp portion - i.e., I should be able to give this U2 and filter
> the data from a column family - and it should be same as if I had used the
> original UUID U1.
>
> Does it make any more sense than before? Any way I can do that?
>
> rgds,
> Roshan
>
>
> On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon <
> victor.kabde...@gmail.com> wrote:
>
>> Hello Roshan,
>>
>> Well, it is normal not to be able to get the exact same UUID back from a
>> timestamp; that is by design.
>> When you create a time UUID it in fact combines two pieces of information: a
>> timestamp part and a node/clock-sequence part that is effectively random per
>> generator. Put together, they form your UUID.
>> So unless you save that random part, two UUIDs generated for the same milli-
>> (or micro-)second are different.
>>
>> Best regards,
>> Victor K.
>> http://www.voxnucleus.fr
>>
>> 2011/1/4 Roshan Dawrani 
>>
>> Hi,
>>> I am having a little difficulty converting a time UUID to its timestamp
>>> equivalent and back. Can someone please help?
>>>
>>> Here is what I am trying. Is it not the right way to do it?
>>>
>>> ===
>>> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>>>
>>> long time = someUUID.timestamp(); /* convert from UUID to a long
>>> timestamp */
>>> UUID otherUUID = TimeUUIDUtils.getTimeUUID(time); /* do the
>>> reverse and get back the UUID from timestamp */
>>>
>>> System.out.println(someUUID); /* someUUID and otherUUID should be
>>> same, but are different */
>>> System.out.println(otherUUID);
>>> ===
>>>
>>> --
>>> Roshan
>>> Blog: http://roshandawrani.wordpress.com/
>>> Twitter: @roshandawrani 
>>> Skype: roshandawrani
>>>
>>>
>>
>
>
> --
> Roshan
> Blog: http://roshandawrani.wordpress.com/
> Twitter: @roshandawrani 
> Skype: roshandawrani
>
>


-- 
Patricio.-


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Roshan Dawrani
Hello Victor,

It is actually not that I need the 2 UUIDs to be exactly same - they need to
be same timestamp wise.

So, what I need is to extract the timestamp portion from a time UUID (say,
U1) and then later in the cycle, use the same long timestamp value to
re-create a UUID (say, U2) that is equivalent of the previous one in terms
of its timestamp portion - i.e., I should be able to give this U2 and filter
the data from a column family - and it should be same as if I had used the
original UUID U1.

Does it make any more sense than before? Any way I can do that?

rgds,
Roshan

On Tue, Jan 4, 2011 at 11:46 PM, Victor Kabdebon
wrote:

> Hello Roshan,
>
> Well, it is normal not to be able to get the exact same UUID back from a
> timestamp; that is by design.
> When you create a time UUID it in fact combines two pieces of information: a
> timestamp part and a node/clock-sequence part that is effectively random per
> generator. Put together, they form your UUID.
> So unless you save that random part, two UUIDs generated for the same milli-
> (or micro-)second are different.
>
> Best regards,
> Victor K.
> http://www.voxnucleus.fr
>
> 2011/1/4 Roshan Dawrani 
>
> Hi,
>> I am having a little difficulty converting a time UUID to its timestamp
>> equivalent and back. Can someone please help?
>>
>> Here is what I am trying. Is it not the right way to do it?
>>
>> ===
>> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>>
>> long time = someUUID.timestamp(); /* convert from UUID to a long
>> timestamp */
>> UUID otherUUID = TimeUUIDUtils.getTimeUUID(time); /* do the
>> reverse and get back the UUID from timestamp */
>>
>> System.out.println(someUUID); /* someUUID and otherUUID should be
>> same, but are different */
>> System.out.println(otherUUID);
>> ===
>>
>> --
>> Roshan
>> Blog: http://roshandawrani.wordpress.com/
>> Twitter: @roshandawrani 
>> Skype: roshandawrani
>>
>>
>


-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani 
Skype: roshandawrani


Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Peter Schuller
This will be a very selective response, not at all as exhaustive as it
should be to truly cover what you bring up. Sorry, but here are some
random tidbits.

> On the cassandra user list, I noticed a thread on a user that literally
> wrote his cluster to death.  Correct me if I'm wrong, but based on that
> thread, it seemed like if one node runs out of space, it kills the
> cluster.  I hope that is not the case.

A single node running out of disk doesn't kill a cluster as such.
However, if you have a cluster of nodes where each is roughly the same
size (and assuming even balancing of data) and you are so close to
being full everywhere that one of them actually becomes full, you are
in dangerous territory if you're worried about uptime.

I'd say there are two main types of issues here:

(1) Monitoring and predicting disk space usage with Cassandra is a bit
more difficult than in a traditional system because of the way it
varies over time and the way writes can have impacts on disk space
that last for some time beyond the write itself. I.e., delayed
effects.

(2) The fact that operational things like moving nodes and repair
operations need disk space, means it can be more difficult to get out
of a bad situation.

> If I have a cluster of 3 computers that is filling up in disk space and
> I add a bunch of machines, how does cassandra deal with that situation?
> Does cassandra know not to keep writing to the initial three machines
> because the disk is near full and write to the new machines only?  At
> some point my machines are going to be full of data and I don't want to
> write any more data to those machines until I add more storage to those
> machines.  Is this possible to do?

Cassandra selects replica placement based on its replication strategy
and ring layout (nodes and their tokens). There is no "fallback" to
putting data elsewhere because nodes are full (doing so would likely
introduce tremendous complexity I think).

If a node goes catastrophically out of disk, it will probably stop
working and effectively be offline in the cluster. Not necessarily
though; it could be that e.g. compactions are not completing but
writes still work as do reads, and over time reads become less
performant due to lack of compaction. Alternatively if disks are
completely full and memtables cannot be flushed, I believe you'd
expect it to essentially go down.

I would say that running out of disk space is not something which is
very gracefully handled right now, although there is some code to
mitigate it. For example during compaction there is some logic to
check for disk space availability and try to compact smaller files
first to possibly make room for larger compactions, but that is not
addressing the overall issue.

>  I read somewhere that it is a bad
> idea to use more than 50% of the drive utilization because of the
> compaction requirements.

For a single column family, a compaction (when major, i.e., all data is
involved) can currently double disk space usage if data is not
overwritten or removed. In addition things like nodetool repair and
nodetool cleanup need disk space too.

Here's where I'm not really up on all the details. It would be nice to
arrive at a figure for the absolute worst-case disk space expansion
that is possible.
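
As a rough illustration (numbers assumed purely for the sake of example): a node
holding 100 GB in a single column family with little overwritten or deleted data
should be assumed to need on the order of another 100 GB free to complete a major
compaction of that CF, before accounting for the additional space that repair or
cleanup may need at the same time.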

> I was planning on using ssd for the sstables
> and standard hard drives for the compaction and writelogs.  Is this
> possible to do?

It doesn't make sense for compaction because compaction involves
replacing sstables with new ones. You could write new sstables to
separate storage and then copy them into the sstable storage location,
but that doesn't really do anything but increase the total amount of
I/O that you're doing.

>I didn't see any options for specifying where the write
> logs are written and compaction is done.

There is a commitlog_directory option. Alternatively the directory
might be a symlink.

Compaction is not in a separate location (see above).
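
For reference, a minimal sketch of the relevant 0.7 cassandra.yaml entries (the
paths are made-up examples); in 0.6 the equivalents are the CommitLogDirectory
and DataFileDirectory elements in storage-conf.xml:

commitlog_directory: /mnt/hdd/cassandra/commitlog   # commit log on a spinning disk
data_file_directories:
    - /mnt/ssd/cassandra/data                       # sstables (and compaction I/O) on SSD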

>Also, is it possible to add
> more drives to a running machine and have cassandra utilize a mounted
> directory with free space?

Sort of, but not really the way you want it. Short version is,
"pretend the answer is no". The real answer is somewhere between yes
and no (if someone feels like doing a write-up).

I would suggest volume growth and file system resize if that is
something you plan on doing. Assuming you have a setup where such
operations are sufficiently trusted. Probably RAID + LVM. But beware
of LVM and implications on correctness (e.g., write barriers). Err..
basically, no, I don't recommend planning to fiddle with that. The
potential for problems is probably too high. KISS.

> Cassandra should know to stop writing to the node once the directory
> with the sstables is near full, but it doesn't seem like it does
> anything like that.

It's kind of non-trivial to gracefully handle it in a way that both
satisfies the "I don't want to be stuck" requirement while also
satisfying the "I don'

Re: Hector version

2011-01-04 Thread Hugo Zwaal

Hi,

I'm also using Hector on Cassandra 0.6.8, but could not get Hector 
0.6.0-19 to work in Maven. It refers to a nonexistent
or.apache.cassandra/cassandra/0.6.5 package. I suspect this should be
org.apache.cassandra/apache-cassandra/0.6.5. I also noticed there exists
a 0.6.0-20 version. My questions are:


1) Wouldn't it be better to use 0.6.0-20 over 0.6.0-19, or is the latter
one preferred (and why)?
2) Is there some documentation on migrating from 0.6.0-16? I noticed 
that several things changed in a backward-incompatible way.


Thanks, Hugo.

On 12/31/2010 8:52 AM, Ran Tavory wrote:

Use 0.6.0-19

On Friday, December 31, 2010, Zhidong She  wrote:

Hi guys,

We are trying Cassandra 0.6.8, and could you please kindly tell me which Hector 
Java client is suitable for 0.6.8?
The Hector 0.7.0 says it's for Cassandra 0.7.X, and shall we use Hector 0.6.0?

Thanks,
Br
Zhidong





Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-04 Thread Roshan Dawrani
Hi,
I am having a little difficulty converting a time UUID to its timestamp
equivalent and back. Can someone please help?

Here is what I am trying. Is it not the right way to do it?

===
UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();

long time = someUUID.timestamp(); /* convert from UUID to a long
timestamp */
UUID otherUUID = TimeUUIDUtils.getTimeUUID(time); /* do the reverse
and get back the UUID from timestamp */

System.out.println(someUUID); /* someUUID and otherUUID should be
same, but are different */
System.out.println(otherUUID);
===

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani 
Skype: roshandawrani


Re: Reclaim deleted rows space

2011-01-04 Thread shimi
Yes I am aware of that.
This is the reason I upgraded to 0.6.8.
Still, all the deleted rows in the biggest SSTable will only be removed by a major
compaction.

Shimi

On Tue, Jan 4, 2011 at 6:40 PM, Robert Coli  wrote:

> On Tue, Jan 4, 2011 at 4:33 AM, Peter Schuller
>  wrote:
> > For some cases this will be beneficial, but not always. It's been
> > further improved for 0.7 too w.r.t. tomb stone handling in non-major
> > compactions (I don't have the JIRA ticket number handy).
>
> https://issues.apache.org/jira/browse/CASSANDRA-1074
>
> (For those playing along at home..)
>
> =Rob
>


Re: Reclaim deleted rows space

2011-01-04 Thread Robert Coli
On Tue, Jan 4, 2011 at 4:33 AM, Peter Schuller
 wrote:
> For some cases this will be beneficial, but not always. It's been
> further improved for 0.7 too w.r.t. tomb stone handling in non-major
> compactions (I don't have the JIRA ticket number handy).

https://issues.apache.org/jira/browse/CASSANDRA-1074

(For those playing along at home..)

=Rob


Re: drop column family bug

2011-01-04 Thread Jonathan Ellis
Data files are explained on the page I linked.

Snapshots must be deleted manually.

2011/1/4 陶敏 

>  Hello!
>
> I would like to ask a question: at what time will data files and snapshot
> files be deleted? When I run compact, the old files remain.
>
> please help me. Thank you.
>
>
>
> Sincerely,
>
>
>
> Salute
>
>
>
>
>
>  Re: drop column family bug
>
> Jonathan Ellis  gmail.com>
> 2011-01-04 14:08:03 GMT
>
> It's normal for the sstables to remain temporarily.  See discussion of
>
> "compaction marker" at *http://wiki.apache.org/cassandra/MemtableSSTable.* 
> 
>
>
>
> It's also normal for a snapshot to be taken before drop commands.
>
>
>
> 2011/1/4 陶敏  taobao.com>
>
>
>
> >  Hello!
>
> > When I use cassandra 0.7 rc4 delete column family, the data file 
> > still
>
> > exists. But also appeared in the snapshot data backup, this is a bug or my 
> > version
>
> > of the parameters wrong, please help me. Thank you.
>
> >
>
> > Sincerely,
>
> >
>
> >Salute
>
>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Bootstrapping taking long

2011-01-04 Thread shimi
You will have something new to talk about in your talk tomorrow :)

You said that the anti compaction was only on a single node? I think that
your new node should get data from at least two other nodes (depending on
the replication factor). Maybe the problem is not in the new node.
In old versions (I think prior to 0.6.3) there was a case of a stuck bootstrap
that required a restart of the new node and of the nodes which were supposed to
stream data to it. As far as I remember this case was resolved. I haven't
seen this problem since then.

Shimi

On Tue, Jan 4, 2011 at 3:01 PM, Ran Tavory  wrote:

> Running nodetool decommission didn't help. Actually the node refused to
> decommission itself (b/c it wasn't part of the ring). So I simply stopped
> the process, deleted all the data directories and started it again. It
> worked in the sense of the node bootstrapped again but as before, after it
> had finished moving the data nothing happened for a long time (I'm still
> waiting, but nothing seems to be happening).
>
> Any hints how to analyze a "stuck" bootstrapping node??
> thanks
>
> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>
>> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
>> from the same DC but to my understanding it has already ended. A few hours
>> ago...
>> I see plenty of log messages such as [1] which ended a couple of hours ago,
>> and I've seen the new node streaming and accepting the data from the node
>> which performed the anticompaction and so far it was normal so it seemed
>> that data is at its right place. But now the new node seems sort of stuck.
>> None of the other nodes is anticompacting right now or had been
>> anticompacting since then.
>> The new node's CPU is close to zero, its iostats are almost zero so I
>> can't find another bottleneck that would keep it hanging.
>>
>> On the IRC someone suggested I'd maybe retry to join this node,
>> e.g. decommission and rejoin it again. I'll try it now...
>>
>>
>> [1]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>>
>>> In my experience most of the time it takes for a node to join the cluster
>>> is the anticompaction on the other nodes. The streaming part is very fast.
>>> Check the other nodes logs to see if there is any node doing
>>> anticompaction.
>>> I don't remember how much data I had in the cluster when I needed to
>>> add/remove nodes. I do remember that it took a few hours.
>>>
>>> The node will join the ring only when it will finish the bootstrap.
>>>
>>> Shimi
>>>
>>>
>>> On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:
>>>
 I asked the same question on the IRC but no luck there, everyone's
 asleep ;)...

 Using 0.6.6 I'm adding a new node to the cluster.
 It starts out fine but then gets stuck on the bootstrapping state for
 too long. More than an hour and still counting.

 $ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


 It seemed to have streamed data from other nodes and indeed the load is
 non-zero but I'm not clear what's keeping it right now from finishing.

> $ bin/nodetool -p 9004 -h localhost info
> 51042355038140769519506191114765231716
> Load : 22.49 GB
> Generation No: 1294133781
> Uptime (seconds) : 1795
> Heap Memory (MB) : 315.31 / 6117.00


 nodetool ring does not list this new node in the ring, although nodetool
 can happily talk to the new node, it's just not listing itself as a member
 of the ring. This is expected when the node is still bootstrapping, so the
 question is still how long might the b

Looking for London-based users of Cassandra

2011-01-04 Thread Dave Gardner
I am looking for London-based users of Cassandra who would be interested in
giving a short talk on _how_ they make use of Cassandra, hopefully including
details such as data layout, types of query, load, etc. This is for the
Cassandra London user group -- this month we are planning to have more than
one talk based around "use cases".

If anyone is interested then please get in contact.

http://www.meetup.com/Cassandra-London/calendar/15490565/


Dave


Re: Bootstrapping taking long

2011-01-04 Thread Jake Luciani
In 0.6, locate the node doing anti-compaction and look in the "streams"
subdirectory in the keyspace data dir to monitor the anti-compaction
progress (it puts new SSTables for bootstrapping node in there)

On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:

> Running nodetool decommission didn't help. Actually the node refused to
> decommission itself (b/c it wasn't part of the ring). So I simply stopped
> the process, deleted all the data directories and started it again. It
> worked in the sense of the node bootstrapped again but as before, after it
> had finished moving the data nothing happened for a long time (I'm still
> waiting, but nothing seems to be happening).
>
> Any hints how to analyze a "stuck" bootstrapping node??
> thanks
>
> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>
>> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
>> from the same DC but to my understanding it has already ended. A few hours
>> ago...
>> I see plenty of log messages such as [1] which ended a couple of hours ago,
>> and I've seen the new node streaming and accepting the data from the node
>> which performed the anticompaction and so far it was normal so it seemed
>> that data is at its right place. But now the new node seems sort of stuck.
>> None of the other nodes is anticompacting right now or had been
>> anticompacting since then.
>> The new node's CPU is close to zero, its iostats are almost zero so I
>> can't find another bottleneck that would keep it hanging.
>>
>> On the IRC someone suggested I'd maybe retry to join this node,
>> e.g. decommission and rejoin it again. I'll try it now...
>>
>>
>> [1]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
>> (line 338) AntiCompacting
>> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>>
>> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>>
>>> In my experience most of the time it takes for a node to join the cluster
>>> is the anticompaction on the other nodes. The streaming part is very fast.
>>> Check the other nodes logs to see if there is any node doing
>>> anticompaction.
>>> I don't remember how much data I had in the cluster when I needed to
>>> add/remove nodes. I do remember that it took a few hours.
>>>
>>> The node will join the ring only when it will finish the bootstrap.
>>>
>>> Shimi
>>>
>>>
>>> On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:
>>>
 I asked the same question on the IRC but no luck there, everyone's
 asleep ;)...

 Using 0.6.6 I'm adding a new node to the cluster.
 It starts out fine but then gets stuck on the bootstrapping state for
 too long. More than an hour and still counting.

 $ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


 It seemed to have streamed data from other nodes and indeed the load is
 non-zero but I'm not clear what's keeping it right now from finishing.

> $ bin/nodetool -p 9004 -h localhost info
> 51042355038140769519506191114765231716
> Load : 22.49 GB
> Generation No: 1294133781
> Uptime (seconds) : 1795
> Heap Memory (MB) : 315.31 / 6117.00


 nodetool ring does not list this new node in the ring, although nodetool
 can happily talk to the new node, it's just not listing itself as a member
 of the ring. This is expected when the node is still bootstrapping, so the
 question is still how long might the bootstrap take and whether it is
 stuck.

 The data isn't huge so I find it hard to believe that streaming or anti
 compaction are the bottlenecks. I have ~20G on each node and the new node
 already has just about that so it seems that all data had already been
 streamed to it successfully, or at least most of t

Reading data problems during bootstrap. [pycassa 0.7.0 rc4]

2011-01-04 Thread Mateusz Korniak
Hi!
As a Cassandra newbie, I am trying to convert my single-node cluster to a
cluster with two nodes with RF=2.

I have one node cluster , RF=1  all data accessible:
nodetool -h 192.168.3.8  ring
Address Status State   LoadOwnsToken
192.168.3.8 Up Normal  1.59 GB 100.00% 
150705614882854895881815284349323762700

  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Replication Factor: 1


If I switch to RF=2 here my cluster will refuse updates and reads. So I
bootstrap a 2nd node (192.168.3.4):

$ nodetool -h 192.168.3.8  ring
Address Status State   LoadOwnsToken
   
150705614882854895881815284349323762700
192.168.3.4 Up Joining 120.94 KB   49.80%  
65291865063116528976449321853009633660
192.168.3.8 Up Normal  1.59 GB 50.20%  
150705614882854895881815284349323762700

question 1: Why, at that point, am I unable to read data for part of the keys?

Even when node is up:

nodetool -h 192.168.3.8  ring
Address Status State   LoadOwnsToken
   
150705614882854895881815284349323762700
192.168.3.4 Up Normal  721.37 MB   49.80%  
65291865063116528976449321853009633660
192.168.3.8 Up Normal  1.59 GB 50.20%  
150705614882854895881815284349323762700

I am still unable to read part of my original set of data with 
ConsistencyLevel.ONE  (NotFoundException in pycassa) :/

question 2: Why is that? And what should I do to have a cluster with the full
data?

Next I planned to do:
update keyspace with replication_factor=2;
repair both nodes,
and at this point have a fully working 2-node cluster with RF=2 (sketched below).
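
A rough sketch of those steps with 0.7 tooling (the keyspace name is a
placeholder, and "update keyspace" assumes your cassandra-cli version supports
it); repair is run on each node after raising the replication factor:

# in cassandra-cli, connected to either node
update keyspace MyKeyspace with replication_factor=2;

# then, from a shell
nodetool -h 192.168.3.8 repair
nodetool -h 192.168.3.4 repair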

question 3: Is this the proper approach, or is there a better one?


question 4: I hoped that during the above operations I would be able to _read_ the
whole dataset as it was at the beginning in the one-node cluster. Is that possible?


Thanks in advance for any answers, regards,
-- 
Mateusz Korniak


Re: Reclaim deleted rows space

2011-01-04 Thread shimi
I think I didn't make myself clear.
I don't have a problem with disk space. I have a problem with the data
size.
I have a simple CRUD application. Most of the requests are reads, but there
are updates/deletes, and as time passes the number of deleted rows becomes big
enough to free some disk space (a matter of days, not hours).
Since not all of the data can fit in RAM (and I have a lot of RAM) the rest
is served from disk. Since disk is slow I want to reduce as much as possible
the number of requests that go to the disk. The more requests to the disk,
the longer the disk wait time gets and the more time it takes to return a response.

Bottom line is that I want to reduce the number of requests that go to
disk. Since there is enough data that is no longer valid I can do it by
reclaiming the space. The only way to do it is by running a major compaction.
I can wait and let Cassandra do it for me but then the data size will get
even bigger and the response time will be worse. I can do it manually, but I
prefer it to happen in the background with less impact on the system.

Shimi


On Tue, Jan 4, 2011 at 2:33 PM, Peter Schuller
wrote:

> > This is what I thought. I was wishing there might be another way to
> reclaim
> > the space.
>
> Be sure you really need this first :) Normally you just let it happen in
> the bg.
>
> > The problem is that the more data you have, the more time it will take
> > Cassandra to respond.
>
> Relative to what though? There are definitely important side-effects
> of having very large data sets, and part of that involves compactions,
> but in a normal steady state type of system you should never be in the
> position to "wait" for a major compaction to run. Compactions are
> something that is intended to run every now and then in the
> background. It will result in variations in disk space within certain
> bounds, which is expected.
>
> Certainly the situation can be improved and the current disk space
> utilization situation is not perfect, but the above suggests to me
> that you're trying to do something that is not really intended to be
> done.
>
> > Reclaim space of deleted rows in the biggest SSTable requires Major
> > compaction. This compaction can be triggered by adding x2 data (or x4
> data
> > in the default configuration) to the system or by executing it manually
> > using JMX.
>
> You can indeed choose to trigger major compactions by e.g. cron jobs.
> But just be aware that if you're operating under conditions where you
> are close to disk space running out, you have other concerns too -
> such as periodic repair operations also needing disk space.
>
> Also; suppose you're overwriting lots of data (or replacing by
> deleting and adding other data). It is not necessarily true that you
> need 4x the space relative to what you otherwise do just because of
> the compaction threshold.
>
> Keep in mind that compactions already need extra space anyway. If
> you're *not* overwriting or adding data, a compaction of a single CF
> is expected to need up to twice the amount of space that it occupies.
> If you're doing more overwrites and deletions though, as you point out
> you will have more "dead" data at any given point in time. But on the
> other hand, the peak disk space usage during compactions is lower. So
> the actual peak disk space usage (which is what matters since you must
> have this much disk space) is actually helped by the
> deletions/overwrites too.
>
> Further, suppose you trigger major compactions more often. That means
> each compaction will have a higher relative spike of disk usage
> because less data has had time to be overwritten or removed.
>
> So in a sense, it's like the disk space demands is being moved between
> the category of "dead data retained for longer than necessary" and
> "peak disk usage during compaction".
>
> Also keep in mind that the *low* peak of disk space usage is not
> subject to any fragmentation concerns. Depending on the size of your
> data compared to e.g. column names, that disk space usage might be
> significantly lower than what you would get with an in-place updating
> database. There are lots of trade-offs :)
>
> You say you have to "wait" for deletions though which sounds like
> you're doing something unusual. Are you doing stuff like deleting lots
> of data in bulk from one CF, only to then write data to *another* CF?
> Such that you're actually having to wait for disk space to be freed to
> make room for data somewhere else?
>
> > In case of a system that deletes data regularly, which needs to serve
> > customers all day and the time it takes should be in ms, this is a
> problem.
>
> Not in general. I am afraid there may be some misunderstanding here.
> Unless disk space is a problem for you (i.e., you're running out of
> space), there is no need to wait for compactions. And certainly
> whether you can serve traffic 24/7 at low-ms latencies is an important
> consideration, and does become complex when disk I/O is involved, but

Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Running nodetool decommission didn't help. Actually the node refused to
decommission itself (b/c it wasn't part of the ring). So I simply stopped
the process, deleted all the data directories and started it again. It
worked in the sense of the node bootstrapped again but as before, after it
had finished moving the data nothing happened for a long time (I'm still
waiting, but nothing seems to be happening).

Any hints how to analyze a "stuck" bootstrapping node??
thanks

On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:

> Thanks Shimi, so indeed anticompaction was run on one of the other nodes
> from the same DC but to my understanding it has already ended. A few hours
> ago...
> I see plenty of log messages such as [1] which ended a couple of hours ago, and
> I've seen the new node streaming and accepting the data from the node which
> performed the anticompaction and so far it was normal so it seemed that data
> is at its right place. But now the new node seems sort of stuck. None of the
> other nodes is anticompacting right now or had been anticompacting since
> then.
> The new node's CPU is close to zero, its iostats are almost zero so I
> can't find another bottleneck that would keep it hanging.
>
> On the IRC someone suggested I'd maybe retry to join this node,
> e.g. decommission and rejoin it again. I'll try it now...
>
>
> [1]
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
>  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
> (line 338) AntiCompacting
> [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
>
> On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:
>
>> In my experience most of the time it takes for a node to join the cluster
>> is the anticompaction on the other nodes. The streaming part is very fast.
>> Check the other nodes logs to see if there is any node doing
>> anticompaction.
>> I don't remember how much data I had in the cluster when I needed to
>> add/remove nodes. I do remember that it took a few hours.
>>
>> The node will join the ring only when it will finish the bootstrap.
>>
>> Shimi
>>
>>
>> On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:
>>
>>> I asked the same question on the IRC but no luck there, everyone's asleep
>>> ;)...
>>>
>>> Using 0.6.6 I'm adding a new node to the cluster.
>>> It starts out fine but then gets stuck on the bootstrapping state for too
>>> long. More than an hour and still counting.
>>>
>>> $ bin/nodetool -p 9004 -h localhost streams
 Mode: Bootstrapping
 Not sending any streams.
 Not receiving any streams.
>>>
>>>
>>> It seemed to have streamed data from other nodes and indeed the load is
>>> non-zero but I'm not clear what's keeping it right now from finishing.
>>>
 $ bin/nodetool -p 9004 -h localhost info
 51042355038140769519506191114765231716
 Load : 22.49 GB
 Generation No: 1294133781
 Uptime (seconds) : 1795
 Heap Memory (MB) : 315.31 / 6117.00
>>>
>>>
>>> nodetool ring does not list this new node in the ring, although nodetool
>>> can happily talk to the new node, it's just not listing itself as a member
>>> of the ring. This is expected when the node is still bootstrapping, so the
>>> question is still how long might the bootstrap take and whether it is stuck.
>>>
>>> The data isn't huge so I find it hard to believe that streaming or anti
>>> compaction are the bottlenecks. I have ~20G on each node and the new node
>>> already has just about that so it seems that all data had already been
>>> streamed to it successfully, or at least most of the data... So what is it
>>> waiting for now? (same question, rephrased... ;)
>>>
>>> I tried:
>>> 1. Restarting the new node. No good. All logs seem normal but at the end
>>> the node is still in bootstrap mode.
>>> 2. As someone suggested I increased the rpc timeout from 10k to 30k
>>> (RpcTimeoutInMillis) but that didn't seem to help. I did this only on the
>

Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
> This is what I thought. I was wishing there might be another way to reclaim
> the space.

Be sure you really need this first :) Normally you just let it happen in the bg.

> The problem is that the more data you have, the more time it will take
> Cassandra to respond.

Relative to what though? There are definitely important side-effects
of having very large data sets, and part of that involves compactions,
but in a normal steady state type of system you should never be in the
position to "wait" for a major compaction to run. Compactions are
something that is intended to run every now and then in the
background. It will result in variations in disk space within certain
bounds, which is expected.

Certainly the situation can be improved and the current disk space
utilization situation is not perfect, but the above suggests to me
that you're trying to do something that is not really intended to be
done.

> Reclaim space of deleted rows in the biggest SSTable requires Major
> compaction. This compaction can be triggered by adding x2 data (or x4 data
> in the default configuration) to the system or by executing it manually
> using JMX.

You can indeed choose to trigger major compactions by e.g. cron jobs.
But just be aware that if you're operating under conditions where you
are close to disk space running out, you have other concerns too -
such as periodic repair operations also needing disk space.
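
As an illustration only (the install path and JMX port are assumptions; adjust
them to your environment), such a cron entry might look like:

# weekly major compaction, Sunday 03:00
0 3 * * 0  /opt/cassandra/bin/nodetool -h localhost -p 8080 compact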

Also; suppose you're overwriting lots of data (or replacing by
deleting and adding other data). It is not necessarily true that you
need 4x the space relative to what you otherwise do just because of
the compaction threshold.

Keep in mind that compactions already need extra space anyway. If
you're *not* overwriting or adding data, a compaction of a single CF
is expected to need up to twice the amount of space that it occupies.
If you're doing more overwrites and deletions though, as you point out
you will have more "dead" data at any given point in time. But on the
other hand, the peak disk space usage during compactions is lower. So
the actual peak disk space usage (which is what matters since you must
have this much disk space) is actually helped by the
deletions/overwrites too.

Further, suppose you trigger major compactions more often. That means
each compaction will have a higher relative spike of disk usage
because less data has had time to be overwritten or removed.

So in a sense, it's like the disk space demands is being moved between
the category of "dead data retained for longer than necessary" and
"peak disk usage during compaction".

Also keep in mind that the *low* peak of disk space usage is not
subject to any fragmentation concerns. Depending on the size of your
data compared to e.g. column names, that disk space usage might be
significantly lower than what you would get with an in-place updating
database. There are lots of trade-offs :)

You say you have to "wait" for deletions though which sounds like
you're doing something unusual. Are you doing stuff like deleting lots
of data in bulk from one CF, only to then write data to *another* CF?
Such that you're actually having to wait for disk space to be freed to
make room for data somewhere else?

> In case of a system that deletes data regularly, which needs to serve
> customers all day and the time it takes should be in ms, this is a problem.

Not in general. I am afraid there may be some misunderstanding here.
Unless disk space is a problem for you (i.e., you're running out of
space), there is no need to wait for compactions. And certainly
whether you can serve traffic 24/7 at low-ms latencies is an important
consideration, and does become complex when disk I/O is involved, but
it is not about disk *space*. If you have important performance
requirements, make sure you can service the read load at all given
your data set size. If you're running out of disk, I presume your
data is big. See
http://wiki.apache.org/cassandra/LargeDataSetConsiderations

Perhaps if you can describe your situation in more detail?

> It appears to me that in order to use Cassandra you must have a process that
> will trigger major compaction on the nodes once in X amount of time.

For some cases this will be beneficial, but not always. It's been
further improved for 0.7 too w.r.t. tomb stone handling in non-major
compactions (I don't have the JIRA ticket number handy). It's
certainly not a hard requirement and would only ever be relevant if
you're operating nodes that are significantly full.

> One case where you would do that is when you don't (or hardly) delete data.

Or just in most cases where you don't push disk space concerns.

> Another one is when your upper limit of time it should take to response is
> very high so major compaction will not hurt you.

To be really clear: Compaction is a background operation. It is never
the case that reads or writes somehow "wait" for compaction to
complete.

-- 
/ Peter Schuller


Re: Bootstrapping taking long

2011-01-04 Thread Ran Tavory
Thanks Shimi, so indeed anticompaction was run on one of the other nodes
from the same DC but to my understanding it has already ended. A few hours
ago...
I see plenty of log messages such as [1] which ended a couple of hours ago, and
I've seen the new node streaming and accepting the data from the node which
performed the anticompaction and so far it was normal so it seemed that data
is at its right place. But now the new node seems sort of stuck. None of the
other nodes is anticompacting right now or had been anticompacting since
then.
The new node's CPU is close to zero, its iostats are almost zero so I can't
find another bottleneck that would keep it hanging.

On the IRC someone suggested I'd maybe retry to join this node,
e.g. decommission and rejoin it again. I'll try it now...


[1]
 INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
(line 338) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
 INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
(line 338) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
 INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
(line 338) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
 INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
(line 338) AntiCompacting
[org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]

On Tue, Jan 4, 2011 at 12:45 PM, shimi  wrote:

> In my experience most of the time it takes for a node to join the cluster
> is the anticompaction on the other nodes. The streaming part is very fast.
> Check the other nodes logs to see if there is any node doing
> anticompaction.
> I don't remember how much data I had in the cluster when I needed to
> add/remove nodes. I do remember that it took a few hours.
>
> The node will join the ring only when it will finish the bootstrap.
>
> Shimi
>
>
> On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:
>
>> I asked the same question on the IRC but no luck there, everyone's asleep
>> ;)...
>>
>> Using 0.6.6 I'm adding a new node to the cluster.
>> It starts out fine but then gets stuck on the bootstrapping state for too
>> long. More than an hour and still counting.
>>
>> $ bin/nodetool -p 9004 -h localhost streams
>>> Mode: Bootstrapping
>>> Not sending any streams.
>>> Not receiving any streams.
>>
>>
>> It seemed to have streamed data from other nodes and indeed the load is
>> non-zero but I'm not clear what's keeping it right now from finishing.
>>
>>> $ bin/nodetool -p 9004 -h localhost info
>>> 51042355038140769519506191114765231716
>>> Load : 22.49 GB
>>> Generation No: 1294133781
>>> Uptime (seconds) : 1795
>>> Heap Memory (MB) : 315.31 / 6117.00
>>
>>
>> nodetool ring does not list this new node in the ring, although nodetool
>> can happily talk to the new node, it's just not listing itself as a member
>> of the ring. This is expected when the node is still bootstrapping, so the
>> question is still how long might the bootstrap take and whether it is stuck.
>>
>> The data isn't huge so I find it hard to believe that streaming or anti
>> compaction are the bottlenecks. I have ~20G on each node and the new node
>> already has just about that so it seems that all data had already been
>> streamed to it successfully, or at least most of the data... So what is it
>> waiting for now? (same question, rephrased... ;)
>>
>> I tried:
>> 1. Restarting the new node. No good. All logs seem normal but at the end
>> the node is still in bootstrap mode.
>> 2. As someone suggested I increased the rpc timeout from 10k to 30k
>> (RpcTimeoutInMillis) but that didn't seem to help. I did this only on the
>> new node. Should I have done that on all (old) nodes as well? Or maybe only
>> on the ones that were supposed to stream data to that node.
>> 3. Logging level at DEBUG now but nothing interesting going on except
>> for occasional messages such as [1] or [2]
>>
>> So the question is: what's keeping the new node from finishing the
>> bootstrap and how can I check its status?
>> Thanks
>>
>> [1] DEBUG [Timer-1] 2011-01-04 05:21:24,402 LoadDisseminator.java (line
>> 36) Disseminating load info ...
>> [2] DEBUG [RMI TCP Connection(22)-192.168.252.88] 2011-01-04 05:12:48,033
>> StorageService.java (line 1189) computing ranges for
>> 283

Re: Reclaim deleted rows space

2011-01-04 Thread shimi
This is what I thought. I was wishing there might be another way to reclaim
the space.
The problem is that the more data you have, the more time it will take
Cassandra to respond.
Reclaiming the space of deleted rows in the biggest SSTable requires a major
compaction. This compaction can be triggered by adding x2 data (or x4 data
in the default configuration) to the system or by executing it manually
using JMX.
In the case of a system that deletes data regularly, which needs to serve
customers all day with response times in the ms range, this is a problem.

It appears to me that in order to use Cassandra you must have a process that
will trigger major compaction on the nodes once in X amount of time.
One case where you would do that is when you don't (or hardly) delete data.
Another one is when your upper limit on response time is
very high, so a major compaction will not hurt you.

It might be that the only way to solve this problem is by having at least
two copies of each row in each data center and use a dynamic snitch.

Shimi

On Mon, Jan 3, 2011 at 7:55 PM, Peter Schuller
wrote:

> > Major compaction does it, but only if GCGraceSeconds has elapsed. See:
> >
> >
> http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html
>
> But to be clear, under the assumption that your data is a lot smaller
> than the tombstones, a major compaction will definitely reclaim space
> even if GCGraceSeconds has not elapsed. So actually my original
> response is a bit misleading.
>
> --
> / Peter Schuller
>


Re: Bootstrapping taking long

2011-01-04 Thread shimi
In my experience most of the time it takes for a node to join the cluster is
the anticompaction on the other nodes. The streaming part is very fast.
Check the other nodes logs to see if there is any node doing anticompaction.
I don't remember how much data I had in the cluster when I needed to
add/remove nodes. I do remember that it took a few hours.

The node will join the ring only when it finishes the bootstrap.

Shimi


On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory  wrote:

> I asked the same question on the IRC but no luck there, everyone's asleep
> ;)...
>
> Using 0.6.6 I'm adding a new node to the cluster.
> It starts out fine but then gets stuck on the bootstrapping state for too
> long. More than an hour and still counting.
>
> $ bin/nodetool -p 9004 -h localhost streams
>> Mode: Bootstrapping
>> Not sending any streams.
>> Not receiving any streams.
>
>
> It seemed to have streamed data from other nodes and indeed the load is
> non-zero but I'm not clear what's keeping it right now from finishing.
>
>> $ bin/nodetool -p 9004 -h localhost info
>> 51042355038140769519506191114765231716
>> Load : 22.49 GB
>> Generation No: 1294133781
>> Uptime (seconds) : 1795
>> Heap Memory (MB) : 315.31 / 6117.00
>
>
> nodetool ring does not list this new node in the ring, although nodetool
> can happily talk to the new node, it's just not listing itself as a member
> of the ring. This is expected when the node is still bootstrapping, so the
> question is still how long might the bootstrap take and whether it is stuck.
>
> The data isn't huge so I find it hard to believe that streaming or anti
> compaction are the bottlenecks. I have ~20G on each node and the new node
> already has just about that so it seems that all data had already been
> streamed to it successfully, or at least most of the data... So what is it
> waiting for now? (same question, rephrased... ;)
>
> I tried:
> 1. Restarting the new node. No good. All logs seem normal but at the end
> the node is still in bootstrap mode.
> 2. As someone suggested I increased the rpc timeout from 10k to 30k
> (RpcTimeoutInMillis) but that didn't seem to help. I did this only on the
> new node. Should I have done that on all (old) nodes as well? Or maybe only
> on the ones that were supposed to stream data to that node.
> 3. Logging level at DEBUG now but nothing interesting going on except
> for occasional messages such as [1] or [2]
>
> So the question is: what's keeping the new node from finishing the
> bootstrap and how can I check its status?
> Thanks
>
> [1] DEBUG [Timer-1] 2011-01-04 05:21:24,402 LoadDisseminator.java (line 36)
> Disseminating load info ...
> [2] DEBUG [RMI TCP Connection(22)-192.168.252.88] 2011-01-04 05:12:48,033
> StorageService.java (line 1189) computing ranges for
> 28356863910078205288614550619314017621,
> 56713727820156410577229101238628035242,
>  85070591730234615865843651857942052863,
> 113427455640312821154458202477256070484,
> 141784319550391026443072753096570088105,
> 170141183460469231731687303715884105727
>
> --
> /Ran
>
>


Bootstrapping taking long

2011-01-04 Thread Ran Tavory
I asked the same question on the IRC but no luck there, everyone's asleep
;)...

Using 0.6.6 I'm adding a new node to the cluster.
It starts out fine but then gets stuck in the bootstrapping state for too
long. More than an hour and still counting.

$ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


It seemed to have streamed data from other nodes and indeed the load is
non-zero, but I'm not clear on what's keeping it from finishing right now.

> $ bin/nodetool -p 9004 -h localhost info
> 51042355038140769519506191114765231716
> Load : 22.49 GB
> Generation No: 1294133781
> Uptime (seconds) : 1795
> Heap Memory (MB) : 315.31 / 6117.00


nodetool ring does not list this new node in the ring, although nodetool can
happily talk to the new node, it's just not listing itself as a member of
the ring. This is expected when the node is still bootstrapping, so the
question is still how long the bootstrap might take and whether it is stuck.
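
To be concrete, I'm checking with the command below (same host and port as
above); the new node is simply missing from its own output:

$ bin/nodetool -p 9004 -h localhost ring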

The data isn't huge so I find it hard to believe that streaming or
anticompaction are the bottlenecks. I have ~20G on each node and the new node
already has just about that so it seems that all data had already been
streamed to it successfully, or at least most of the data... So what is it
waiting for now? (same question, rephrased... ;)

I tried:
1. Restarting the new node. No good. All logs seem normal but at the end the
node is still in bootstrap mode.
2. As someone suggested, I increased the RPC timeout from 10k to 30k ms
(RpcTimeoutInMillis; see the config snippet after this list) but that didn't
seem to help. I did this only on the new node. Should I have done that on all
(old) nodes as well? Or maybe only on the ones that were supposed to stream
data to that node?
3. Logging level at DEBUG now but nothing interesting going on except
for occasional messages such as [1] or [2]
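
For reference, this is roughly the change on the new node (the
storage-conf.xml path depends on the installation):

$ grep RpcTimeoutInMillis conf/storage-conf.xml
<RpcTimeoutInMillis>30000</RpcTimeoutInMillis>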

So the question is: what's keeping the new node from finishing the bootstrap
and how can I check its status?
Thanks

[1] DEBUG [Timer-1] 2011-01-04 05:21:24,402 LoadDisseminator.java (line 36)
Disseminating load info ...
[2] DEBUG [RMI TCP Connection(22)-192.168.252.88] 2011-01-04 05:12:48,033
StorageService.java (line 1189) computing ranges for
28356863910078205288614550619314017621,
56713727820156410577229101238628035242,
 85070591730234615865843651857942052863,
113427455640312821154458202477256070484,
141784319550391026443072753096570088105,
170141183460469231731687303715884105727

-- 
/Ran


somebody interested in hacking some very simple php client

2011-01-04 Thread nicolas lattuada

Yesterday I made it real quick; maybe it can help someone.

Here it is:

http://pastebin.com/bAyWMfXD

Hope it helps.

Nicolas