Riptano Cassandra trainings in Baltimore and Santa Clara

2011-01-05 Thread Jonathan Ellis
Riptano has two Apache Cassandra training days coming up: Baltimore on
Jan 19 and Santa Clara on Feb 4.

The Baltimore training will be taught by Jake Luciani, author of
Lucandra/Solandra.  The Santa Clara training will be taught by Ben
Coverston, Riptano's director of operations.

These are both full-day, hands-on events covering application design
and operations with the new features in Cassandra 0.7.  For more
details, see http://www.eventbrite.com/org/474011012.

See you there!

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio,

Some thoughts inline.

2011/1/6 Patricio Echagüe 

> Roshan, the first 64 bits do contain the version. The method
> UUID.timestamp() indeed takes it out before returning. You are right on that
> point. I based my comment on the UUID spec.
>

I know the first 64 bits contain the version, but timestamp() strips it out,
and hence it is OK to use it for chronological ordering. Anyway, we agree on
it now, so this point is settled.


>
> What I am not convinced about is that the framework should provide support
> to create an almost identical UUID where only the timestamp is the same
> (between the two UUIDs).
>

Well, I didn't really ask for the framework to provide me such an almost
identical UUID. What I raised was that since Hector computes UTC time in
100-nanosecond units as

utcTime = msec * 10000 + 0x01B21DD213814000L
(NUM_100NS_INTERVALS_SINCE_UUID_EPOCH), it should, at a minimum, provide a
utility function for the opposite conversion

msec = (utcTime - 0x01B21DD213814000L) / 10000, so that if someone has to
create an almost identical UUID, where the timestamp is the same, as I
needed, *he shouldn't need to deal with such magic numbers that are linked
to Hector's guts.*

So, I don't mind creating the UUID myself, but I don't want to do the magic
to-and-fro calculations that should be done inside Hector, as they are an
internal design detail of Hector.
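
The conversion pair being asked for can be sketched in plain Java, with no
Hector dependency. This is an illustrative sketch, not Hector's actual code;
it assumes the standard factor of 10,000 100-ns intervals per millisecond,
and the hex constant is the count of 100-ns intervals between the UUID epoch
(1582-10-15) and the Unix epoch:

```java
public class UuidTimeConversion {
    // 100-ns intervals between the UUID epoch (1582-10-15)
    // and the Unix epoch (1970-01-01).
    static final long NUM_100NS_INTERVALS_SINCE_UUID_EPOCH = 0x01b21dd213814000L;

    // Milliseconds since the Unix epoch -> 100-ns intervals since the UUID epoch.
    static long toUuidTime(long msec) {
        return msec * 10000L + NUM_100NS_INTERVALS_SINCE_UUID_EPOCH;
    }

    // The inverse: 100-ns UUID time back to Unix milliseconds.
    static long toUnixMillis(long uuidTime) {
        return (uuidTime - NUM_100NS_INTERVALS_SINCE_UUID_EPOCH) / 10000L;
    }

    public static void main(String[] args) {
        long msec = System.currentTimeMillis();
        long roundTripped = toUnixMillis(toUuidTime(msec));
        System.out.println(roundTripped == msec); // round trip recovers the millis
    }
}
```

With both directions kept in one place, application code never has to touch
the magic constant at all.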


>
> UUID.equals() and UUID.compareTo() do compare the whole bit set to decide
> whether two objects are the same. They compare the first 64 bits first, to
> avoid comparing the rest in case the most significant bits already show a
> difference.
>

I know it may need to look at all 128 bits eventually - but it looks first
at the first 64 bits (the timestamp) and then at the next 64. That's why I
qualified it with "for my use case". It works for me, because the data I am
filtering is already within a particular user's data set - and the
possibility of a user having 2 data points at the same 100-nanosecond tick
(so that the clockseq+node bits come into the picture) is practically nil.
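
The comparison order described here can be checked with plain java.util.UUID.
The bit patterns below are made up for illustration: both UUIDs share the most
significant 64 bits (with the version nibble set to 1 so timestamp() works)
and differ only in the clockseq+node half:

```java
import java.util.UUID;

public class UuidCompareDemo {
    public static void main(String[] args) {
        // Same most significant 64 bits; the version nibble (0x1...) in the
        // low 16 bits marks these as version-1 UUIDs. Values are arbitrary.
        long msb = 0x1122334455661777L;

        UUID a = new UUID(msb, 0x8000000000000001L); // differing clockseq+node
        UUID b = new UUID(msb, 0x8000000000000002L);

        System.out.println(a.timestamp() == b.timestamp()); // true: same time value
        System.out.println(a.equals(b));                    // false: full 128 bits differ
        System.out.println(a.compareTo(b) != 0);            // true: ordering falls back to low bits
    }
}
```

So two UUIDs that agree on the time fields sort together, and only then does
the clockseq+node half break the tie.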


>
> But coming to your point, should Hector provide that kind of support or do
> you feel that the problem you have is specific to your application ?
>

As covered above, half of my solution should go inside the Hector API, I
feel. The other half - re-creating the same-timestamp UUID and comparing
with it - is specific to my application.


>
> I feel like a UUID is, as it says, a Unique Identifier, and creating a
> sort-of UUID based on a previous timestamp, disregarding the least
> significant bits, is not the right support for Hector to expose.
>

The support Hector should expose is keeping its magic calculations internal,
in both directions.

Does it make any sense?

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani 
Skype: roshandawrani


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, the first 64 bits do contain the version. The method
UUID.timestamp() indeed takes it out before returning. You are right on that
point. I based my comment on the UUID spec.

What I am not convinced about is that the framework should provide support
to create an almost identical UUID where only the timestamp is the same
(between the two UUIDs).

UUID.equals() and UUID.compareTo() do compare the whole bit set to decide
whether two objects are the same. They compare the first 64 bits first, to
avoid comparing the rest in case the most significant bits already show a
difference.

But coming to your point, should Hector provide that kind of support or do
you feel that the problem you have is specific to your application ?

I feel like a UUID is, as it says, a Unique Identifier, and creating a
sort-of UUID based on a previous timestamp, disregarding the least
significant bits, is not the right support for Hector to expose.

Thoughts?

On Wed, Jan 5, 2011 at 6:30 PM, Roshan Dawrani wrote:

> Hi Patricio,
>
> Thanks for your comment. Replying inline.
>
> 2011/1/5 Patricio Echagüe 
>
> Roshan, just a comment in your solution. The time returned is not a simple
>> long. It also contains some bits indicating the version.
>
>
> I don't think so. The version bits from the most significant 64 bits of the
> UUID are not used in creating timestamp() value. It uses only time_low,
> time_mid and time_hi fields of the UUID and not version, as documented here:
>
> http://download.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#timestamp%28%29.
>
>
> When the same timestamp comes back and I call
> TimeUUIDUtils.getTimeUUID(tmp), it internally puts the version back in it
> and makes it a Time UUID.
>
>
>> On the other hand, you are assuming that the same machine is processing
>> your request and recreating a UUID based on a long you provide. The
>> clockseqAndNode id will vary if another machine takes care of the request
>> (referring to your use case) .
>>
>
> When I recreate my UUID using the timestamp() value, my requirement is not
> to arrive at exactly the same UUID from which timestamp() was derived in
> the first place. I need a recreated UUID *that should be equivalent in
> terms of its time value* - so that filtering the time-sorted columns using
> this time UUID works fine. So, if the lower-order 64 bits (clockseq + node)
> become different, I don't think it is of any concern, because the UUID
> comparison first goes by the most significant 64 bits, i.e. the time value,
> and that should settle the time comparison in my use case.
>
>
>> Is it possible for you to send the UUID to the view? I think that would be
>> the correct behavior as a simple long does not contain enough information to
>> recreate the original UUID.
>>
>
> In my use case, the non-Java clients will then be receiving a number of
> such UUIDs, and they will have to sort them chronologically. I wanted to
> avoid bit-level UUID comparison in these clients. The long timestamp()
> value is perfect for such ordering of data elements, and I send much less
> data over the wire.
>
>
>>  Does it make sense?
>>
>
> Nearly everything makes sense to me :-)
>
> --
> Roshan
> Blog: http://roshandawrani.wordpress.com/
> Twitter: @roshandawrani 
> Skype: roshandawrani
>
>


-- 
Patricio.-


Re: Reclaim deleted rows space

2011-01-05 Thread Tyler Hobbs
Although it's not exactly the ability to list specific SSTables, the ability
to only compact specific CFs will be in upcoming releases:

https://issues.apache.org/jira/browse/CASSANDRA-1812

- Tyler

On Wed, Jan 5, 2011 at 7:46 PM, Edward Capriolo wrote:

> On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis  wrote:
> > Pretty sure there's logic in there that says "don't bother compacting
> > a single sstable."
> >
> > On Wed, Jan 5, 2011 at 2:26 PM, shimi  wrote:
> >> How is minor compaction triggered? Is it triggered only when a new
> >> SSTable is added?
> >>
> >> I was wondering if triggering a compaction
> with minimumCompactionThreshold
> >> set to 1 would be useful. If this can happen I assume it will do
> compaction
> >> on files with similar size and remove deleted rows on the rest.
> >> Shimi
> >> On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller <
> peter.schul...@infidyne.com>
> >> wrote:
> >>>
> >>> > I don't have a problem with disk space. I have a problem with the
> data
> >>> > size.
> >>>
> >>> [snip]
> >>>
> >>> > Bottom line is that I want to reduce the number of requests that go
> to
> >>> > disk. Since there is enough data that is no longer valid I can do it
> by
> >>> > reclaiming the space. The only way to do it is by running Major
> >>> > compaction.
> >>> > I can wait and let Cassandra do it for me but then the data size will
> >>> > get
> >>> > even bigger and the response time will be worse. I can do it manually
> >>> > but I
> >>> > prefer it to happen in the background with less impact on the system
> >>>
> >>> Ok - that makes perfect sense then. Sorry for misunderstanding :)
> >>>
> >>> So essentially, for workloads that are teetering on the edge of cache
> >>> warmness and are subject to significant overwrites or removals, it may
> >>> be beneficial to perform much more aggressive background compaction
> >>> even though it might waste lots of CPU, to keep the in-memory working
> >>> set down.
> >>>
> >>> There was talk (I think in the compaction redesign ticket) about
> >>> potentially improving the use of bloom filters such that obsolete data
> >>> in sstables could be eliminated from the read set without
> >>> necessitating actual compaction; that might help address cases like
> >>> these too.
> >>>
> >>> I don't think there's a pre-existing silver bullet in a current
> >>> release; you probably have to live with the need for
> >>> greater-than-theoretically-optimal memory requirements to keep the
> >>> working set in memory.
> >>>
> >>> --
> >>> / Peter Schuller
> >>
> >>
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder of Riptano, the source for professional Cassandra support
> > http://riptano.com
> >
>
> I was wondering if it made sense to have a JMX operation that can
> compact a list of tables by file name. This opens it up for power
> users to have more options than compacting the entire keyspace.
>


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Roshan Dawrani
Hi Patricio,

Thanks for your comment. Replying inline.

2011/1/5 Patricio Echagüe 

> Roshan, just a comment in your solution. The time returned is not a simple
> long. It also contains some bits indicating the version.


I don't think so. The version bits from the most significant 64 bits of the
UUID are not used in creating timestamp() value. It uses only time_low,
time_mid and time_hi fields of the UUID and not version, as documented here:
http://download.oracle.com/javase/1.5.0/docs/api/java/util/UUID.html#timestamp%28%29.


When the same timestamp comes back and I call
TimeUUIDUtils.getTimeUUID(tmp), it internally puts the version back in it
and makes it a Time UUID.


> On the other hand, you are assuming that the same machine is processing
> your request and recreating a UUID based on a long you provide. The
> clockseqAndNode id will vary if another machine takes care of the request
> (referring to your use case) .
>

When I recreate my UUID using the timestamp() value, my requirement is not
to arrive at exactly the same UUID from which timestamp() was derived in the
first place. I need a recreated UUID *that should be equivalent in terms of
its time value* - so that filtering the time-sorted columns using this time
UUID works fine. So, if the lower-order 64 bits (clockseq + node) become
different, I don't think it is of any concern, because the UUID comparison
first goes by the most significant 64 bits, i.e. the time value, and that
should settle the time comparison in my use case.
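
Rebuilding such a time-equivalent UUID from a raw 60-bit timestamp() value
needs only JDK bit operations. This is a sketch rather than Hector's API: the
field layout follows the version-1 UUID spec, and the least significant bits
are arbitrary filler standing in for clockseq+node:

```java
import java.util.UUID;

public class TimeEquivalentUuid {
    // Packs a 60-bit version-1 timestamp back into the most significant
    // 64 bits of a UUID (time_low | time_mid | version | time_hi layout).
    static long msbFromTimestamp(long ts) {
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi  = (ts >>> 48) & 0x0FFFL;
        return (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 = version 1
    }

    public static void main(String[] args) {
        long ts = 0x1B21DD213814123L;                 // some 60-bit timestamp (made up)
        UUID rebuilt = new UUID(msbFromTimestamp(ts), // time-equivalent, not identical
                                0x8000000000000000L); // arbitrary clockseq+node, IETF variant
        System.out.println(rebuilt.timestamp() == ts); // true: the time value survives
        System.out.println(rebuilt.version());         // 1
    }
}
```

The rebuilt UUID compares equal on the timestamp while its low 64 bits carry
no real node identity, which is exactly the property the filtering use case
relies on.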


> Is it possible for you to send the UUID to the view? I think that would be
> the correct behavior as a simple long does not contain enough information to
> recreate the original UUID.
>

In my use case, the non-Java clients will then be receiving a number of such
UUIDs, and they will have to sort them chronologically. I wanted to avoid
bit-level UUID comparison in these clients. The long timestamp() value is
perfect for such ordering of data elements, and I send much less data over
the wire.


>  Does it make sense?
>

Nearly everything makes sense to me :-)

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani 
Skype: roshandawrani


Re: Reclaim deleted rows space

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis  wrote:
> Pretty sure there's logic in there that says "don't bother compacting
> a single sstable."
>
> On Wed, Jan 5, 2011 at 2:26 PM, shimi  wrote:
>> How is minor compaction triggered? Is it triggered only when a new
>> SSTable is added?
>>
>> I was wondering if triggering a compaction with minimumCompactionThreshold
>> set to 1 would be useful. If this can happen I assume it will do compaction
>> on files with similar size and remove deleted rows on the rest.
>> Shimi
>> On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller 
>> wrote:
>>>
>>> > I don't have a problem with disk space. I have a problem with the data
>>> > size.
>>>
>>> [snip]
>>>
>>> > Bottom line is that I want to reduce the number of requests that go to
>>> > disk. Since there is enough data that is no longer valid I can do it by
>>> > reclaiming the space. The only way to do it is by running Major
>>> > compaction.
>>> > I can wait and let Cassandra do it for me but then the data size will
>>> > get
>>> > even bigger and the response time will be worse. I can do it manually
>>> > but I
>>> > prefer it to happen in the background with less impact on the system
>>>
>>> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>>
>>> So essentially, for workloads that are teetering on the edge of cache
>>> warmness and are subject to significant overwrites or removals, it may
>>> be beneficial to perform much more aggressive background compaction
>>> even though it might waste lots of CPU, to keep the in-memory working
>>> set down.
>>>
>>> There was talk (I think in the compaction redesign ticket) about
>>> potentially improving the use of bloom filters such that obsolete data
>>> in sstables could be eliminated from the read set without
>>> necessitating actual compaction; that might help address cases like
>>> these too.
>>>
>>> I don't think there's a pre-existing silver bullet in a current
>>> release; you probably have to live with the need for
>>> greater-than-theoretically-optimal memory requirements to keep the
>>> working set in memory.
>>>
>>> --
>>> / Peter Schuller
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

I was wondering if it made sense to have a JMX operation that can
compact a list of tables by file name. This opens it up for power
users to have more options than compacting the entire keyspace.


Re: pig cassandra contribution

2011-01-05 Thread felix gao
Ignore the above error; I somehow got past that stage. However, I am still
having problems with it.

grunt> register /home/felix/pig-0.7.0/pig-0.7.1-dev.jar; register
/home/felix/cassandra/lib/libthrift.jar;
grunt> rows = LOAD 'cassandra://test/data' USING CassandraStorage();
grunt> cols = FOREACH rows GENERATE flatten($1);
grunt> colnames = FOREACH cols GENERATE $0;
grunt> limit_colnames = limit colnames 10;
grunt> dump limit_colnames
2011-01-05 15:44:17,378 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with
processName=JobTracker, sessionId=
2011-01-05 15:44:17,460 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name:
Store(file:/tmp/temp-1545399343/tmp576746049:org.apache.pig.builtin.BinStorage)
- 1-27 Operator Key: 1-27)
2011-01-05 15:44:17,507 [main] INFO
 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1
2011-01-05 15:44:17,507 [main] INFO
 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1
2011-01-05 15:44:17,533 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:17,539 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:17,539 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-01-05 15:44:21,785 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-01-05 15:44:21,841 [main] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:21,842 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-01-05 15:44:21,846 [Thread-5] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2011-01-05 15:44:22,115 [Thread-5] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:22,133 [Thread-5] INFO
 org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics
with processName=JobTracker, sessionId= - already initialized
2011-01-05 15:44:22,344 [main] INFO
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-01-05 15:44:22,348 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2117: Unexpected error when launching map reduce job.
Details at logfile: /home/felix/cassandra/contrib/pig/pig_1294263823129.log


cat pig_1294263823129.log
Pig Stack Trace
---
ERROR 2117: Unexpected error when launching map reduce job.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
open iterator for alias limit_colnames
at org.apache.pig.PigServer.openIterator(PigServer.java:521)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:544)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002:
Unable to store alias limit_colnames
at org.apache.pig.PigServer.store(PigServer.java:577)
at org.apache.pig.PigServer.openIterator(PigServer.java:504)
... 6 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117:
Unexpected error when launching map reduce job.
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:209)
at
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:308)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:835)
at org.apache.pig.PigServer.store(PigServer.java:569)
... 7 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured
when launching map reduce job: java.lang.ExceptionInInitializerError
at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:510)
at java.lang.Thread.dispatchUncaughtException(Thread.java:1831)




On Wed, Jan 5, 2011 at 12:02 PM, felix gao  wrote:

> I am having problem running the cassandra_loadfun

Re: Reclaim deleted rows space

2011-01-05 Thread Jonathan Ellis
Pretty sure there's logic in there that says "don't bother compacting
a single sstable."

On Wed, Jan 5, 2011 at 2:26 PM, shimi  wrote:
> How is minor compaction triggered? Is it triggered only when a new
> SSTable is added?
>
> I was wondering if triggering a compaction with minimumCompactionThreshold
> set to 1 would be useful. If this can happen I assume it will do compaction
> on files with similar size and remove deleted rows on the rest.
> Shimi
> On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller 
> wrote:
>>
>> > I don't have a problem with disk space. I have a problem with the data
>> > size.
>>
>> [snip]
>>
>> > Bottom line is that I want to reduce the number of requests that go to
>> > disk. Since there is enough data that is no longer valid I can do it by
>> > reclaiming the space. The only way to do it is by running Major
>> > compaction.
>> > I can wait and let Cassandra do it for me but then the data size will
>> > get
>> > even bigger and the response time will be worse. I can do it manually
>> > but I
>> > prefer it to happen in the background with less impact on the system
>>
>> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>>
>> So essentially, for workloads that are teetering on the edge of cache
>> warmness and are subject to significant overwrites or removals, it may
>> be beneficial to perform much more aggressive background compaction
>> even though it might waste lots of CPU, to keep the in-memory working
>> set down.
>>
>> There was talk (I think in the compaction redesign ticket) about
>> potentially improving the use of bloom filters such that obsolete data
>> in sstables could be eliminated from the read set without
>> necessitating actual compaction; that might help address cases like
>> these too.
>>
>> I don't think there's a pre-existing silver bullet in a current
>> release; you probably have to live with the need for
>> greater-than-theoretically-optimal memory requirements to keep the
>> working set in memory.
>>
>> --
>> / Peter Schuller
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Jonathan Ellis
Thanks for organizing this, Mubarak!

A little more detail -- I'll explain the new features in Cassandra 0.7
including column time-to-live, columnfamily truncation, and secondary
indexes, as well as some of the features that have been backported to
recent 0.6 releases (aka Why You Should Upgrade Yesterday). The focus
will primarily be on how these affect application design, but we'll
also touch on operational considerations.

I'm excited to meet everyone!  I hear there will be pizza, too. :)

On Wed, Jan 5, 2011 at 1:31 PM, Mubarak Seyed  wrote:
> We are hosting a Cassandra meetup in BayArea. Jonathan will give a talk on
> Cassandra 0.7
>
> The link to the meetup page is at
> http://www.meetup.com/Cassandra-User-Group-Meeting/
>
> Thanks,
> Mubarak
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Reclaim deleted rows space

2011-01-05 Thread shimi
How is minor compaction triggered? Is it triggered only when a new
SSTable is added?

I was wondering if triggering a compaction with minimumCompactionThreshold
set to 1 would be useful. If this can happen I assume it will do compaction
on files with similar size and remove deleted rows on the rest.

Shimi

On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
wrote:

> > I don't have a problem with disk space. I have a problem with the data
> > size.
>
> [snip]
>
> > Bottom line is that I want to reduce the number of requests that go to
> > disk. Since there is enough data that is no longer valid I can do it by
> > reclaiming the space. The only way to do it is by running Major
> compaction.
> > I can wait and let Cassandra do it for me but then the data size will get
> > even bigger and the response time will be worse. I can do it manually but
> I
> > prefer it to happen in the background with less impact on the system
>
> Ok - that makes perfect sense then. Sorry for misunderstanding :)
>
> So essentially, for workloads that are teetering on the edge of cache
> warmness and are subject to significant overwrites or removals, it may
> be beneficial to perform much more aggressive background compaction
> even though it might waste lots of CPU, to keep the in-memory working
> set down.
>
> There was talk (I think in the compaction redesign ticket) about
> potentially improving the use of bloom filters such that obsolete data
> in sstables could be eliminated from the read set without
> necessitating actual compaction; that might help address cases like
> these too.
>
> I don't think there's a pre-existing silver bullet in a current
> release; you probably have to live with the need for
> greater-than-theoretically-optimal memory requirements to keep the
> working set in memory.
>
> --
> / Peter Schuller
>


pig cassandra contribution

2011-01-05 Thread felix gao
I am having a problem running the cassandra_loadfunc.jar on my build of
cassandra.
PIG_CLASSPATH=:bin/../build/cassandra_loadfunc.jar::bin/../../..//lib/antlr-3.1.3.jar:bin/../../..//lib/avro-1.2.0-dev.jar:bin/../../..//lib/clhm-production.jar:bin/../../..//lib/commons-cli-1.1.jar:bin/../../..//lib/commons-codec-1.2.jar:bin/../../..//lib/commons-collections-3.2.1.jar:bin/../../..//lib/commons-lang-2.4.jar:bin/../../..//lib/google-collections-1.0.jar:bin/../../..//lib/hadoop-core-0.20.1.jar:bin/../../..//lib/high-scale-lib.jar:bin/../../..//lib/jackson-core-asl-1.4.0.jar:bin/../../..//lib/jackson-mapper-asl-1.4.0.jar:bin/../../..//lib/jline-0.9.94.jar:bin/../../..//lib/json-simple-1.1.jar:bin/../../..//lib/libthrift.jar:bin/../../..//lib/log4j-1.2.14.jar:bin/../../..//lib/slf4j-api-1.5.8.jar:bin/../../..//lib/slf4j-log4j12-1.5.8.jar:bin/../../..//lib/spymemcached-2.4.2.jar:bin/../../..//lib/zapcat-1.2.jar:bin/../../..//build/lib/jars/ant-1.6.5.jar:bin/../../..//build/lib/jars/apache-rat-0.6.jar:bin/../../..//build/lib/jars/apache-rat-core-0.6.jar:bin/../../..//build/lib/jars/apache-rat-tasks-0.6.jar:bin/../../..//build/lib/jars/asm-3.2.jar:bin/../../..//build/lib/jars/avalon-framework-4.1.3.jar:bin/../../..//build/lib/jars/commons-cli-1.1.jar:bin/../../..//build/lib/jars/commons-collections-3.2.jar:bin/../../..//build/lib/jars/commons-lang-2.1.jar:bin/../../..//build/lib/jars/commons-logging-1.1.1.jar:bin/../../..//build/lib/jars/junit-4.6.jar:bin/../../..//build/lib/jars/log4j-1.2.12.jar:bin/../../..//build/lib/jars/logkit-1.0.1.jar:bin/../../..//build/lib/jars/paranamer-ant-2.1.jar:bin/../../..//build/lib/jars/paranamer-generator-2.1.jar:bin/../../..//build/lib/jars/qdox-1.10.jar:bin/../../..//build/lib/jars/servlet-api-2.3.jar:bin/../../..//build/apache-cassandra-0.6.4.jar:bin/../../..//build/ivy-2.1.0.jar:/usr/local/pig-0.7.0/pig.jar

In Grunt I registered the jars again, just in case they were not picked up
from the classpath:
register /usr/local/pig-0.7.0/pig.jar; register
/home/felix/cassandra/lib/libthrift.jar; register
/home/felix/cassandra/contrib/pig/build/cassandra_loadfunc.jar
grunt> rows = LOAD 'cassandra://test.data' USING CassandraStorge();

  2011-01-05 13:50:50,071 [main] ERROR
org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve
CassandraStorge using imports: [org.apache.cassandra.hadoop.pig., ,
org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /home/felix/cassandra/contrib/pig/pig_1294257032719.log


the log file contains

Pig Stack Trace
---
ERROR 1070: Could not resolve CassandraStorge using imports:
[org.apache.cassandra.hadoop.pig., , org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]

java.lang.RuntimeException: Cannot instantiate:CassandraStorge
at
org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:455)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5087)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1434)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
at
org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070:
Could not resolve CassandraStorge using imports:
[org.apache.cassandra.hadoop.pig., , org.apache.pig.builtin.,
org.apache.pig.impl.builtin.]
at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:440)
at
org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:452)
... 15 more

Running hadoop 0.20.2 with pig0.7.0 and have to use cassandra 0.6.4.

Thanks,

Felix
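
One detail worth flagging in the session above: the LOAD statement spells the
class CassandraStorge, while the contrib loader is named CassandraStorage,
which is also the spelling used in the follow-up message in this thread. That
mismatch alone explains ERROR 1070 ("Could not resolve CassandraStorge"). A
corrected session might look like the following sketch; the keyspace URI is
taken from the later message and may need adjusting for your schema:

```pig
-- assumes the contrib jar was registered as in the original session
register /home/felix/cassandra/contrib/pig/build/cassandra_loadfunc.jar;
rows = LOAD 'cassandra://test/data' USING CassandraStorage();
```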


Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I see. Thanks for clarifying, Jonathan.

On Wednesday, January 5, 2011, Jonathan Ellis  wrote:
> 1676 says "Avoid dropping messages off the client request path."
> Bootstrap messages are "off the client request path."  So, if some of
> the nodes involved were loaded enough that they were dropping messages
> older than RPC_TIMEOUT to cope, it could lose part of the bootstrap
> communication permanently.
>
> On Wed, Jan 5, 2011 at 10:01 AM, Ran Tavory  wrote:
>> OK, thanks, so I see we had the same problem (I too had multiple keyspaces,
>> not that I know why it matters to the problem at hand) and I see that by
>> upgrading to 0.6.7 you solved your problem (I didn't try it; I had a
>> different workaround), but frankly, I don't understand how
>> https://issues.apache.org/jira/browse/CASSANDRA-1676 would relate to the
>> "stuck bootstrap" problem (I'm not saying that it doesn't, I'd just like
>> to understand why...)
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

-- 
/Ran


Cassandra Meetup in San Francisco Bay Area

2011-01-05 Thread Mubarak Seyed
We are hosting a Cassandra meetup in BayArea. Jonathan will give a talk on
Cassandra 0.7

The link to the meetup page is at
http://www.meetup.com/Cassandra-User-Group-Meeting/

Thanks,
Mubarak


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Patricio Echagüe
Roshan, just a comment on your solution. The time returned is not a simple
long. It also contains some bits indicating the version.
On the other hand, you are assuming that the same machine is processing your
request and recreating a UUID based on a long you provide. The
clockseqAndNode id will vary if another machine takes care of the request
(referring to your use case).

Is it possible for you to send the UUID to the view? I think that would be
the correct behavior as a simple long does not contain enough information to
recreate the original UUID.

Does it make sense?

On Wed, Jan 5, 2011 at 8:36 AM, Nate McCall  wrote:

> Our original intention when discussing this feature was to have
> back-and-forth conversion from timestamps (we were modelling similar
> functionality in Pycassa). Its lack of inclusion may have just been
> an oversight. We will add this in Hector trunk shortly - thanks for
> the complete code sample.
>
>
>
> On Tue, Jan 4, 2011 at 10:06 PM, Roshan Dawrani 
> wrote:
> > Ok, found the solution - finally! - by applying the opposite of what
> > createTime() does in TimeUUIDUtils. Ideally I would have preferred for
> > this solution to come from the Hector API, so I didn't have to be tied
> > to the private createTime() implementation.
> >
> > 
> > import java.util.UUID;
> > import me.prettyprint.cassandra.utils.TimeUUIDUtils;
> >
> > public class TryHector {
> > public static void main(String[] args) throws Exception {
> > final long NUM_100NS_INTERVALS_SINCE_UUID_EPOCH =
> > 0x01b21dd213814000L;
> >
> > UUID u1 = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
> > final long t1 = u1.timestamp();
> >
> > long tmp = (t1 - NUM_100NS_INTERVALS_SINCE_UUID_EPOCH) / 10000;
> >
> > UUID u2 = TimeUUIDUtils.getTimeUUID(tmp);
> > long t2 = u2.timestamp();
> >
> > System.out.println(u2.equals(u1));
> > System.out.println(t2 == t1);
> > }
> >
> > }
> >  
> >
> >
> > On Wed, Jan 5, 2011 at 8:15 AM, Roshan Dawrani 
> > wrote:
> >>
> >> If I use com.eaio.uuid.UUID directly, then I am able to do what I need
> >> (attached a Java program for the same), but unfortunately I need to deal
> >> with java.util.UUID in my application and I don't have its equivalent
> >> com.eaio.uuid.UUID at the point where I need the timestamp value.
> >>
> >> Any suggestion on how I can achieve the equivalent using Hector
> library's
> >> TimeUUIDUtils?
> >>
> >> On Wed, Jan 5, 2011 at 7:21 AM, Roshan Dawrani  >
> >> wrote:
> >>>
> >>> Hi Victor / Patricio,
> >>>
> >>> I have been using Hector library's TimeUUIDUtils. I also just looked at
> >>> TimeUUIDUtilsTest also but didn't find anything similar being tested
> there.
> >>>
> >>> Here is what I am trying and it's not working - I am creating a Time
> >>> UUID, extracting its timestamp value and with that I create another
> Time
> >>> UUID and I am expecting both time UUIDs to have the same timestamp()
> value -
> >>> am I doing / expecting something wrong here?:
> >>>
> >>> ===
> >>> import java.util.UUID;
> >>> import me.prettyprint.cassandra.utils.TimeUUIDUtils;
> >>>
> >>> public class TryHector {
> >>> public static void main(String[] args) throws Exception {
> >>> UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
> >>> long timestamp1 = someUUID.timestamp();
> >>>
> >>> UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
> >>> long timestamp2 = otherUUID.timestamp();
> >>>
> >>> System.out.println(timestamp1);
> >>> System.out.println(timestamp2);
> >>> }
> >>> }
> >>> ===
> >>>
> >>> I have to create the timestamp() equivalent of my time UUIDs so I can
> >>> send it to my UI client, for which it will be simpler to compare "long"
> >>> timestamp than comparing UUIDs. Then for the "long" timestamp chosen by
> the
> >>> client, I need to re-create the equivalent time UUID and go and filter
> the
> >>> data from Cassandra database.
> >>>
> >>> --
> >>> Roshan
> >>> Blog: http://roshandawrani.wordpress.com/
> >>> Twitter: @roshandawrani
> >>> Skype: roshandawrani
> >>>
> >>> On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon
> >>>  wrote:
> 
>  Hi Roshan,
 Sorry, I misunderstood your problem. It is weird that it doesn't work; it
 works for me...
 As Patricio pointed out, use Hector's "standard" way of creating TimeUUIDs
 and tell us if it still doesn't work.
 Maybe you can paste here some of the code you use to query your columns
 too.
> 
>  Victor K.
>  http://www.voxnucleus.fr
> 
>  2011/1/4 Patricio Echagüe 
> >
> In the Hector framework, take a look at TimeUUIDUtils.java
> >
> > You can create a UUID using   TimeUUIDUtils.getTimeUUI

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
> I know that there's a limit, and I just assumed that the CLI set it to 100,
> until I saw more than 100 results.

Ooh, sorry. Didn't read carefully enough. Not sure why you see that
behavior. Sounds strange; that shouldn't be possible at the Thrift level,
AFAIK.

-- 
/ Peter Schuller


Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread David Boxenhorn
I know that there's a limit, and I just assumed that the CLI set it to 100,
until I saw more than 100 results.

On Wed, Jan 5, 2011 at 6:56 PM, Peter Schuller
wrote:

> > The CLI sometimes gets only 100 results (even though there are more) -
> and
> > sometimes gets all the results, even when there are more than 100!
> >
> > What is going on here? Is there some logic that says if there are too
> many
> > results return 100, even though "too many" can be more than 100?
>
> API calls have a limit since streaming is not supported and you could
> potentially have almost arbitrarily large result sets. I believe
> cassandra-cli will allow you to set the limit if you look at the
> 'help' output and look for the word 'limit'.
>
> The way to iterate over large amounts of data is to do paging, with
> multiple queries.
>
> --
> / Peter Schuller
>


Re: Question about replication

2011-01-05 Thread Jonathan Ellis
No.

On Wed, Jan 5, 2011 at 10:38 AM, Mayuresh Kulkarni  wrote:
>
> Hello,
>
> Is it possible to set the replication factor to some kind of "ALL" setting
> so that all data gets replicated to all nodes and if a new node is
> dynamically added to the cluster, the current nodes replicate their data to
> it?
>
> Thanks,
> Mayuresh
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Peter Schuller
> 1. Some way to send requests for keys whose token falls between 0-25 to B and
> never to C even though C will have the data due to it being replica of B.

If your data set is large, be mindful of the fact that this will cause
C to be completely cold in terms of caches. I.e., when B does go down,
C will take lots of iops.

-- 
/ Peter Schuller


Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
> The CLI sometimes gets only 100 results (even though there are more) - and
> sometimes gets all the results, even when there are more than 100!
>
> What is going on here? Is there some logic that says if there are too many
> results return 100, even though "too many" can be more than 100?

API calls have a limit since streaming is not supported and you could
potentially have almost arbitrarily large result sets. I believe
cassandra-cli will allow you to set the limit if you look at the
'help' output and look for the word 'limit'.

The way to iterate over large amounts of data is to do paging, with
multiple queries.
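The paging Peter describes is the usual "restart the range at the last key seen" loop: fetch up to N rows, use the last key of one page as the (inclusive) start of the next, and drop the duplicated boundary row. A self-contained sketch, with a TreeMap standing in for the cluster (with a real client the fetch would be a get_range_slices-style range query; all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PagingSketch {
    // Stand-in for one range query: up to `limit` keys >= `start`, in order.
    static List<String> fetchPage(NavigableMap<String, String> db, String start, int limit) {
        List<String> page = new ArrayList<>();
        for (String k : db.tailMap(start, true).keySet()) {
            if (page.size() == limit) break;
            page.add(k);
        }
        return page;
    }

    // Page through the whole range, restarting each query at the last key
    // of the previous page and dropping the repeated boundary row.
    static List<String> pageAll(NavigableMap<String, String> db, int pageSize) {
        List<String> all = new ArrayList<>();
        String start = "";      // empty start = beginning of the range
        boolean first = true;
        while (true) {
            List<String> page = fetchPage(db, start, pageSize);
            int fetched = page.size();
            if (!first && !page.isEmpty()) page.remove(0); // boundary row repeats
            all.addAll(page);
            if (fetched < pageSize || page.isEmpty()) break; // short page: done
            start = all.get(all.size() - 1);
            first = false;
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> db = new TreeMap<>();
        for (int i = 0; i < 250; i++) db.put(String.format("key%04d", i), "v");
        List<String> all = pageAll(db, 100); // 100 is the CLI's default limit
        System.out.println(all.size());      // prints 250
    }
}
```

With an order-preserving partitioner this walks keys in key order; with a random partitioner the same loop works but the iteration order is token order, not key order.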

-- 
/ Peter Schuller


Question about replication

2011-01-05 Thread Mayuresh Kulkarni


Hello,

Is it possible to set the replication factor to some kind of "ALL" setting 
so that all data gets replicated to all nodes and if a new node is 
dynamically added to the cluster, the current nodes replicate their data 
to it?


Thanks,
Mayuresh


Re: Converting a TimeUUID to a long (timestamp) and vice-versa

2011-01-05 Thread Nate McCall
Our original intention in discussing this feature was to have
back-and-forth conversion from timestamps (we were modelling similar
functionality in Pycassa). Its lack of inclusion may have just been
an oversight. We will add this in Hector trunk shortly - thanks for
the complete code sample.
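The arithmetic Roshan reverse-engineered is an offset-and-scale round trip. A sketch, assuming (as his code does) that Hector's private createTime() stores milliseconds as 100-nanosecond intervals past the UUID epoch, which is what the 0x01b21dd213814000L constant implies:

```java
public class UuidTimeMath {
    // Number of 100-ns intervals between the UUID epoch (1582-10-15)
    // and the Unix epoch (1970-01-01).
    static final long GREGORIAN_OFFSET = 0x01b21dd213814000L;

    // ms since Unix epoch -> 100-ns intervals since UUID epoch (forward).
    static long msToUuidTime(long ms) { return ms * 10000 + GREGORIAN_OFFSET; }

    // The inverse that the thread asks the library to provide.
    static long uuidTimeToMs(long t)  { return (t - GREGORIAN_OFFSET) / 10000; }

    public static void main(String[] args) {
        long ms = System.currentTimeMillis();
        if (uuidTimeToMs(msToUuidTime(ms)) != ms) throw new AssertionError();
        System.out.println("round trip ok");
    }
}
```

The round trip is exact because the forward direction multiplies before adding the offset, so no precision is lost going back.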



On Tue, Jan 4, 2011 at 10:06 PM, Roshan Dawrani  wrote:
> Ok, found the solution - finally! - by applying the opposite of what
> createTime() does in TimeUUIDUtils. Ideally I would have preferred for this
> solution to come from Hector API, so I didn't have to be tied to the private
> createTime() implementation.
>
> 
> import java.util.UUID;
> import me.prettyprint.cassandra.utils.TimeUUIDUtils;
>
> public class TryHector {
>     public static void main(String[] args) throws Exception {
>         final long NUM_100NS_INTERVALS_SINCE_UUID_EPOCH =
> 0x01b21dd213814000L;
>
>         UUID u1 = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>         final long t1 = u1.timestamp();
>
>         long tmp = (t1 - NUM_100NS_INTERVALS_SINCE_UUID_EPOCH) / 10000;
>
>         UUID u2 = TimeUUIDUtils.getTimeUUID(tmp);
>         long t2 = u2.timestamp();
>
>         System.out.println(u2.equals(u1));
>         System.out.println(t2 == t1);
>     }
>
> }
>  
>
>
> On Wed, Jan 5, 2011 at 8:15 AM, Roshan Dawrani 
> wrote:
>>
>> If I use com.eaio.uuid.UUID directly, then I am able to do what I need
>> (attached a Java program for the same), but unfortunately I need to deal
>> with java.util.UUID in my application and I don't have its equivalent
>> com.eaio.uuid.UUID at the point where I need the timestamp value.
>>
>> Any suggestion on how I can achieve the equivalent using Hector library's
>> TimeUUIDUtils?
>>
>> On Wed, Jan 5, 2011 at 7:21 AM, Roshan Dawrani 
>> wrote:
>>>
>>> Hi Victor / Patricio,
>>>
>>> I have been using Hector library's TimeUUIDUtils. I also just looked at
>>> TimeUUIDUtilsTest also but didn't find anything similar being tested there.
>>>
>>> Here is what I am trying and it's not working - I am creating a Time
>>> UUID, extracting its timestamp value and with that I create another Time
>>> UUID and I am expecting both time UUIDs to have the same timestamp() value -
>>> am I doing / expecting something wrong here?:
>>>
>>> ===
>>> import java.util.UUID;
>>> import me.prettyprint.cassandra.utils.TimeUUIDUtils;
>>>
>>> public class TryHector {
>>>     public static void main(String[] args) throws Exception {
>>>         UUID someUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
>>>         long timestamp1 = someUUID.timestamp();
>>>
>>>         UUID otherUUID = TimeUUIDUtils.getTimeUUID(timestamp1);
>>>         long timestamp2 = otherUUID.timestamp();
>>>
>>>         System.out.println(timestamp1);
>>>         System.out.println(timestamp2);
>>>     }
>>> }
>>> ===
>>>
>>> I have to create the timestamp() equivalent of my time UUIDs so I can
>>> send it to my UI client, for which it will be simpler to compare "long"
>>> timestamp than comparing UUIDs. Then for the "long" timestamp chosen by the
>>> client, I need to re-create the equivalent time UUID and go and filter the
>>> data from Cassandra database.
>>>
>>> --
>>> Roshan
>>> Blog: http://roshandawrani.wordpress.com/
>>> Twitter: @roshandawrani
>>> Skype: roshandawrani
>>>
>>> On Wed, Jan 5, 2011 at 1:32 AM, Victor Kabdebon
>>>  wrote:

 Hi Roshan,
 Sorry, I misunderstood your problem. It is weird that it doesn't work; it
 works for me...
 As Patricio pointed out, use Hector's "standard" way of creating TimeUUIDs
 and tell us if it still doesn't work.
 Maybe you can paste here some of the code you use to query your columns
 too.

 Victor K.
 http://www.voxnucleus.fr

 2011/1/4 Patricio Echagüe 
>
> In the Hector framework, take a look at TimeUUIDUtils.java
>
> You can create a UUID using   TimeUUIDUtils.getTimeUUID(long time); or
> TimeUUIDUtils.getTimeUUID(ClockResolution clock)
>
> and later on, TimeUUIDUtils.getTimeFromUUID(..) or just
> UUID.timestamp();
>
> There are some example in TimeUUIDUtilsTest.java
>
> Let me know if it helps.
>
>
>
> On Tue, Jan 4, 2011 at 10:27 AM, Roshan Dawrani
>  wrote:
>>
>> Hello Victor,
>>
>> It is actually not that I need the 2 UUIDs to be exactly same - they
>> need to be same timestamp wise.
>>
>> So, what I need is to extract the timestamp portion from a time UUID
>> (say, U1) and then later in the cycle, use the same long timestamp value 
>> to
>> re-create a UUID (say, U2) that is equivalent of the previous one in 
>> terms
>> of its timestamp portion - i.e., I should be able to give this U2 and 
>> filter
>> the data from a 

Re: Bootstrapping taking long

2011-01-05 Thread Jonathan Ellis
1676 says "Avoid dropping messages off the client request path."
Bootstrap messages are "off the client request path."  So, if some of
the nodes involved were loaded enough that they were dropping messages
older than RPC_TIMEOUT to cope, it could lose part of the bootstrap
communication permanently.

On Wed, Jan 5, 2011 at 10:01 AM, Ran Tavory  wrote:
> OK, thanks, so I see we had the same problem (I too had multiple keyspaces,
> not that I know why it matters to the problem at hand) and I see that by
> upgrading to 0.6.7 you solved your problem (I didn't try it, had a different
> workaround) but frankly, I don't understand
> how https://issues.apache.org/jira/browse/CASSANDRA-1676 would relate to
> the "stuck bootstrap" problem (I'm not saying that it isn't, I'd just like
> to understand why...)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
OK, thanks, so I see we had the same problem (I too had multiple keyspaces,
not that I know why it matters to the problem at hand) and I see that by
upgrading to 0.6.7 you solved your problem (I didn't try it, had a different
workaround) but frankly, I don't understand how
https://issues.apache.org/jira/browse/CASSANDRA-1676 would relate to the
"stuck bootstrap" problem (I'm not saying that it isn't, I'd just like to
understand why...)


On Wed, Jan 5, 2011 at 5:42 PM, Thibaut Britz  wrote:

> Had the same problem a while ago. Upgrading solved the problem (don't know
> if you have to redeploy your cluster though)
>
> http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html
>
>
>
> On Wed, Jan 5, 2011 at 4:29 PM, Ran Tavory  wrote:
>
>> @Thibaut wrong email? Or how's "Avoid dropping messages off the client
>> request path" (CASSANDRA-1676) related to the bootstrap questions I had?
>>
>>
>> On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz <
>> thibaut.br...@trendiction.com> wrote:
>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-1676
>>>
>>> you have to use at least 0.6.7
>>>
>>>
>>>
>>> On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo 
>>> wrote:
>>>
 On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory  wrote:
 > In storage-conf I see this comment [1] from which I understand that
 the
 > recommended way to bootstrap a new node is to set AutoBootstrap=true
 and
 > remove itself from the seeds list.
 > Moreover, I did try to set AutoBootstrap=true and have the node in its
 own
 > seeds list, but it would not bootstrap. I don't recall the exact
 message but
 > it was something like "I found myself in the seeds list therefore I'm
 not
 > going to bootstrap even though AutoBootstrap is true".
 >
 > [1]
 >   
 >   <AutoBootstrap>false</AutoBootstrap>
 > On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn 
 wrote:
 >>
 >> If "seed list should be the same across the cluster" that means that
 nodes
 >> *should* have themselves as a seed. If that doesn't work for Ran,
 then that
 >> is the first problem, no?
 >>
 >>
 >> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani 
 wrote:
 >>>
 >>> Well your ring issues don't make sense to me, seed list should be
 the
 >>> same across the cluster.
 >>> I'm just thinking of other things to try, non-bootstrapped nodes
 should
 >>> join the ring instantly but reads will fail if you aren't using
 quorum.
 >>>
 >>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory 
 wrote:
 
  I haven't tried repair.  Should I?
 
  On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
  > Have you tried not bootstrapping but setting the token and
 manually
  > calling
  > repair?
  >
  > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory 
 wrote:
  >
  >> My conclusion is lame: I tried this on several hosts and saw the
 same
  >> behavior, the only way I was able to join new nodes was to first
  >> start them
  >> when they are *not in* their own seeds list and after they
  >> finish transferring the data, then restart them with themselves
 *in*
  >> their
  >> own seeds list. After doing that the node would join the ring.
  >> This is either my misunderstanding or a bug, but the only place
 I
  >> found it
  >> documented stated that the new node should not be in its own
 seeds
  >> list.
  >> Version 0.6.6.
  >>
  >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn
  >> wrote:
  >>
  >>> My nodes all have themselves in their list of seeds - always
 did -
  >>> and
  >>> everything works. (You may ask why I did this. I don't know, I
 must
  >>> have
  >>> copied it from an example somewhere.)
  >>>
  >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory 
 wrote:
  >>>
   I was able to make the node join the ring but I'm confused.
   What I did is, first when adding the node, this node was not
 in the
   seeds
   list of itself. AFAIK this is how it's supposed to be. So it
 was
   able to
   transfer all data to itself from other nodes but then it
 stayed in
   the
   bootstrapping state.
   So what I did (and I don't know why it works), is add this
 node to
   the
   seeds list in its own storage-conf.xml file. Then restart the
   server and
   then I finally see it in the ring...
   If I had added the node to the seeds list of itself when first
   joining
   it, it would not join the ring but if I do it in two phases it
 did
   work.
   So it's either my misunderstanding

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
Had the same problem a while ago. Upgrading solved the problem (don't know
if you have to redeploy your cluster though)

http://www.mail-archive.com/user@cassandra.apache.org/msg07106.html


On Wed, Jan 5, 2011 at 4:29 PM, Ran Tavory  wrote:

> @Thibaut wrong email? Or how's "Avoid dropping messages off the client
> request path" (CASSANDRA-1676) related to the bootstrap questions I had?
>
>
> On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz <
> thibaut.br...@trendiction.com> wrote:
>
>> https://issues.apache.org/jira/browse/CASSANDRA-1676
>>
>> you have to use at least 0.6.7
>>
>>
>>
>> On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo wrote:
>>
>>> On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory  wrote:
>>> > In storage-conf I see this comment [1] from which I understand that the
>>> > recommended way to bootstrap a new node is to set AutoBootstrap=true
>>> and
>>> > remove itself from the seeds list.
>>> > Moreover, I did try to set AutoBootstrap=true and have the node in its
>>> own
>>> > seeds list, but it would not bootstrap. I don't recall the exact
>>> message but
>>> > it was something like "I found myself in the seeds list therefore I'm
>>> not
>>> > going to bootstrap even though AutoBootstrap is true".
>>> >
>>> > [1]
>>> >   
>>> >   <AutoBootstrap>false</AutoBootstrap>
>>> > On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn 
>>> wrote:
>>> >>
>>> >> If "seed list should be the same across the cluster" that means that
>>> nodes
>>> >> *should* have themselves as a seed. If that doesn't work for Ran, then
>>> that
>>> >> is the first problem, no?
>>> >>
>>> >>
>>> >> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani 
>>> wrote:
>>> >>>
>>> >>> Well your ring issues don't make sense to me, seed list should be the
>>> >>> same across the cluster.
>>> >>> I'm just thinking of other things to try, non-bootstrapped nodes
>>> should
>>> >>> join the ring instantly but reads will fail if you aren't using
>>> quorum.
>>> >>>
>>> >>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:
>>> 
>>>  I haven't tried repair.  Should I?
>>> 
>>>  On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
>>>  > Have you tried not bootstrapping but setting the token and
>>> manually
>>>  > calling
>>>  > repair?
>>>  >
>>>  > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory 
>>> wrote:
>>>  >
>>>  >> My conclusion is lame: I tried this on several hosts and saw the
>>> same
>>>  >> behavior, the only way I was able to join new nodes was to first
>>>  >> start them
>>>  >> when they are *not in* their own seeds list and after they
>>>  >> finish transferring the data, then restart them with themselves
>>> *in*
>>>  >> their
>>>  >> own seeds list. After doing that the node would join the ring.
>>>  >> This is either my misunderstanding or a bug, but the only place I
>>>  >> found it
>>>  >> documented stated that the new node should not be in its own
>>> seeds
>>>  >> list.
>>>  >> Version 0.6.6.
>>>  >>
>>>  >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn
>>>  >> wrote:
>>>  >>
>>>  >>> My nodes all have themselves in their list of seeds - always did
>>> -
>>>  >>> and
>>>  >>> everything works. (You may ask why I did this. I don't know, I
>>> must
>>>  >>> have
>>>  >>> copied it from an example somewhere.)
>>>  >>>
>>>  >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory 
>>> wrote:
>>>  >>>
>>>   I was able to make the node join the ring but I'm confused.
>>>   What I did is, first when adding the node, this node was not in
>>> the
>>>   seeds
>>>   list of itself. AFAIK this is how it's supposed to be. So it
>>> was
>>>   able to
>>>   transfer all data to itself from other nodes but then it stayed
>>> in
>>>   the
>>>   bootstrapping state.
>>>   So what I did (and I don't know why it works), is add this node
>>> to
>>>   the
>>>   seeds list in its own storage-conf.xml file. Then restart the
>>>   server and
>>>   then I finally see it in the ring...
>>>   If I had added the node to the seeds list of itself when first
>>>   joining
>>>   it, it would not join the ring but if I do it in two phases it
>>> did
>>>   work.
>>>   So it's either my misunderstanding or a bug...
>>>  
>>>  
>>>   On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory 
>>>   wrote:
>>>  
>>>  > The new node does not see itself as part of the ring, it sees
>>> all
>>>  > others
>>>  > but itself, so from that perspective the view is consistent.
>>>  > The only problem is that the node never finishes to bootstrap.
>>> It
>>>  > stays
>>>  > in this state for hours (It's been 20 hours now...)
>>>  >
>>>  >
>>>  > $ bin/nodetool -p 9004 -h localhost streams
>>>  >> Mode: Bootstrapping
>>>  >> Not sending any stream

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
@Thibaut wrong email? Or how's "Avoid dropping messages off the client
request path" (CASSANDRA-1676) related to the bootstrap questions I had?

On Wed, Jan 5, 2011 at 5:23 PM, Thibaut Britz  wrote:

> https://issues.apache.org/jira/browse/CASSANDRA-1676
>
> you have to use at least 0.6.7
>
>
>
> On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo wrote:
>
>> On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory  wrote:
>> > In storage-conf I see this comment [1] from which I understand that the
>> > recommended way to bootstrap a new node is to set AutoBootstrap=true and
>> > remove itself from the seeds list.
>> > Moreover, I did try to set AutoBootstrap=true and have the node in its
>> own
>> > seeds list, but it would not bootstrap. I don't recall the exact message
>> but
>> > it was something like "I found myself in the seeds list therefore I'm
>> not
>> > going to bootstrap even though AutoBootstrap is true".
>> >
>> > [1]
>> >   
>> >   <AutoBootstrap>false</AutoBootstrap>
>> > On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn 
>> wrote:
>> >>
>> >> If "seed list should be the same across the cluster" that means that
>> nodes
>> >> *should* have themselves as a seed. If that doesn't work for Ran, then
>> that
>> >> is the first problem, no?
>> >>
>> >>
>> >> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani  wrote:
>> >>>
>> >>> Well your ring issues don't make sense to me, seed list should be the
>> >>> same across the cluster.
>> >>> I'm just thinking of other things to try, non-bootstrapped nodes should
>> >>> join the ring instantly but reads will fail if you aren't using
>> quorum.
>> >>>
>> >>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:
>> 
>>  I haven't tried repair.  Should I?
>> 
>>  On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
>>  > Have you tried not bootstrapping but setting the token and manually
>>  > calling
>>  > repair?
>>  >
>>  > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory 
>> wrote:
>>  >
>>  >> My conclusion is lame: I tried this on several hosts and saw the
>> same
>>  >> behavior, the only way I was able to join new nodes was to first
>>  >> start them
>>  >> when they are *not in* their own seeds list and after they
>>  >> finish transferring the data, then restart them with themselves
>> *in*
>>  >> their
>>  >> own seeds list. After doing that the node would join the ring.
>>  >> This is either my misunderstanding or a bug, but the only place I
>>  >> found it
>>  >> documented stated that the new node should not be in its own seeds
>>  >> list.
>>  >> Version 0.6.6.
>>  >>
>>  >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn
>>  >> wrote:
>>  >>
>>  >>> My nodes all have themselves in their list of seeds - always did
>> -
>>  >>> and
>>  >>> everything works. (You may ask why I did this. I don't know, I
>> must
>>  >>> have
>>  >>> copied it from an example somewhere.)
>>  >>>
>>  >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory 
>> wrote:
>>  >>>
>>   I was able to make the node join the ring but I'm confused.
>>   What I did is, first when adding the node, this node was not in
>> the
>>   seeds
>>   list of itself. AFAIK this is how it's supposed to be. So it was
>>   able to
>>   transfer all data to itself from other nodes but then it stayed
>> in
>>   the
>>   bootstrapping state.
>>   So what I did (and I don't know why it works), is add this node
>> to
>>   the
>>   seeds list in its own storage-conf.xml file. Then restart the
>>   server and
>>   then I finally see it in the ring...
>>   If I had added the node to the seeds list of itself when first
>>   joining
>>   it, it would not join the ring but if I do it in two phases it
>> did
>>   work.
>>   So it's either my misunderstanding or a bug...
>>  
>>  
>>   On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory 
>>   wrote:
>>  
>>  > The new node does not see itself as part of the ring, it sees
>> all
>>  > others
>>  > but itself, so from that perspective the view is consistent.
>>  > The only problem is that the node never finishes to bootstrap.
>> It
>>  > stays
>>  > in this state for hours (It's been 20 hours now...)
>>  >
>>  >
>>  > $ bin/nodetool -p 9004 -h localhost streams
>>  >> Mode: Bootstrapping
>>  >> Not sending any streams.
>>  >> Not receiving any streams.
>>  >
>>  >
>>  > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
>>  > wrote:
>>  >
>>  >> Does the new node have itself in the list of seeds per chance?
>>  >> This
>>  >> could cause some issues if so.
>>  >>
>>  >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
>>  >> wrote:
>>  >> > I'm still

Re: Bootstrapping taking long

2011-01-05 Thread Thibaut Britz
https://issues.apache.org/jira/browse/CASSANDRA-1676

you have to use at least 0.6.7


On Wed, Jan 5, 2011 at 4:19 PM, Edward Capriolo wrote:

> On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory  wrote:
> > In storage-conf I see this comment [1] from which I understand that the
> > recommended way to bootstrap a new node is to set AutoBootstrap=true and
> > remove itself from the seeds list.
> > Moreover, I did try to set AutoBootstrap=true and have the node in its
> own
> > seeds list, but it would not bootstrap. I don't recall the exact message
> but
> > it was something like "I found myself in the seeds list therefore I'm not
> > going to bootstrap even though AutoBootstrap is true".
> >
> > [1]
> >   
> >   <AutoBootstrap>false</AutoBootstrap>
> > On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn 
> wrote:
> >>
> >> If "seed list should be the same across the cluster" that means that
> nodes
> >> *should* have themselves as a seed. If that doesn't work for Ran, then
> that
> >> is the first problem, no?
> >>
> >>
> >> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani  wrote:
> >>>
> >>> Well your ring issues don't make sense to me, seed list should be the
> >>> same across the cluster.
> >>> I'm just thinking of other things to try, non-bootstrapped nodes should
> >>> join the ring instantly but reads will fail if you aren't using quorum.
> >>>
> >>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:
> 
>  I haven't tried repair.  Should I?
> 
>  On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
>  > Have you tried not bootstrapping but setting the token and manually
>  > calling
>  > repair?
>  >
>  > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory 
> wrote:
>  >
>  >> My conclusion is lame: I tried this on several hosts and saw the
> same
>  >> behavior, the only way I was able to join new nodes was to first
>  >> start them
>  >> when they are *not in* their own seeds list and after they
>  >> finish transferring the data, then restart them with themselves
> *in*
>  >> their
>  >> own seeds list. After doing that the node would join the ring.
>  >> This is either my misunderstanding or a bug, but the only place I
>  >> found it
>  >> documented stated that the new node should not be in its own seeds
>  >> list.
>  >> Version 0.6.6.
>  >>
>  >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn
>  >> wrote:
>  >>
>  >>> My nodes all have themselves in their list of seeds - always did -
>  >>> and
>  >>> everything works. (You may ask why I did this. I don't know, I
> must
>  >>> have
>  >>> copied it from an example somewhere.)
>  >>>
>  >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory 
> wrote:
>  >>>
>   I was able to make the node join the ring but I'm confused.
>   What I did is, first when adding the node, this node was not in
> the
>   seeds
>   list of itself. AFAIK this is how it's supposed to be. So it was
>   able to
>   transfer all data to itself from other nodes but then it stayed
> in
>   the
>   bootstrapping state.
>   So what I did (and I don't know why it works), is add this node
> to
>   the
>   seeds list in its own storage-conf.xml file. Then restart the
>   server and
>   then I finally see it in the ring...
>   If I had added the node to the seeds list of itself when first
>   joining
>   it, it would not join the ring but if I do it in two phases it
> did
>   work.
>   So it's either my misunderstanding or a bug...
>  
>  
>   On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory 
>   wrote:
>  
>  > The new node does not see itself as part of the ring, it sees
> all
>  > others
>  > but itself, so from that perspective the view is consistent.
>  > The only problem is that the node never finishes to bootstrap.
> It
>  > stays
>  > in this state for hours (It's been 20 hours now...)
>  >
>  >
>  > $ bin/nodetool -p 9004 -h localhost streams
>  >> Mode: Bootstrapping
>  >> Not sending any streams.
>  >> Not receiving any streams.
>  >
>  >
>  > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
>  > wrote:
>  >
>  >> Does the new node have itself in the list of seeds per chance?
>  >> This
>  >> could cause some issues if so.
>  >>
>  >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
>  >> wrote:
>  >> > I'm still at lost. I haven't been able to resolve this. I
> tried
>  >> > adding another node at a different location on the ring but
>  >> > this node
>  >> > too remains stuck in the bootstrapping state for many hours
>  >> > without
>  >> > any of the other nodes being busy with anti compaction or
>  >

Re: Bootstrapping taking long

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 10:05 AM, Ran Tavory  wrote:
> In storage-conf I see this comment [1] from which I understand that the
> recommended way to bootstrap a new node is to set AutoBootstrap=true and
> remove itself from the seeds list.
> Moreover, I did try to set AutoBootstrap=true and have the node in its own
> seeds list, but it would not bootstrap. I don't recall the exact message but
> it was something like "I found myself in the seeds list therefore I'm not
> going to bootstrap even though AutoBootstrap is true".
>
> [1]
>   
  <AutoBootstrap>false</AutoBootstrap>
> On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn  wrote:
>>
>> If "seed list should be the same across the cluster" that means that nodes
>> *should* have themselves as a seed. If that doesn't work for Ran, then that
>> is the first problem, no?
>>
>>
>> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani  wrote:
>>>
>>> Well your ring issues don't make sense to me, seed list should be the
>>> same across the cluster.
>>> I'm just thinking of other things to try, non-bootstrapped nodes should
>>> join the ring instantly but reads will fail if you aren't using quorum.
>>>
>>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:

 I haven't tried repair.  Should I?

 On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
 > Have you tried not bootstrapping but setting the token and manually
 > calling
 > repair?
 >
 > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:
 >
 >> My conclusion is lame: I tried this on several hosts and saw the same
 >> behavior, the only way I was able to join new nodes was to first
 >> start them
 >> when they are *not in* their own seeds list and after they
 >> finish transferring the data, then restart them with themselves *in*
 >> their
 >> own seeds list. After doing that the node would join the ring.
 >> This is either my misunderstanding or a bug, but the only place I
 >> found it
 >> documented stated that the new node should not be in its own seeds
 >> list.
 >> Version 0.6.6.
 >>
 >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn
 >> wrote:
 >>
 >>> My nodes all have themselves in their list of seeds - always did -
 >>> and
 >>> everything works. (You may ask why I did this. I don't know, I must
 >>> have
 >>> copied it from an example somewhere.)
 >>>
 >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
 >>>
  I was able to make the node join the ring but I'm confused.
  What I did is, first when adding the node, this node was not in the
  seeds
  list of itself. AFAIK this is how it's supposed to be. So it was
  able to
  transfer all data to itself from other nodes but then it stayed in
  the
  bootstrapping state.
  So what I did (and I don't know why it works), is add this node to
  the
  seeds list in its own storage-conf.xml file. Then restart the
  server and
  then I finally see it in the ring...
  If I had added the node to the seeds list of itself when first
  joining
  it, it would not join the ring but if I do it in two phases it did
  work.
  So it's either my misunderstanding or a bug...
 
 
  On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory 
  wrote:
 
 > The new node does not see itself as part of the ring, it sees all
 > others
 > but itself, so from that perspective the view is consistent.
 > The only problem is that the node never finishes to bootstrap. It
 > stays
 > in this state for hours (It's been 20 hours now...)
 >
 >
 > $ bin/nodetool -p 9004 -h localhost streams
 >> Mode: Bootstrapping
 >> Not sending any streams.
 >> Not receiving any streams.
 >
 >
 > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
 > wrote:
 >
 >> Does the new node have itself in the list of seeds per chance?
 >> This
 >> could cause some issues if so.
 >>
 >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
 >> wrote:
 >> > I'm still at lost. I haven't been able to resolve this. I tried
 >> > adding another node at a different location on the ring but
 >> > this node
 >> > too remains stuck in the bootstrapping state for many hours
 >> > without
 >> > any of the other nodes being busy with anti compaction or
 >> > anything
 >> > else. I don't know what's keeping it from finishing the
 >> > bootstrap,no
 >> > CPU, no io, files were already streamed so what is it waiting
 >> > for?
 >> > I read the release notes of 0.6.7 and 0.6.8 and there didn't
 >> > seem to
 >> > be anything addressing a similar issue so I figured there was
 >> > no
 >

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
In storage-conf I see this comment [1] from which I understand that the
recommended way to bootstrap a new node is to set AutoBootstrap=true and
remove itself from the seeds list.
Moreover, I did try to set AutoBootstrap=true and have the node in its own
seeds list, but it would not bootstrap. I don't recall the exact message but
it was something like "I found myself in the seeds list therefore I'm not
going to bootstrap even though AutoBootstrap is true".

[1]
  <AutoBootstrap>false</AutoBootstrap>
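For reference, a minimal sketch of the 0.6-era storage-conf.xml fragment under discussion (the seed address is a placeholder, and the real file carries an explanatory comment and many other settings):

```xml
<!-- New non-seed nodes stream their ranges from the cluster at startup. -->
<AutoBootstrap>true</AutoBootstrap>
<Seeds>
    <!-- An existing cluster node; per the cited comment, not this node itself. -->
    <Seed>10.0.0.1</Seed>
</Seeds>
```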

On Wed, Jan 5, 2011 at 4:58 PM, David Boxenhorn  wrote:

> If "seed list should be the same across the cluster" that means that nodes
> *should* have themselves as a seed. If that doesn't work for Ran, then that
> is the first problem, no?
>
>
> On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani  wrote:
>
>> Well your ring issues don't make sense to me, seed list should be the same
>> across the cluster.
>> I'm just thinking of other things to try, non-boostrapped nodes should
>> join the ring instantly but reads will fail if you aren't using quorum.
>>
>>
>> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:
>>
>>> I haven't tried repair.  Should I?
>>> On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
>>> > Have you tried not bootstrapping but setting the token and manually
>>> calling
>>> > repair?
>>> >
>>> > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:
>>> >
>>> >> My conclusion is lame: I tried this on several hosts and saw the same
>>> >> behavior, the only way I was able to join new nodes was to first start
>>> them
>>> >> when they are *not in* their own seeds list and after they
>>> >> finish transferring the data, then restart them with themselves *in*
>>> their
>>> >> own seeds list. After doing that the node would join the ring.
>>> >> This is either my misunderstanding or a bug, but the only place I
>>> found it
>>> >> documented stated that the new node should not be in its own seeds
>>> list.
>>> >> Version 0.6.6.
>>> >>
>>> >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
>>> >>
>>> >>> My nodes all have themselves in their list of seeds - always did -
>>> and
>>> >>> everything works. (You may ask why I did this. I don't know, I must
>>> have
>>> >>> copied it from an example somewhere.)
>>> >>>
>>> >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>>> >>>
>>>  I was able to make the node join the ring but I'm confused.
>>>  What I did is, first when adding the node, this node was not in the
>>> seeds
>>>  list of itself. AFAIK this is how it's supposed to be. So it was
>>> able to
>>>  transfer all data to itself from other nodes but then it stayed in
>>> the
>>>  bootstrapping state.
>>>  So what I did (and I don't know why it works), is add this node to
>>> the
>>>  seeds list in its own storage-conf.xml file. Then restart the server
>>> and
>>>  then I finally see it in the ring...
>>>  If I had added the node to the seeds list of itself when first
>>> joining
>>>  it, it would not join the ring but if I do it in two phases it did
>>> work.
>>>  So it's either my misunderstanding or a bug...
>>> 
>>> 
>>>  On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory 
>>> wrote:
>>> 
>>> > The new node does not see itself as part of the ring, it sees all
>>> others
>>> > but itself, so from that perspective the view is consistent.
>>> > The only problem is that the node never finishes to bootstrap. It
>>> stays
>>> > in this state for hours (It's been 20 hours now...)
>>> >
>>> >
>>> > $ bin/nodetool -p 9004 -h localhost streams
>>> >> Mode: Bootstrapping
>>> >> Not sending any streams.
>>> >> Not receiving any streams.
>>> >
>>> >
>>> > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
>>> wrote:
>>> >
>>> >> Does the new node have itself in the list of seeds per chance?
>>> This
>>> >> could cause some issues if so.
>>> >>
>>> >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
>>> wrote:
>>> >> > I'm still at lost. I haven't been able to resolve this. I tried
>>> >> > adding another node at a different location on the ring but this
>>> node
>>> >> > too remains stuck in the bootstrapping state for many hours
>>> without
>>> >> > any of the other nodes being busy with anti compaction or
>>> anything
>>> >> > else. I don't know what's keeping it from finishing the
>>> bootstrap,no
>>> >> > CPU, no io, files were already streamed so what is it waiting
>>> for?
>>> >> > I read the release notes of 0.6.7 and 0.6.8 and there didn't
>>> seem to
>>> >> > be anything addressing a similar issue so I figured there was no
>>> >> point
>>> >> > in upgrading. But let me know if you think there is.
>>> >> > Or any other advice...
>>> >> >
>>> >> > On Tuesday, January 4, 2011, Ran Tavory 
>>> wrote:
>>> >> >> Thanks Jake, but unfortunately the streams directory is empty
>>> so I
>>> >> don't think that any of the nodes is anti-compacting data right
>>> 

Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
If "seed list should be the same across the cluster" that means that nodes
*should* have themselves as a seed. If that doesn't work for Ran, then that
is the first problem, no?


On Wed, Jan 5, 2011 at 3:56 PM, Jake Luciani  wrote:

> Well your ring issues don't make sense to me, seed list should be the same
> across the cluster.
> I'm just thinking of other things to try, non-boostrapped nodes should join
> the ring instantly but reads will fail if you aren't using quorum.
>
>
> On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:
>
>> I haven't tried repair.  Should I?
>> On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
>> > Have you tried not bootstrapping but setting the token and manually
>> calling
>> > repair?
>> >
>> > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:
>> >
>> >> My conclusion is lame: I tried this on several hosts and saw the same
>> >> behavior, the only way I was able to join new nodes was to first start
>> them
>> >> when they are *not in* their own seeds list and after they
>> >> finish transferring the data, then restart them with themselves *in*
>> their
>> >> own seeds list. After doing that the node would join the ring.
>> >> This is either my misunderstanding or a bug, but the only place I found
>> it
>> >> documented stated that the new node should not be in its own seeds
>> list.
>> >> Version 0.6.6.
>> >>
>> >>> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
>> >>
>> >>> My nodes all have themselves in their list of seeds - always did - and
>> >>> everything works. (You may ask why I did this. I don't know, I must
>> have
>> >>> copied it from an example somewhere.)
>> >>>
>> >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>> >>>
>>  I was able to make the node join the ring but I'm confused.
>>  What I did is, first when adding the node, this node was not in the
>> seeds
>>  list of itself. AFAIK this is how it's supposed to be. So it was able
>> to
>>  transfer all data to itself from other nodes but then it stayed in
>> the
>>  bootstrapping state.
>>  So what I did (and I don't know why it works), is add this node to
>> the
>>  seeds list in its own storage-conf.xml file. Then restart the server
>> and
>>  then I finally see it in the ring...
>>  If I had added the node to the seeds list of itself when first
>> joining
>>  it, it would not join the ring but if I do it in two phases it did
>> work.
>>  So it's either my misunderstanding or a bug...
>> 
>> 
>>  On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
>> 
>> > The new node does not see itself as part of the ring, it sees all
>> others
>> > but itself, so from that perspective the view is consistent.
>> > The only problem is that the node never finishes to bootstrap. It
>> stays
>> > in this state for hours (It's been 20 hours now...)
>> >
>> >
>> > $ bin/nodetool -p 9004 -h localhost streams
>> >> Mode: Bootstrapping
>> >> Not sending any streams.
>> >> Not receiving any streams.
>> >
>> >
>> > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
>> wrote:
>> >
>> >> Does the new node have itself in the list of seeds per chance? This
>> >> could cause some issues if so.
>> >>
>> >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
>> wrote:
>> >> > I'm still at lost. I haven't been able to resolve this. I tried
>> >> > adding another node at a different location on the ring but this
>> node
>> >> > too remains stuck in the bootstrapping state for many hours
>> without
>> >> > any of the other nodes being busy with anti compaction or
>> anything
>> >> > else. I don't know what's keeping it from finishing the
>> bootstrap,no
>> >> > CPU, no io, files were already streamed so what is it waiting
>> for?
>> >> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem
>> to
>> >> > be anything addressing a similar issue so I figured there was no
>> >> point
>> >> > in upgrading. But let me know if you think there is.
>> >> > Or any other advice...
>> >> >
>> >> > On Tuesday, January 4, 2011, Ran Tavory 
>> wrote:
>> >> >> Thanks Jake, but unfortunately the streams directory is empty so
>> I
>> >> don't think that any of the nodes is anti-compacting data right now
>> or had
>> >> been in the past 5 hours. It seems that all the data was already
>> transferred
>> >> to the joining host but the joining node, after having received the
>> data
>> >> would still remain in bootstrapping mode and not join the cluster.
>> I'm not
>> >> sure that *all* data was transferred (perhaps other nodes need to
>> transfer
>> >> more data) but nothing is actually happening so I assume all has
>> been moved.
>> >> >> Perhaps it's a configuration error from my part. Should I use I
>> use
>> >> AutoBootstrap=true ? Anything else I should look out for in the
>> >> configuration file or something else?
>> >>>

Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Edward Capriolo
On Wed, Jan 5, 2011 at 9:52 AM, Jonathan Ellis  wrote:
> It's normal for Cassandra to use more disk space than MySQL.  It's
> part of what we trade for not having to rewrite every row when you add
> a new column.
>
> "SSTables that are obsoleted by a compaction are deleted
> asynchronously when the JVM performs a GC."
> http://wiki.apache.org/cassandra/MemtableSSTable
>
> On Wed, Jan 5, 2011 at 8:35 AM, nicolas lattuada
>  wrote:
>> Hi
>>
>> i have some data size issues:
>>
>> i am storing super columns with the following content:
>>
>> {a=>1, b=>2, c=>3...n=>14}
>>
>> i am storing it 300 000 times and i have a data size on the disk about 283Mo
>>
>> And in other side i have a mysql table which stores a bunch of data the
>> schema follows:
>> 6 varchars +100
>> 5 ints +6
>>
>> I put about 1 300 000 records on it and end up with 150Mo of data and 57Mo
>> of index.
>>
>> Then i think i am certainly doing something wrong...
>>
>> The other thing is when i run flush and then compact the size of my data
>> increases, then i imagine something is copied up on compaction
>> So is there a way to remove the unused data? (cleanup doesn t seem to do the
>> job).
>>
>> Any help to reduce the size of the data would be greatly apreciated!
>> Greetings
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Unlike datastores with delimited or fixed-size columns, Cassandra stores each
row as a sorted map of columns, where a column is a tuple of
{columnname, columnvalue, timestamp}. The data is also not stored as tersely
as it is inside MySQL.
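To make that overhead concrete, here is a rough back-of-the-envelope sketch in Python. The per-column field sizes (2-byte name length, 1-byte deletion flag, 8-byte timestamp, 4-byte value length) are illustrative assumptions, not the exact 0.6 serialization format:

```python
# Rough estimate of per-column on-disk size for a Cassandra-style column.
# Field sizes here are assumptions for illustration: 2-byte name length
# + name, 1-byte deletion flag, 8-byte timestamp, 4-byte value length + value.
def column_bytes(name: bytes, value: bytes) -> int:
    return 2 + len(name) + 1 + 8 + 4 + len(value)

# One super column row like {a=>1, b=>2, ... n=>14}: 14 tiny columns.
names = [bytes([c]) for c in b"abcdefghijklmn"]
values = [str(i + 1).encode() for i in range(14)]

payload = sum(len(n) + len(v) for n, v in zip(names, values))
on_disk = sum(column_bytes(n, v) for n, v in zip(names, values))

# With values this small, the column metadata dwarfs the actual data,
# which is why the same payload occupies far more space than in MySQL.
print(f"payload={payload}B, estimated on disk={on_disk}B per row")
```

Multiplied by 300,000 rows, even this conservative estimate puts tens of megabytes into per-column metadata alone, before indexes and not-yet-collected obsolete SSTables are counted.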


Re: The size of the data, I must be doing smth wrong....

2011-01-05 Thread Jonathan Ellis
It's normal for Cassandra to use more disk space than MySQL.  It's
part of what we trade for not having to rewrite every row when you add
a new column.

"SSTables that are obsoleted by a compaction are deleted
asynchronously when the JVM performs a GC."
http://wiki.apache.org/cassandra/MemtableSSTable

On Wed, Jan 5, 2011 at 8:35 AM, nicolas lattuada
 wrote:
> Hi
>
> i have some data size issues:
>
> i am storing super columns with the following content:
>
> {a=>1, b=>2, c=>3...n=>14}
>
> i am storing it 300 000 times and i have a data size on the disk about 283Mo
>
> And in other side i have a mysql table which stores a bunch of data the
> schema follows:
> 6 varchars +100
> 5 ints +6
>
> I put about 1 300 000 records on it and end up with 150Mo of data and 57Mo
> of index.
>
> Then i think i am certainly doing something wrong...
>
> The other thing is when i run flush and then compact the size of my data
> increases, then i imagine something is copied up on compaction
> So is there a way to remove the unused data? (cleanup doesn t seem to do the
> job).
>
> Any help to reduce the size of the data would be greatly apreciated!
> Greetings
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Jonathan Ellis
On Wed, Jan 5, 2011 at 3:37 AM, Narendra Sharma
 wrote:
> What I am looking for is:
> 1. Some way to send requests for keys whose token fall between 0-25 to B and
> never to C even though C will have the data due to it being replica of B.
> 2. Only when B is down or not reachable, the request should go to C.
> 3. Once the requests start going to C, they should continue unless C is down
> and in which case the requests should then go to B.
>
> My understanding is that SimpleSnitch should fit here except for the
> enforcing #3 above.

Right, with the caveat that you'll probably want to set the dynamic
snitch badness threshold to allow switching to B even if C merely gets
overloaded rather than completely down.  The alternative is disabling
the dynamic snitch entirely.
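As a sketch of where these knobs live (the option names below are from memory of later 0.7/0.8-era cassandra.yaml files and should be checked against your version; the values are illustrative):

```yaml
# SimpleSnitch is the default endpoint snitch.
endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch

# Set to false to disable the dynamic snitch entirely, or...
dynamic_snitch: true

# ...raise the badness threshold: 0.0 always routes to the best-scored
# replica, while a higher value keeps requests pinned to the preferred
# replica until its score degrades past the threshold.
dynamic_snitch_badness_threshold: 0.1
```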

> will SimpleSnitch come into
> picture if the request from client reaches node C directly?

Yes.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


The size of the data, I must be doing smth wrong....

2011-01-05 Thread nicolas lattuada

Hi,

I have some data size issues.

I am storing super columns with the following content:

{a=>1, b=>2, c=>3...n=>14}

I am storing this 300,000 times and end up with about 283 MB of data on disk.

On the other side, I have a MySQL table which stores a bunch of data; the schema
follows:
6 varchars +100
5 ints +6

I put about 1,300,000 records in it and end up with 150 MB of data and 57 MB of
index.

So I think I am certainly doing something wrong...

The other thing is that when I run flush and then compact, the size of my data
increases, so I imagine something is copied during compaction.
Is there a way to remove the unused data? (cleanup doesn't seem to do the job.)

Any help reducing the size of the data would be greatly appreciated!
Greetings

  

The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread David Boxenhorn
The CLI sometimes gets only 100 results (even though there are more) - and
sometimes gets all the results, even when there are more than 100!

What is going on here? Is there some logic that says if there are too many
results return 100, even though "too many" can be more than 100?


Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Well, your ring issues don't make sense to me; the seed list should be the same
across the cluster.
I'm just thinking of other things to try. Non-bootstrapped nodes should join
the ring instantly, but reads will fail if you aren't using quorum.


On Wed, Jan 5, 2011 at 8:51 AM, Ran Tavory  wrote:

> I haven't tried repair.  Should I?
> On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
> > Have you tried not bootstrapping but setting the token and manually
> calling
> > repair?
> >
> > On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:
> >
> >> My conclusion is lame: I tried this on several hosts and saw the same
> >> behavior, the only way I was able to join new nodes was to first start
> them
> >> when they are *not in* their own seeds list and after they
> >> finish transferring the data, then restart them with themselves *in*
> their
> >> own seeds list. After doing that the node would join the ring.
> >> This is either my misunderstanding or a bug, but the only place I found
> it
> >> documented stated that the new node should not be in its own seeds list.
> >> Version 0.6.6.
> >>
> >> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
> >>
> >>> My nodes all have themselves in their list of seeds - always did - and
> >>> everything works. (You may ask why I did this. I don't know, I must
> have
> >>> copied it from an example somewhere.)
> >>>
> >>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
> >>>
>  I was able to make the node join the ring but I'm confused.
>  What I did is, first when adding the node, this node was not in the
> seeds
>  list of itself. AFAIK this is how it's supposed to be. So it was able
> to
>  transfer all data to itself from other nodes but then it stayed in the
>  bootstrapping state.
>  So what I did (and I don't know why it works), is add this node to the
>  seeds list in its own storage-conf.xml file. Then restart the server
> and
>  then I finally see it in the ring...
>  If I had added the node to the seeds list of itself when first joining
>  it, it would not join the ring but if I do it in two phases it did
> work.
>  So it's either my misunderstanding or a bug...
> 
> 
>  On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
> 
> > The new node does not see itself as part of the ring, it sees all
> others
> > but itself, so from that perspective the view is consistent.
> > The only problem is that the node never finishes to bootstrap. It
> stays
> > in this state for hours (It's been 20 hours now...)
> >
> >
> > $ bin/nodetool -p 9004 -h localhost streams
> >> Mode: Bootstrapping
> >> Not sending any streams.
> >> Not receiving any streams.
> >
> >
> > On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall 
> wrote:
> >
> >> Does the new node have itself in the list of seeds per chance? This
> >> could cause some issues if so.
> >>
> >> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory 
> wrote:
> >> > I'm still at lost. I haven't been able to resolve this. I tried
> >> > adding another node at a different location on the ring but this
> node
> >> > too remains stuck in the bootstrapping state for many hours
> without
> >> > any of the other nodes being busy with anti compaction or anything
> >> > else. I don't know what's keeping it from finishing the
> bootstrap,no
> >> > CPU, no io, files were already streamed so what is it waiting for?
> >> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem
> to
> >> > be anything addressing a similar issue so I figured there was no
> >> point
> >> > in upgrading. But let me know if you think there is.
> >> > Or any other advice...
> >> >
> >> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
> >> >> Thanks Jake, but unfortunately the streams directory is empty so
> I
> >> don't think that any of the nodes is anti-compacting data right now
> or had
> >> been in the past 5 hours. It seems that all the data was already
> transferred
> >> to the joining host but the joining node, after having received the
> data
> >> would still remain in bootstrapping mode and not join the cluster.
> I'm not
> >> sure that *all* data was transferred (perhaps other nodes need to
> transfer
> >> more data) but nothing is actually happening so I assume all has
> been moved.
> >> >> Perhaps it's a configuration error from my part. Should I use I
> use
> >> AutoBootstrap=true ? Anything else I should look out for in the
> >> configuration file or something else?
> >> >>
> >> >>
> >> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani 
> >> wrote:
> >> >>
> >> >> In 0.6, locate the node doing anti-compaction and look in the
> >> "streams" subdirectory in the keyspace data dir to monitor the
> >> anti-compaction progress (it puts new SSTables for bootstrapping
> node in
> >> there)
> >> >>
> >> >>

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
I haven't tried repair.  Should I?
On Jan 5, 2011 3:48 PM, "Jake Luciani"  wrote:
> Have you tried not bootstrapping but setting the token and manually
calling
> repair?
>
> On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:
>
>> My conclusion is lame: I tried this on several hosts and saw the same
>> behavior, the only way I was able to join new nodes was to first start
them
>> when they are *not in* their own seeds list and after they
>> finish transferring the data, then restart them with themselves *in*
their
>> own seeds list. After doing that the node would join the ring.
>> This is either my misunderstanding or a bug, but the only place I found
it
>> documented stated that the new node should not be in its own seeds list.
>> Version 0.6.6.
>>
>> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
>>
>>> My nodes all have themselves in their list of seeds - always did - and
>>> everything works. (You may ask why I did this. I don't know, I must have
>>> copied it from an example somewhere.)
>>>
>>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>>>
 I was able to make the node join the ring but I'm confused.
 What I did is, first when adding the node, this node was not in the
seeds
 list of itself. AFAIK this is how it's supposed to be. So it was able
to
 transfer all data to itself from other nodes but then it stayed in the
 bootstrapping state.
 So what I did (and I don't know why it works), is add this node to the
 seeds list in its own storage-conf.xml file. Then restart the server
and
 then I finally see it in the ring...
 If I had added the node to the seeds list of itself when first joining
 it, it would not join the ring but if I do it in two phases it did
work.
 So it's either my misunderstanding or a bug...


 On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:

> The new node does not see itself as part of the ring, it sees all
others
> but itself, so from that perspective the view is consistent.
> The only problem is that the node never finishes to bootstrap. It
stays
> in this state for hours (It's been 20 hours now...)
>
>
> $ bin/nodetool -p 9004 -h localhost streams
>> Mode: Bootstrapping
>> Not sending any streams.
>> Not receiving any streams.
>
>
> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:
>
>> Does the new node have itself in the list of seeds per chance? This
>> could cause some issues if so.
>>
>> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
>> > I'm still at lost. I haven't been able to resolve this. I tried
>> > adding another node at a different location on the ring but this
node
>> > too remains stuck in the bootstrapping state for many hours without
>> > any of the other nodes being busy with anti compaction or anything
>> > else. I don't know what's keeping it from finishing the
bootstrap,no
>> > CPU, no io, files were already streamed so what is it waiting for?
>> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem
to
>> > be anything addressing a similar issue so I figured there was no
>> point
>> > in upgrading. But let me know if you think there is.
>> > Or any other advice...
>> >
>> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
>> >> Thanks Jake, but unfortunately the streams directory is empty so I
>> don't think that any of the nodes is anti-compacting data right now
or had
>> been in the past 5 hours. It seems that all the data was already
transferred
>> to the joining host but the joining node, after having received the
data
>> would still remain in bootstrapping mode and not join the cluster.
I'm not
>> sure that *all* data was transferred (perhaps other nodes need to
transfer
>> more data) but nothing is actually happening so I assume all has been
moved.
>> >> Perhaps it's a configuration error from my part. Should I use I
use
>> AutoBootstrap=true ? Anything else I should look out for in the
>> configuration file or something else?
>> >>
>> >>
>> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani 
>> wrote:
>> >>
>> >> In 0.6, locate the node doing anti-compaction and look in the
>> "streams" subdirectory in the keyspace data dir to monitor the
>> anti-compaction progress (it puts new SSTables for bootstrapping node
in
>> there)
>> >>
>> >>
>> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory 
>> wrote:
>> >>
>> >>
>> >> Running nodetool decommission didn't help. Actually the node
refused
>> to decommission itself (b/c it wasn't part of the ring). So I simply
stopped
>> the process, deleted all the data directories and started it again.
It
>> worked in the sense of the node bootstrapped again but as before,
after it
>> had finished moving the data nothing happened for a long time (I'm
still
>> waiting, but nothing seems to

Re: Bootstrapping taking long

2011-01-05 Thread Jake Luciani
Have you tried not bootstrapping but setting the token and manually calling
repair?

On Wed, Jan 5, 2011 at 7:07 AM, Ran Tavory  wrote:

> My conclusion is lame: I tried this on several hosts and saw the same
> behavior, the only way I was able to join new nodes was to first start them
> when they are *not in* their own seeds list and after they
> finish transferring the data, then restart them with themselves *in* their
> own seeds list. After doing that the node would join the ring.
> This is either my misunderstanding or a bug, but the only place I found it
> documented stated that the new node should not be in its own seeds list.
> Version 0.6.6.
>
> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
>
>> My nodes all have themselves in their list of seeds - always did - and
>> everything works. (You may ask why I did this. I don't know, I must have
>> copied it from an example somewhere.)
>>
>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>>
>>> I was able to make the node join the ring but I'm confused.
>>> What I did is, first when adding the node, this node was not in the seeds
>>> list of itself. AFAIK this is how it's supposed to be. So it was able to
>>> transfer all data to itself from other nodes but then it stayed in the
>>> bootstrapping state.
>>> So what I did (and I don't know why it works), is add this node to the
>>> seeds list in its own storage-conf.xml file. Then restart the server and
>>> then I finally see it in the ring...
>>> If I had added the node to the seeds list of itself when first joining
>>> it, it would not join the ring but if I do it in two phases it did work.
>>> So it's either my misunderstanding or a bug...
>>>
>>>
>>> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
>>>
 The new node does not see itself as part of the ring, it sees all others
 but itself, so from that perspective the view is consistent.
 The only problem is that the node never finishes to bootstrap. It stays
 in this state for hours (It's been 20 hours now...)


 $ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
I started all my nodes the first time with seeds in their own lists, and it
worked. I think I started them in 0.6.1, but I'm not sure. (I'm now using
0.6.8).


On Wed, Jan 5, 2011 at 2:07 PM, Ran Tavory  wrote:

> My conclusion is lame: I tried this on several hosts and saw the same
> behavior. The only way I was able to join new nodes was to first start them
> *without* themselves in their own seeds list and, after they
> finish transferring the data, restart them with themselves *in* their
> own seeds list. After doing that, the node would join the ring.
> This is either my misunderstanding or a bug, but the only place I found it
> documented stated that the new node should not be in its own seeds list.
> Version 0.6.6.
>
> On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn wrote:
>
>> My nodes all have themselves in their list of seeds - always did - and
>> everything works. (You may ask why I did this. I don't know, I must have
>> copied it from an example somewhere.)
>>
>> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>>
>>> I was able to make the node join the ring but I'm confused.
>>> What I did is: at first, when adding the node, it was not in its own seeds
>>> list. AFAIK this is how it's supposed to be. So it was able to
>>> transfer all data to itself from other nodes but then it stayed in the
>>> bootstrapping state.
>>> So what I did (and I don't know why it works), is add this node to the
>>> seeds list in its own storage-conf.xml file. Then restart the server and
>>> then I finally see it in the ring...
>>> If I had added the node to its own seeds list when first joining
>>> it, it would not join the ring, but doing it in two phases did work.
>>> So it's either my misunderstanding or a bug...
>>>
>>>
>>> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
>>>
 The new node does not see itself as part of the ring, it sees all others
 but itself, so from that perspective the view is consistent.
 The only problem is that the node never finishes bootstrapping. It stays
 in this state for hours (it's been 20 hours now...)


 $ bin/nodetool -p 9004 -h localhost streams
> Mode: Bootstrapping
> Not sending any streams.
> Not receiving any streams.


 On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:

> Does the new node have itself in the list of seeds per chance? This
> could cause some issues if so.
>
> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
> > I'm still at a loss. I haven't been able to resolve this. I tried
> > adding another node at a different location on the ring but this node
> > too remains stuck in the bootstrapping state for many hours without
> > any of the other nodes being busy with anti compaction or anything
> > else. I don't know what's keeping it from finishing the bootstrap: no
> > CPU, no I/O, and the files were already streamed, so what is it waiting for?
> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
> > be anything addressing a similar issue so I figured there was no
> point
> > in upgrading. But let me know if you think there is.
> > Or any other advice...
> >
> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
> >> Thanks Jake, but unfortunately the streams directory is empty so I
> don't think that any of the nodes is anti-compacting data right now or had
> been in the past 5 hours. It seems that all the data was already 
> transferred
> to the joining host but the joining node, after having received the data
> would still remain in bootstrapping mode and not join the cluster. I'm not
> sure that *all* data was transferred (perhaps other nodes need to transfer
> more data) but nothing is actually happening so I assume all has been 
> moved.
> >> Perhaps it's a configuration error on my part. Should I use
> AutoBootstrap=true? Anything else I should look out for in the
> configuration file or something else?
> >>
> >>
> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani 
> wrote:
> >>
> >> In 0.6, locate the node doing anti-compaction and look in the
> "streams" subdirectory in the keyspace data dir to monitor the
> anti-compaction progress (it puts new SSTables for bootstrapping node in
> there)
> >>
> >>
> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory 
> wrote:
> >>
> >>
> >> Running nodetool decommission didn't help. Actually the node refused
> to decommission itself (b/c it wasn't part of the ring). So I simply 
> stopped
> the process, deleted all the data directories and started it again. It
> worked in the sense of the node bootstrapped again but as before, after it
> had finished moving the data nothing happened for a long time (I'm still
> waiting, but nothing seems to be happening).
> >>
> >>
> >>
> >>
> >> Any hints how to analyze 

Re: Bootstrapping taking long

2011-01-05 Thread Ran Tavory
My conclusion is lame: I tried this on several hosts and saw the same
behavior. The only way I was able to join new nodes was to first start them
*without* themselves in their own seeds list and, after they
finish transferring the data, restart them with themselves *in* their
own seeds list. After doing that, the node would join the ring.
This is either my misunderstanding or a bug, but the only place I found it
documented stated that the new node should not be in its own seeds list.
Version 0.6.6.
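
[Editor's note: for anyone wanting to reproduce the two-phase workaround
described above, the relevant knobs in the 0.6-era storage-conf.xml look
roughly like the sketch below; the host names are placeholders, not values
from this thread.]

```xml
<!-- storage-conf.xml (Cassandra 0.6.x) sketch; host names are placeholders. -->

<!-- Phase 1: bootstrap the new node WITHOUT itself in its own seed list. -->
<AutoBootstrap>true</AutoBootstrap>
<Seeds>
    <Seed>existing-seed.example.com</Seed>
</Seeds>

<!-- Phase 2 (the workaround described above): once the data transfer has
     finished, add the new node to its own seed list and restart it. -->
<Seeds>
    <Seed>existing-seed.example.com</Seed>
    <Seed>new-node.example.com</Seed>
</Seeds>
```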

On Wed, Jan 5, 2011 at 10:35 AM, David Boxenhorn  wrote:

> My nodes all have themselves in their list of seeds - always did - and
> everything works. (You may ask why I did this. I don't know, I must have
> copied it from an example somewhere.)
>
> On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:
>
>> I was able to make the node join the ring but I'm confused.
>> What I did is: at first, when adding the node, it was not in its own seeds
>> list. AFAIK this is how it's supposed to be. So it was able to
>> transfer all data to itself from other nodes but then it stayed in the
>> bootstrapping state.
>> So what I did (and I don't know why it works), is add this node to the
>> seeds list in its own storage-conf.xml file. Then restart the server and
>> then I finally see it in the ring...
>> If I had added the node to its own seeds list when first joining it,
>> it would not join the ring, but doing it in two phases did work.
>> So it's either my misunderstanding or a bug...
>>
>>
>> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
>>
>>> The new node does not see itself as part of the ring, it sees all others
>>> but itself, so from that perspective the view is consistent.
>>> The only problem is that the node never finishes bootstrapping. It stays
>>> in this state for hours (it's been 20 hours now...)
>>>
>>>
>>> $ bin/nodetool -p 9004 -h localhost streams
 Mode: Bootstrapping
 Not sending any streams.
 Not receiving any streams.
>>>
>>>
>>> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:
>>>
 Does the new node have itself in the list of seeds per chance? This
 could cause some issues if so.

 On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
 > I'm still at a loss. I haven't been able to resolve this. I tried
 > adding another node at a different location on the ring but this node
 > too remains stuck in the bootstrapping state for many hours without
 > any of the other nodes being busy with anti compaction or anything
 > else. I don't know what's keeping it from finishing the bootstrap: no
 > CPU, no I/O, and the files were already streamed, so what is it waiting for?
 > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
 > be anything addressing a similar issue so I figured there was no point
 > in upgrading. But let me know if you think there is.
 > Or any other advice...
 >
 > On Tuesday, January 4, 2011, Ran Tavory  wrote:
 >> Thanks Jake, but unfortunately the streams directory is empty so I
 don't think that any of the nodes is anti-compacting data right now or had
 been in the past 5 hours. It seems that all the data was already 
 transferred
 to the joining host but the joining node, after having received the data
 would still remain in bootstrapping mode and not join the cluster. I'm not
 sure that *all* data was transferred (perhaps other nodes need to transfer
 more data) but nothing is actually happening so I assume all has been 
 moved.
 >> Perhaps it's a configuration error on my part. Should I use
 AutoBootstrap=true? Anything else I should look out for in the
 configuration file or something else?
 >>
 >>
 >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani 
 wrote:
 >>
 >> In 0.6, locate the node doing anti-compaction and look in the
 "streams" subdirectory in the keyspace data dir to monitor the
 anti-compaction progress (it puts new SSTables for bootstrapping node in
 there)
 >>
 >>
 >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
 >>
 >>
 >> Running nodetool decommission didn't help. Actually the node refused
 to decommission itself (b/c it wasn't part of the ring). So I simply 
 stopped
 the process, deleted all the data directories and started it again. It
 worked in the sense of the node bootstrapped again but as before, after it
 had finished moving the data nothing happened for a long time (I'm still
 waiting, but nothing seems to be happening).
 >>
 >>
 >>
 >>
 >> Any hints on how to analyze a "stuck" bootstrapping node? Thanks.
 >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
 >> Thanks Shimi, so indeed anticompaction was run on one of the other
 nodes from the same DC but to my understanding it has already ended. A few
 hours ago...
 >>
 >>
 >>
 >> I see plenty of log messages such as

Cassandra 0.7 - Query on network topology

2011-01-05 Thread Narendra Sharma
Hi,

We are working on defining the ring topology for our cluster. One of the
plans under discussion is to have RF=2 and perform read/write operations
with CL=ONE. I know this could be an issue since it doesn't satisfy R+W >
RF. This will work if we can always force the clients to go to the first
node responsible for the data and go to its replica only if the first node
is not available.

For example, assume the ring is as follows, with a token space from 0 to 100,
and that the random partitioner is used.
A[0] -> B[25] -> C[50] -> D[75]

So, going by SimpleStrategy, the replica for A will be B, for B it will be C,
and so on.

What I am looking for is:
1. Some way to send requests for keys whose tokens fall between 0 and 25 to B,
and never to C, even though C will have the data due to being the replica of B.
2. Only when B is down or not reachable, the request should go to C.
3. Once the requests start going to C, they should continue unless C is down
in which case the requests should then go back to B.

My understanding is that SimpleSnitch should fit here, except for enforcing
#3 above. I should be able to extend SimpleSnitch to add the behavior
described in #3. The question is: will SimpleSnitch come into the picture if
the request from the client reaches node C directly? If yes, will it proxy
the request to B? I don't want C to serve requests for keys in the range
0-25 unless B is down.


One way to make this feasible is to do the key -> token -> node mapping in
my client, but this might get messy, as I will have to keep track of the
nodes in the ring, etc.
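
[Editor's note: as a rough, hypothetical sketch of the client-side
key -> token -> node mapping discussed above. The toy 0-100 token space and
node names A-D come from the example ring; a real client would hash into
RandomPartitioner's full 2**127 token space and track ring membership changes.]

```python
import hashlib
from bisect import bisect_left

# Toy ring from the example: A[0] -> B[25] -> C[50] -> D[75].
# Tokens and node names are illustrative, not a real cluster.
RING = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]
TOKEN_SPACE = 100  # RandomPartitioner actually uses 0..2**127

def key_to_token(key: str) -> int:
    # RandomPartitioner hashes keys with MD5; reduce into the toy space here.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % TOKEN_SPACE

def node_for_token(token: int) -> str:
    # A node with token T owns the range (previous_token, T], so the owner
    # is the first node whose token is >= the key's token, wrapping around.
    tokens = [t for t, _ in RING]
    idx = bisect_left(tokens, token) % len(RING)
    return RING[idx][1]

def node_for_key(key: str) -> str:
    return node_for_token(key_to_token(key))
```

The bisect lookup encodes the ownership rule; falling back to the replica
when the primary is unreachable would just mean stepping to the next index
in RING.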

Comments/suggestions are welcome.

Thanks,
Naren


Re: Bootstrapping taking long

2011-01-05 Thread David Boxenhorn
My nodes all have themselves in their list of seeds - always did - and
everything works. (You may ask why I did this. I don't know, I must have
copied it from an example somewhere.)

On Wed, Jan 5, 2011 at 9:42 AM, Ran Tavory  wrote:

> I was able to make the node join the ring but I'm confused.
> What I did is: at first, when adding the node, it was not in its own seeds
> list. AFAIK this is how it's supposed to be. So it was able to
> transfer all data to itself from other nodes but then it stayed in the
> bootstrapping state.
> So what I did (and I don't know why it works), is add this node to the
> seeds list in its own storage-conf.xml file. Then restart the server and
> then I finally see it in the ring...
> If I had added the node to its own seeds list when first joining it,
> it would not join the ring, but doing it in two phases did work.
> So it's either my misunderstanding or a bug...
>
>
> On Wed, Jan 5, 2011 at 7:14 AM, Ran Tavory  wrote:
>
>> The new node does not see itself as part of the ring, it sees all others
>> but itself, so from that perspective the view is consistent.
>> The only problem is that the node never finishes bootstrapping. It stays in
>> this state for hours (it's been 20 hours now...)
>>
>>
>> $ bin/nodetool -p 9004 -h localhost streams
>>> Mode: Bootstrapping
>>> Not sending any streams.
>>> Not receiving any streams.
>>
>>
>> On Wed, Jan 5, 2011 at 1:20 AM, Nate McCall  wrote:
>>
>>> Does the new node have itself in the list of seeds per chance? This
>>> could cause some issues if so.
>>>
>>> On Tue, Jan 4, 2011 at 4:10 PM, Ran Tavory  wrote:
>>> > I'm still at a loss. I haven't been able to resolve this. I tried
>>> > adding another node at a different location on the ring but this node
>>> > too remains stuck in the bootstrapping state for many hours without
>>> > any of the other nodes being busy with anti compaction or anything
>>> > else. I don't know what's keeping it from finishing the bootstrap: no
>>> > CPU, no I/O, and the files were already streamed, so what is it waiting for?
>>> > I read the release notes of 0.6.7 and 0.6.8 and there didn't seem to
>>> > be anything addressing a similar issue so I figured there was no point
>>> > in upgrading. But let me know if you think there is.
>>> > Or any other advice...
>>> >
>>> > On Tuesday, January 4, 2011, Ran Tavory  wrote:
>>> >> Thanks Jake, but unfortunately the streams directory is empty so I
>>> don't think that any of the nodes is anti-compacting data right now or had
>>> been in the past 5 hours. It seems that all the data was already transferred
>>> to the joining host but the joining node, after having received the data
>>> would still remain in bootstrapping mode and not join the cluster. I'm not
>>> sure that *all* data was transferred (perhaps other nodes need to transfer
>>> more data) but nothing is actually happening so I assume all has been moved.
>>> >> Perhaps it's a configuration error on my part. Should I use
>>> AutoBootstrap=true? Anything else I should look out for in the
>>> configuration file or something else?
>>> >>
>>> >>
>>> >> On Tue, Jan 4, 2011 at 4:08 PM, Jake Luciani 
>>> wrote:
>>> >>
>>> >> In 0.6, locate the node doing anti-compaction and look in the
>>> "streams" subdirectory in the keyspace data dir to monitor the
>>> anti-compaction progress (it puts new SSTables for bootstrapping node in
>>> there)
>>> >>
>>> >>
>>> >> On Tue, Jan 4, 2011 at 8:01 AM, Ran Tavory  wrote:
>>> >>
>>> >>
>>> >> Running nodetool decommission didn't help. Actually the node refused
>>> to decommission itself (b/c it wasn't part of the ring). So I simply stopped
>>> the process, deleted all the data directories and started it again. It
>>> worked in the sense of the node bootstrapped again but as before, after it
>>> had finished moving the data nothing happened for a long time (I'm still
>>> waiting, but nothing seems to be happening).
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> Any hints on how to analyze a "stuck" bootstrapping node? Thanks.
>>> >> On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory  wrote:
>>> >> Thanks Shimi, so indeed anticompaction was run on one of the other
>>> nodes from the same DC but to my understanding it has already ended. A few
>>> hours ago...
>>> >>
>>> >>
>>> >>
>>> >> I see plenty of log messages such as [1], which ended a couple of hours
>>> ago, and I've seen the new node streaming and accepting the data from the
>>> node which performed the anticompaction and so far it was normal so it
>>> seemed that data is at its right place. But now the new node seems sort of
>>> stuck. None of the other nodes is anticompacting right now or had been
>>> anticompacting since then.
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> The new node's CPU is close to zero, its iostats are almost zero, so I
>>> can't find another bottleneck that would keep it hanging.
>>> >> On IRC someone suggested I maybe retry joining this node,
>>> e.g. decommission and rejoin it. I'll try it now...
>>> >>