Re: Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-21 Thread aaron morton
> 1) What did I do wrong? Why was cassandra throwing exceptions on first startup?
In 1.0.X the history of schema changes is replayed to a node when it 
rejoins the cluster. If the node receives traffic while the replay is in 
progress, it logs those errors until the schema mutation that created 1012 
has been replayed. 

> 2) Why was the keyspace data invalidated? Is it expected?
The data will have remained on disk. The load is calculated from the 
CFs in the schema, so the reported load may not return to full until 
the schema is fully replayed. 
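
To illustrate why the reported load can drop while the files are still on disk, here is a toy model in Python. It is not how Cassandra computes load internally; the function names, the sizes, and the assumption that SSTable file names start with the CF name followed by '-' are all hypothetical and used for illustration only.

```python
def sstable_cf(filename):
    """Return the CF name from a file name such as 'Users-hc-12-Data.db'.

    Assumption: the CF name is everything before the first '-'.
    """
    return filename.split("-", 1)[0]


def reported_load(files, known_cfs):
    """Sum the sizes of files whose CF is present in the known schema.

    files: mapping of file name -> size in bytes
    known_cfs: set of CF names the node currently has in its schema
    """
    return sum(size for name, size in files.items()
               if sstable_cf(name) in known_cfs)


# Hypothetical files on disk; sizes are made up.
files = {
    "Users-hc-1-Data.db": 8000,
    "Events-hc-3-Data.db": 500,
}

full = reported_load(files, {"Users", "Events"})   # schema fully replayed
partial = reported_load(files, {"Events"})         # 'Users' not replayed yet
empty = reported_load(files, set())                # no schema yet
```

In this model the files never change; only the set of known CFs does, which is enough to make the reported figure shrink and then recover.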

Did you lose data ?

> 3) If the answer to #2 is "yes it's expected" then what's the point in doing 
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement
> then all keyspace data is lost anyway? It makes more sense to just do 
> http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node

The answer to #2 is no. 

Checking, did you delete just the Schema-* and Migration-* files or all of the 
files in data/system?
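
A quick way to check which files a pattern would touch before deleting anything is to split a directory listing by name. The sketch below only classifies names and deletes nothing. The default patterns are a loose reading of the wording above, and the example file names are hypothetical, so check them against what is actually in data/system.

```python
import fnmatch


def split_system_files(names, patterns=("Schema*", "Migration*")):
    """Split file names into (schema_related, other), each sorted.

    patterns: shell-style patterns treated as schema-related (an assumption).
    """
    schema_related, other = [], []
    for name in sorted(names):
        if any(fnmatch.fnmatchcase(name, p) for p in patterns):
            schema_related.append(name)
        else:
            other.append(name)
    return schema_related, other


# Hypothetical listing of data/system; real names may differ.
listing = [
    "Schema-hc-1-Data.db",
    "Migrations-hc-2-Data.db",
    "LocationInfo-hc-5-Data.db",
]
schema_related, other = split_system_files(listing)
```

Anything that ends up in `other` would be left alone if only the schema-related files were removed.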

Also, in the first log there are a lot of commit log mutations being skipped 
because the schema is not there. Drain should have removed these, but it can 
take a little time (I think).  

> 4) afaiu i could also stop cassandra again move old sstables from snapshot 
> back to keyspace data dir and run repair for all keyspace CFs? So that it 
> finishes faster
> and makes less load than running a repair which has no previous keyspace data 
> at all?

The approach you followed was the correct one. 

I've updated the wiki to say the errors are expected. 

Cheers
 
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 19/05/2012, at 6:34 AM, Piavlo wrote:

> Hi,
> 
> I had a schema disagreement problem in a cassandra 1.0.9 cluster, where one 
> node had a different schema version.
> So I followed the FAQ at 
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement:
> disabled gossip, disabled thrift, drained, and finally stopped the cassandra 
> process. On startup I noticed
> INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
> Couldn't detect any schema definitions in local storage.
> in the log, and after
> INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
> Bootstrap/Replace/Move completed! Now serving reads.
> it started throwing Fatal exceptions for all read/write operations endlessly.
> 
> I had to stop the cassandra process again (no draining was done).
> 
> On the second start it came up OK, immediately loading the correct cluster 
> schema version
> INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
> Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7
> 
> But now this node appears to have started with no data from keyspace which 
> had schema disagreement.
> The original keyspace sstables now appear under snapshots dir.
> 
> # nodetool -h localhost ring
> Address         DC       Rack  Status  State   Load       Owns    Token
>                                                                   141784319550391026443072753096570088106
> 10.49.127.4     eu-west  1a    Up      Normal  8.19 GB    16.67%  0
> 10.241.29.65    eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
> 10.59.46.236    eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
> 10.50.33.232    eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
> 10.234.71.33    eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
> 10.58.249.118   eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106
> #
> 
> The affected node is the one with 660.98 MB of data (the opscenter keyspace 
> data, which was not invalidated).
> 
> So i have some questions:
> 
> 1) What did I do wrong? Why was cassandra throwing exceptions on first startup?
> 2) Why was the keyspace data invalidated? Is it expected?
> 3) If the answer to #2 is "yes it's expected" then what's the point in doing 
> http://wiki.apache.org/cassandra/FAQ#schema_disagreement
> then all keyspace data is lost anyway? It makes more sense to just do 
> http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
> 4) afaiu i could also stop cassandra again move old sstables from snapshot 
> back to keyspace data dir and run repair for all keyspace CFs? So that it 
> finishes faster
> and makes less load than running a repair which has no previous keyspace data 
> at all?
> 
> The first startup log is below:
> 
> INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 105) 
> Logging initialized
> INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 126) 
> JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
> INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 127) 
> Heap size

Couldn't detect any schema definitions in local storage - after handling schema disagreement according to FAQ

2012-05-18 Thread Piavlo

 Hi,

I had a schema disagreement problem in a cassandra 1.0.9 cluster, where 
one node had a different schema version.
So I followed the FAQ at 
http://wiki.apache.org/cassandra/FAQ#schema_disagreement:
disabled gossip, disabled thrift, drained, and finally stopped the 
cassandra process. On startup I noticed
INFO [main] 2012-05-18 16:23:11,879 DatabaseDescriptor.java (line 467) 
Couldn't detect any schema definitions in local storage.

in the log, and after
INFO [main] 2012-05-18 16:23:15,463 StorageService.java (line 619) 
Bootstrap/Replace/Move completed! Now serving reads.
it started throwing Fatal exceptions for all read/write operations 
endlessly.


I had to stop the cassandra process again (no draining was done).

On the second start it came up OK, immediately loading the correct 
cluster schema version
INFO [main] 2012-05-18 16:54:44,303 DatabaseDescriptor.java (line 499) 
Loading schema version 9db34ef0-a0be-11e1--f9687e034cf7


But now this node appears to have started with no data from keyspace 
which had schema disagreement.

The original keyspace sstables now appear under snapshots dir.

# nodetool -h localhost ring
Address         DC       Rack  Status  State   Load       Owns    Token
                                                                  141784319550391026443072753096570088106
10.49.127.4     eu-west  1a    Up      Normal  8.19 GB    16.67%  0
10.241.29.65    eu-west  1b    Up      Normal  8.18 GB    16.67%  28356863910078205288614550619314017621
10.59.46.236    eu-west  1c    Up      Normal  8.22 GB    16.67%  56713727820156410577229101238628035242
10.50.33.232    eu-west  1a    Up      Normal  8.2 GB     16.67%  85070591730234615865843651857942052864
10.234.71.33    eu-west  1b    Up      Normal  8.15 GB    16.67%  113427455640312821154458202477256070485
10.58.249.118   eu-west  1c    Up      Normal  660.98 MB  16.67%  141784319550391026443072753096570088106

#

The affected node is the one with 660.98 MB of data (the opscenter keyspace 
data, which was not invalidated).
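
For larger rings, spotting the odd node by eye gets harder. As a rough sketch, the Load column can be converted to bytes and compared against the median. The addresses and loads below are copied from the ring output above; the function names, the 1024-based units, and the 50% threshold are assumptions for illustration.

```python
import statistics

# Assumption: units are 1024-based.
UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}


def load_bytes(text):
    """Convert a load string such as '8.19 GB' to a number of bytes."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def low_load_nodes(loads, fraction=0.5):
    """Return addresses whose load is below `fraction` of the median load."""
    median = statistics.median(load_bytes(v) for v in loads.values())
    return [addr for addr, v in loads.items()
            if load_bytes(v) < fraction * median]


# Address -> Load, taken from the ring output above.
ring = {
    "10.49.127.4": "8.19 GB",
    "10.241.29.65": "8.18 GB",
    "10.59.46.236": "8.22 GB",
    "10.50.33.232": "8.2 GB",
    "10.234.71.33": "8.15 GB",
    "10.58.249.118": "660.98 MB",
}
suspects = low_load_nodes(ring)
```

With this data only 10.58.249.118 falls below half of the median load.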


So i have some questions:

1) What did I do wrong? Why was cassandra throwing exceptions on first 
startup?

2) Why was the keyspace data invalidated? Is it expected?
3) If the answer to #2 is "yes it's expected" then what's the point in 
doing http://wiki.apache.org/cassandra/FAQ#schema_disagreement
then all keyspace data is lost anyway? It makes more sense to just do 
http://wiki.apache.org/cassandra/Operations#Replacing_a_Dead_Node
4) afaiu i could also stop cassandra again move old sstables from 
snapshot back to keyspace data dir and run repair for all keyspace CFs? 
So that it finishes faster
and makes less load than running a repair which has no previous keyspace 
data at all?


The first startup log is below:

 INFO [main] 2012-05-18 16:23:07,367 AbstractCassandraDaemon.java (line 105) Logging initialized
 INFO [main] 2012-05-18 16:23:07,382 AbstractCassandraDaemon.java (line 126) JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 127) Heap size: 2600468480/2600468480
 INFO [main] 2012-05-18 16:23:07,383 AbstractCassandraDaemon.java (line 128) Classpath: 
/etc/cassandra/conf:/usr/share/java/jna.jar:/usr/share/java/mx4j-tools.jar:/usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/apache-cassandra-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-clientutil-1.0.9.jar:/usr/share/cassandra/lib/apache-cassandra-thrift-1.0.9.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.4.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.2.jar:/usr/share/cassandra/lib/guava-r08.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jline-0.9.94.jar:/usr/share/cassandra/lib/joda-time-1.6.2.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.6.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.6.1.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.6.1.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra//lib/jamm-0.2.5.jar
 INFO [main] 2012-05-18 16:23:10,661 CLibrary.java (line 109) JNA mlockall successful
 INFO [main] 2012-05-18 16:23:10,692 DatabaseDescriptor.java (line 114) Loading settings from file:/etc/cassandra/ssa/cassandra.yaml
 INFO [main] 2012-05-18 16:23:10,868 DatabaseDescriptor.java (line 168) DiskAccessMode 'auto' determined to