Hello all,

In order to build a *Cassandra cluster exclusively for availability and replication testing*, I thought of a simple solution based on a single Linux instance, with no virtualization at all.

The idea was to initialize every node, run a test client, and manually kill some node processes in order to check service availability and data replication with several replication factors (RF = 1, 2, 3).

a) What do you think of this approach? Is there any improvement or step you think should be added?

b) The Thrift-based Java client I used always pointed to node1, which makes it a single point of failure and undermines the availability test. What do you recommend using instead, stress.py?

Thanks in advance!

I created the following structure (with 3 nodes) on Linux:

/opt/apache-cassandra-0.6.5/nodes
|
|-- node1
|   |-- bin
|   |-- conf
|   |-- data
|   |-- log
|   `-- txs
|-- node2
|   |-- bin
|   |-- conf
|   |-- data
|   |-- log
|   `-- txs
`-- node3
    |-- bin
    |-- conf
    |-- data
    |-- log
    `-- txs

And below are the steps I took.

1) create additional network interfaces using alias to loopback (lo)

# ifconfig lo:2 127.0.0.2 up
# ifconfig lo:3 127.0.0.3 up

$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:30848 errors:0 dropped:0 overruns:0 frame:0
          TX packets:30848 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2946793 (2.9 MB)  TX bytes:2946793 (2.9 MB)

lo:2      Link encap:Local Loopback
          inet addr:127.0.0.2  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

lo:3      Link encap:Local Loopback
          inet addr:127.0.0.3  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

That way, node1's IP is 127.0.0.1, node2's 127.0.0.2, and so forth.

2) register node hostnames locally

/etc/hosts:
127.0.0.1 localhost node1
127.0.0.2 node2
127.0.0.3 node3

$ ping node2
PING node2 (127.0.0.2) 56(84) bytes of data.
64 bytes from node2 (127.0.0.2): icmp_seq=1 ttl=64 time=0.018 ms
64 bytes from node2 (127.0.0.2): icmp_seq=2 ttl=64 time=0.015 ms
^C
--- node2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.015/0.016/0.018/0.004 ms

3) create first node structure

$ cd /opt/apache-cassandra-0.6.5/
$ mkdir -p nodes/node1
$ cp -R bin/ conf/ nodes/node1/
$ cd nodes/node1

I opted to use relative paths instead of absolute ones for simplicity.

conf/log4j.properties:
# Edit the next line to point to your logs directory
log4j.appender.R.File=./log/system.log

conf/storage-conf.xml:
<CommitLogDirectory>./txs</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>./data</DataFileDirectory>
</DataFileDirectories>
<ListenAddress>node1</ListenAddress>
<StoragePort>7000</StoragePort>
<ThriftAddress></ThriftAddress>
<ThriftPort>9160</ThriftPort>
<ThriftFramedTransport>false</ThriftFramedTransport>

bin/cassandra.in.sh:
for jar in $cassandra_home/../../lib/*.jar; do
    CLASSPATH=$CLASSPATH:$jar
done

4) create remaining nodes by cloning the first one

$ cd .. ; mkdir node2 node3
$ cp -R node1/* node2
$ cp -R node1/* node3
$ tree -L 2
.
|-- node1
|   |-- bin
|   `-- conf
|-- node2
|   |-- bin
|   `-- conf
`-- node3
    |-- bin
    `-- conf

The remaining directories (log, data, txs) are created automatically on server startup.

5) edit specific node settings

Each node must listen on its own hostname (i.e., node1, node2, node3).

conf/storage-conf.xml:
<ListenAddress>node2</ListenAddress>

The JMX interfaces of all nodes are bound to the same host, so each node needs a different port. The first node will be on 8081, node2 on 8082, and node3 on 8083.
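
The per-node edits of step 5 could also be scripted. A minimal sketch, assuming the directory layout above and that each node's files already contain a ListenAddress element and a jmxremote.port option; configure_node is a made-up helper name, not part of Cassandra:

```shell
#!/bin/sh
# Sketch only: apply the per-node settings from step 5 to one cloned node.
# configure_node is a hypothetical helper, not something Cassandra ships.
configure_node() {
    node_dir=$1    # e.g. node2
    host=$2        # e.g. node2
    jmx_port=$3    # e.g. 8082

    conf="$node_dir/conf/storage-conf.xml"
    env_file="$node_dir/bin/cassandra.in.sh"

    # Point ListenAddress at this node's own hostname.
    sed "s|<ListenAddress>[^<]*</ListenAddress>|<ListenAddress>$host</ListenAddress>|" \
        "$conf" > "$conf.tmp" && cat "$conf.tmp" > "$conf" && rm "$conf.tmp"

    # Give this node its own JMX port.
    sed "s|jmxremote\.port=[0-9]*|jmxremote.port=$jmx_port|" \
        "$env_file" > "$env_file.tmp" && cat "$env_file.tmp" > "$env_file" && rm "$env_file.tmp"
}

# Usage, from /opt/apache-cassandra-0.6.5/nodes:
#   for i in 1 2 3; do configure_node "node$i" "node$i" "808$i"; done
```

The temporary file is copied back with cat rather than mv so that the original file keeps its permissions.
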

node2/bin/cassandra.in.sh:
# Arguments to pass to the JVM
JVM_OPTS=" \
        -ea \
        -Xms1G \
        -Xmx1G \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:SurvivorRatio=8 \
        -XX:MaxTenuringThreshold=1 \
        -XX:+HeapDumpOnOutOfMemoryError \
        -Dcom.sun.management.jmxremote.port=8082 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"

6) start up every server

For this test I found it convenient to open a new terminal for each node and run one of the commands below in each:

$ node1/bin/cassandra -f
$ node2/bin/cassandra -f
$ node3/bin/cassandra -f

7) check services availability

To check the listening TCP ports, look for 9160 (Thrift service), 7000 (internal storage), and 808X (JMX interface).

$ netstat -lptn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      -
tcp6       0      0 127.0.0.3:9160          :::*                    LISTEN      8520/java
tcp6       0      0 127.0.0.2:9160          :::*                    LISTEN      8424/java
tcp6       0      0 127.0.0.1:9160          :::*                    LISTEN      8336/java
tcp6       0      0 :::46954                :::*                    LISTEN      8424/java
tcp6       0      0 :::53418                :::*                    LISTEN      8336/java
tcp6       0      0 :::49035                :::*                    LISTEN      8520/java
tcp6       0      0 :::80                   :::*                    LISTEN      -
tcp6       0      0 :::42737                :::*                    LISTEN      8520/java
tcp6       0      0 :::8081                 :::*                    LISTEN      8336/java
tcp6       0      0 :::8082                 :::*                    LISTEN      8424/java
tcp6       0      0 :::8083                 :::*                    LISTEN      8520/java
tcp6       0      0 :::60310                :::*                    LISTEN      8424/java
tcp6       0      0 :::46167                :::*                    LISTEN      8336/java
tcp6       0      0 ::1:631                 :::*                    LISTEN      -
tcp6       0      0 127.0.0.3:7000          :::*                    LISTEN      8520/java
tcp6       0      0 127.0.0.2:7000          :::*                    LISTEN      8424/java
tcp6       0      0 127.0.0.1:7000          :::*                    LISTEN      8336/java

The last check is to invoke nodetool's ring command.

$ cd /opt/apache-cassandra-0.6.5/
$ ./bin/nodetool -h localhost -p 8081 ring
Address       Status     Load          Range                                      Ring
                                       142865723918937898194528652808268231850
127.0.0.1     Up         3,1 KB        39461784941927371686416024510057184051     |<--|
127.0.0.3     Up         3,1 KB        54264004217607518447601711663387808864     |   |
127.0.0.2     Up         2,68 KB       142865723918937898194528652808268231850    |-->|

Here it is, the local cluster is up and running! :D

--
Best regards,

Rodrigo Hjort
http://agajorte.blogspot.com
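
P.S. For the kill-a-node test described at the top, a small loop can report which nodes still accept connections after a process is killed. A minimal sketch, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available; check_node and report_nodes are made-up helper names. It only checks that the TCP port accepts a connection, not that Cassandra actually answers requests.

```shell
#!/bin/bash
# check_node (hypothetical helper): succeeds if a TCP connection
# to host:port can be opened within 2 seconds.
check_node() {
    local host=$1 port=$2
    # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT.
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# report_nodes (hypothetical helper): print "up" or "down" for the
# given port on every host listed after it.
report_nodes() {
    local port=$1 host
    shift
    for host in "$@"; do
        if check_node "$host" "$port"; then
            echo "$host up"
        else
            echo "$host down"
        fi
    done
}

# Usage: check the Thrift port on all three nodes.
#   report_nodes 9160 node1 node2 node3
```

A client could use the same idea to pick the first node reported as up instead of always connecting to node1.
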