Hi All,

It looks like this is a known issue with Cassandra-0.8.4. So I either have to wait for 0.8.5 to be released or switch to 0.7.8, if the issue has been resolved there.

Ref: https://issues.apache.org/jira/browse/CASSANDRA-3044
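For illustration: in the stack traces in this thread, the lookup fails on a name with a leading slash ("/199.168.0.2"). That is the form java.net.InetAddress.toString() produces for an address that has no host name; it prints "hostname/IP", as in the "lab02/199.168.0.2" output mentioned further down. The sketch below reproduces that form and shows one way to recover a parseable address from it. The class EndpointNames and the method addressPart are hypothetical names made up for this example, not Cassandra code, and whether the fix in 0.8.5 works this way is not confirmed here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only: EndpointNames and addressPart are hypothetical, not Cassandra code. */
final class EndpointNames {

    /**
     * Returns the text after the last '/', so "/199.168.0.2" and
     * "lab02/199.168.0.2" both give "199.168.0.2". Input without a
     * slash is returned unchanged.
     */
    static String addressPart(String endpoint) {
        int slash = endpoint.lastIndexOf('/');
        return slash >= 0 ? endpoint.substring(slash + 1) : endpoint;
    }

    public static void main(String[] args) throws UnknownHostException {
        // An InetAddress built from an IP literal carries no host name...
        InetAddress addr = InetAddress.getByName("199.168.0.2");

        // ...so toString() yields an empty host name, a slash, then the IP.
        String printed = addr.toString();
        System.out.println("toString(): " + printed);

        // Drop everything up to the slash before handing the text to a resolver.
        String cleaned = addressPart(printed);
        System.out.println("cleaned:    " + cleaned);

        // An IP literal is parsed directly, without a name-service lookup.
        System.out.println("re-parsed:  " + InetAddress.getByName(cleaned).getHostAddress());
    }
}
```

Passing the toString() form straight to InetAddress.getByName() makes Java treat "/199.168.0.2" as a host name to look up, which matches the Inet6AddressImpl.lookupAllHostAddr frame at the top of the traces.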
Regards,
Thamizhannal P

--- On Thu, 25/8/11, Thamizh <tceg...@yahoo.co.in> wrote:

From: Thamizh <tceg...@yahoo.co.in>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Thursday, 25 August, 2011, 9:01 PM

Hi Aaron,

Thanks a lot for your suggestions. I am worn out by the error below. It would be great if you could point out what went wrong with my approach.

I want to install cassandra-0.8.4 on 3 nodes and run a Map/Reduce job that uploads data from HDFS to Cassandra.

I have installed Cassandra on 3 nodes, lab02 (199.168.0.2), lab03 (199.168.0.3) and lab04 (199.168.0.4). I can create a keyspace and a column family, and they get distributed across the cluster.

When I run my map/reduce program it ends with "UnknownHostException". The same map/reduce program works well on a single-node cluster.

Here are the steps I followed.

1. cassandra.yaml details

lab02 (199.168.0.2): (seed node)
    auto_bootstrap: false
    seeds: "199.168.0.2"
    listen_address: 199.168.0.2
    rpc_address: 199.168.0.2

lab03 (199.168.0.3):
    auto_bootstrap: true
    seeds: "199.168.0.2"
    listen_address: 199.168.0.3
    rpc_address: 199.168.0.3

lab04 (199.168.0.4):
    auto_bootstrap: true
    seeds: "199.168.0.2"
    listen_address: 199.168.0.4
    rpc_address: 199.168.0.4

2. Output of bin/cassandra:

    ------
    ------
    INFO 11:59:40,602 Node /199.168.0.2 is now part of the cluster
    INFO 11:59:40,604 InetAddress /199.168.0.2 is now UP
    INFO 11:59:55,667 Node /199.168.0.4 is now part of the cluster
    INFO 11:59:55,669 InetAddress /199.168.0.4 is now UP
    INFO 12:01:08,389 Joining: getting bootstrap token
    INFO 12:01:08,410 New token will be 43083119672609054510947312506340649252 to assume load from /199.168.0.2
    INFO 12:01:08,412 Enqueuing flush of Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
    INFO 12:01:08,413 Writing Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
    INFO 12:01:08,461 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-2-Data.db (287 bytes)
    INFO 12:01:08,477 Node /199.168.0.3 state jump to normal
    INFO 12:01:08,480 Enqueuing flush of Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
    INFO 12:01:08,482 Writing Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
    INFO 12:01:08,514 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-3-Data.db (163 bytes)
    INFO 12:01:08,527 Node /199.168.0.3 state jump to normal
    INFO 12:01:08,652 mx4j successfuly loaded
    HttpAdaptor version 3.0.1 started on port 8081

3. When I run my map/reduce program, it ends with "UnknownHostException":

    Error: java.net.UnknownHostException: /199.168.0.2
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
        at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
        at java.net.InetAddress.getAllByName(InetAddress.java:1083)
        at java.net.InetAddress.getAllByName(InetAddress.java:1019)
        at java.net.InetAddress.getByName(InetAddress.java:969)
        at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
        at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
        at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
        at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Here are the config lines for map/reduce:

    job4.setReducerClass(TblUploadReducer.class);
    job4.setOutputKeyClass(ByteBuffer.class);
    job4.setOutputValueClass(List.class);
    job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
    ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), args[1], args[3]);
    ConfigHelper.setRpcPort(job4.getConfiguration(), args[7]); // 9160
    ConfigHelper.setInitialAddress(job4.getConfiguration(), args[9]); // 199.168.0.2
    ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");

Steps which I have verified:

1. Passwordless ssh has been configured between lab02, lab03 & lab04.
All the nodes can ping each other without any issues.

2. When I ran "InetAddress.getLocalHost()" from a Java program on lab02, it printed "lab02/199.168.0.2".

3. When I looked over the output of bin/cassandra, it prints a couple of messages with "/199.168.0.3" etc. in the InetAddress field. It does not print "hostname/IP" here. Is that a problem?

Kindly help me.

Regards,
Thamizhannal

--- On Thu, 25/8/11, aaron morton <aa...@thelastpickle.com> wrote:

From: aaron morton <aa...@thelastpickle.com>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Thursday, 25 August, 2011, 3:45 AM

Jump on the machine that raised the error and see if you can ssh to node01, or try using the IP addresses to see if they work.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 24/08/2011, at 11:34 PM, Thamizh wrote:

Hi Aaron,

This is yet to be resolved.

I have set up a multi-node Cassandra cluster and am facing issues pushing HDFS data to Cassandra. When I run the MapReduce program I get an UnknownHostException.

In Hadoop (0.20.1), I have configured node01 as master and node01, node02 & node03 as slaves.

In Cassandra (0.8.4), the installation and configuration have been done. When I issue the nodetool ring command I can see the ring, and the keyspaces and column families have been distributed.

Output of nodetool:

$bin/nodetool -h node02 ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               161930152162677484001961360738128229499
198.168.0.1     datacenter1 rack1       Up     Normal  132.28 MB       12.48%  13027320554261208311902766005835168982
198.168.0.2     datacenter1 rack1       Up     Normal  99.34 MB        75.07%  140745249930211229277235689500208693608
198.168.0.3     datacenter1 rack1       Up     Normal  66.21 KB        12.45%  161930152162677484001961360738128229499
nutch@lab02:/code/apache-cassandra-0.8.4$

Here is the Hadoop config.

    job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
    ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
    ConfigHelper.setRpcPort(job4.getConfiguration(), "9160");
    ConfigHelper.setInitialAddress(job4.getConfiguration(), "node01");
    ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");

Below is the exception message:

    Error: java.net.UnknownHostException: /198.168.0.3
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
        at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
        at java.net.InetAddress.getAllByName(InetAddress.java:1083)
        at java.net.InetAddress.getAllByName(InetAddress.java:1019)
        at java.net.InetAddress.getByName(InetAddress.java:969)
        at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
        at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
        at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
        at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Note: The same /etc/hosts file is used on all the nodes.

Kindly help me to resolve this issue.
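
Since the same /etc/hosts is said to be in use on every node, one quick way to confirm that from the JVM's point of view is to run a small resolver on the Hadoop node that raised the error. The following is only a sketch under that assumption: ResolveCheck is a hypothetical helper, not part of Hadoop or Cassandra, and the default names node01, node02 and node03 are simply the ones quoted in this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Illustrative only: ResolveCheck is a hypothetical helper, not part of Hadoop or Cassandra. */
final class ResolveCheck {

    /** Returns the resolved IP address for a host name, or null if the lookup fails. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default names are the ones used in this thread; pass others as arguments.
        String[] hosts = args.length > 0 ? args : new String[] {"node01", "node02", "node03"};
        for (String host : hosts) {
            String ip = resolve(host);
            System.out.println(host + " -> " + (ip != null ? ip : "UNRESOLVED"));
        }
    }
}
```

If every name resolves here but the job still fails on a name such as "/198.168.0.3", the host configuration is probably not the cause, because that failing name is not one that /etc/hosts would ever contain.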

Regards,
Thamizhannal P

--- On Wed, 24/8/11, aaron morton <aa...@thelastpickle.com> wrote:

From: aaron morton <aa...@thelastpickle.com>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Wednesday, 24 August, 2011, 2:40 PM

Did you get this sorted?

At a guess I would say there are no nodes listed in the Hadoop JobConf.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 23/08/2011, at 9:51 PM, Thamizh wrote:

Hi All,

This is a question about multi-node cluster configuration.

I have configured a 3-node cluster using Cassandra-0.8.4 and get an error when I run a Map/Reduce job that uploads records from HDFS to Cassandra.

Here is the config file (cassandra.yaml) for each of my 3 nodes:

node01:
    seeds: "node01,node02,node03"
    auto_bootstrap: false
    listen_address: 192.168.0.1
    rpc_address: 192.168.0.1

node02:
    seeds: "node01,node02,node03"
    auto_bootstrap: true
    listen_address: 192.168.0.2
    rpc_address: 192.168.0.2

node03:
    seeds: "node01,node02,node03"
    auto_bootstrap: true
    listen_address: 192.168.0.3
    rpc_address: 192.168.0.3

When I run the M/R program, I get the error below:

    11/08/23 04:37:00 INFO mapred.JobClient: map 100% reduce 11%
    11/08/23 04:37:06 INFO mapred.JobClient: map 100% reduce 22%
    11/08/23 04:37:09 INFO mapred.JobClient: map 100% reduce 33%
    11/08/23 04:37:14 INFO mapred.JobClient: Task Id : attempt_201104211044_0719_r_000000_0, Status : FAILED
    java.lang.NullPointerException
        at org.apache.cassandra.client.RingCache.getRange(RingCache.java:130)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:125)
        at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:60)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:90)
        at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:1)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

Is anything wrong in my cassandra.yaml file? I followed http://wiki.apache.org/cassandra/MultinodeCluster for the cluster configuration.

Regards,
Thamizhannal