Re: Cassandra 1.20 with Cloudera Hadoop (CDH4) Compatibility Issue

Jeremy Hanna Sat, 16 Feb 2013 09:25:18 -0800

Fwiw - here are some changes that a friend said should make C*'s Hadoop
support work with CDH4 - for ColumnFamilyRecordReader.
https://gist.github.com/jeromatron/4967799
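
For context on the first exception quoted below: the error message says
JobContext is an interface on the cluster, while the calling code was
compiled expecting a class. Besides recompiling against the target
platform, one common workaround is to look the method up reflectively, so
that nothing about class versus interface is baked into the bytecode. The
sketch below illustrates only that idea and is not the contents of the
gist. ReflectiveCall, OldStyleContext, NewStyleContext and
NewStyleContextImpl are hypothetical stand-ins so it runs without Hadoop
on the classpath.

```java
import java.lang.reflect.Method;

public class ReflectiveCall {
    /** Hypothetical stand-in for a context type that is a concrete class. */
    public static class OldStyleContext {
        public String getConfiguration() { return "conf-from-class"; }
    }

    /** Hypothetical stand-in for a context type that is an interface. */
    public interface NewStyleContext {
        String getConfiguration();
    }

    /** Hypothetical implementation of the interface-style context. */
    public static class NewStyleContextImpl implements NewStyleContext {
        @Override
        public String getConfiguration() { return "conf-from-interface"; }
    }

    /**
     * Looks up a public no-argument method by name on the runtime class of
     * the target and invokes it. The lookup happens at run time, so the same
     * compiled caller works whether the declared type is a class or an
     * interface.
     */
    public static Object invokeNoArg(Object target, String methodName) {
        try {
            Method m = target.getClass().getMethod(methodName);
            return m.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot call " + methodName + " on " + target.getClass().getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeNoArg(new OldStyleContext(), "getConfiguration"));
        System.out.println(invokeNoArg(new NewStyleContextImpl(), "getConfiguration"));
    }
}
```

Reflection costs a little on each call, so in a record reader it would
make sense to do the lookup once and cache the Method.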


On Feb 16, 2013, at 8:23 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Here is the deal.
> 
> http://wiki.apache.org/hadoop/Defining%20Hadoop
> 
> INAPPROPRIATE: Automotive Joe's Crankshaft: 100% compatible with Hadoop
> 
> Bad, because "100% compatible" is a meaningless statement. Even Apache
> releases have regressions; cases where versions are incompatible *even
> when the Java interfaces don't change*. A statement about
> compatibility ought to be qualified "Certified by Joe's brother Bob
> as 100% compatible with Apache Hadoop(TM)". In the US, the marketing
> team may be able to get away with the "100% compatible" claim, but in
> some EU countries, sticking that statement up your web site is a claim
> that residents can demand the vendor justifies, or take it down.
> 
> So as a result, if you are running something that is NOT Apache Hadoop
> (CDH, DSE, or whatever), it is NOT compatible with Hadoop, or with the
> other distributions, by definition.
> 
> Anyway, I have been using Hadoop for years, and its biggest problem is
> that it has never become happy with its own codebase. Old API, new
> API, JobTracker, YARN: all these things change, and there is really no
> upgrade/downgrade path because there are so many branches etc. Open
> source products move swiftly and end users are normally left holding
> the ball in figuring out how to do it sanely. With Cassandra +
> Hadoop it is "double trouble".
> 
> All that being said I think it is unrealistic to count on vendors 100%
> to solve your problems. If something throws you an exception like:
> 
> org.apache.cassandra.hadoop.ConfigHelper.setRpcPort(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)V
> 
> Guess what? It is time to get out your compiler.
> 
> On Sat, Feb 16, 2013 at 3:39 AM, Yang Song <xfil...@gmail.com> wrote:
>> Thanks Michael. I attached the reply I got back from CDH4 user group from
>> Harsh. Hope to share the experience.
>> "
>> In CDH4, the MR1 and MR2 APIs are both fully compatible (such that
>> moving to YARN in future would require no recompilation from MR1
>> produced jars). You can consider it "2.0" API in binary form, and not
>> 0.20 exactly (i.e. it's not backwards compatible with CDH3).
>> 
>> Cassandra is distributing binaries built on MR1 (Apache Hadoop 1,
>> CDH3, etc.), which wouldn't work on your CDH4 platform. You will have
>> to recompile against the proper platform to get binary-compatible
>> jars/etc.."
>> 
>> Interesting. Has anyone had issues with CDH4 and the newly released C*
>> 1.21?
>> 
>> Thanks
>> 
>> 2013/2/15 Michael Kjellman <mkjell...@barracuda.com>
>>> 
>>> Sorry. I meant to say even though there *wasn't* a major change between
>>> 1.0.x and 0.22. The big change was 0.20 to 0.22. Sorry for the confusion.
>>> 
>>> On Feb 15, 2013, at 9:53 PM, "Michael Kjellman" <mkjell...@barracuda.com>
>>> wrote:
>>> 
>>> There were pretty big changes in Hadoop between 0.20 and 0.22 (which is
>>> now known as 1.0.x) even though there were major change between 0.22 and
>>> 1.0.x. Cloudera hadn't yet upgraded to 0.22 which uses the new map reduce
>>> framework instead of the old mapred API. I don't see the C* project back
>>> porting their code at this time and if anything Cloudera should update their
>>> release!!
>>> 
>>> On Feb 15, 2013, at 9:48 PM, "Yang Song" <xfil...@gmail.com> wrote:
>>> 
>>> It is interesting though. I am using CDH4 which contains hadoop 0.20, and
>>> I am using Cassandra 1.20.
>>> The previously mentioned errors still occur. Any suggestions? Thanks.
>>> 
>>> 2013/2/15 Michael Kjellman <mkjell...@barracuda.com>
>>>> 
>>>> That bug is kinda wrong though. 1.0.x is current for like a year now and
>>>> C* works great with it :)
>>>> 
>>>> On Feb 15, 2013, at 7:38 PM, "Dave Brosius" <dbros...@mebigfatguy.com>
>>>> wrote:
>>>> 
>>>> see https://issues.apache.org/jira/browse/CASSANDRA-5201
>>>> 
>>>> 
>>>> On 02/15/2013 10:05 PM, Yang Song wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Does anyone use CDH4's Hadoop together with Cassandra? The goal is
>>>> simply to read from and write to Cassandra directly from Hadoop using
>>>> ColumnFamilyInput(Output)Format, but there seems to be a compatibility
>>>> issue. There are two Java exceptions:
>>>> 
>>>> 1. java.lang.IncompatibleClassChangeError: Found interface
>>>> org.apache.hadoop.mapreduce.JobContext, but class was expected
>>>> This shows up when I run a Hadoop jar that reads directly from Cassandra.
>>>> It seems that Hadoop changed JobContext from a class to an
>>>> interface. Has anyone had a similar issue?
>>>> Does it mean the Hadoop version in CDH4 is old?
>>>> 
>>>> 2. Another error is java.lang.NoSuchMethodError:
>>>> org.apache.cassandra.hadoop.ConfigHelper.setRpcPort(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)V
>>>> This shows up when the job in the jar sets the RPC port for a remote
>>>> Cassandra cluster.
>>>> 
>>>> Does anyone have similar experience? Any comments are welcome. Thanks!
>>>> 
>>>> 
>>> 
>>
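
On the second exception in the original post: the descriptor
(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)V reads as "takes
a Configuration and a String, returns void". A NoSuchMethodError for it
means the ConfigHelper class loaded at run time has no setRpcPort with that
exact signature, which usually points at a job compiled against a different
jar than the one on the runtime classpath. A minimal sketch for checking
what is actually loaded follows. MethodProbe is a hypothetical helper, not
part of Cassandra or Hadoop, and it defaults to a JDK class so it runs
anywhere.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

public class MethodProbe {
    /** Returns the signatures of all public methods with the given name. */
    public static List<String> overloads(String className, String methodName)
            throws ClassNotFoundException {
        Class<?> c = Class.forName(className);
        List<String> found = new ArrayList<>();
        for (Method m : c.getMethods()) {
            if (m.getName().equals(methodName)) {
                found.add(m.toString());
            }
        }
        return found;
    }

    /** Returns where the class was loaded from, if the JVM exposes it. */
    public static String origin(String className) throws ClassNotFoundException {
        CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are placeholders; pass the class and method from the stack trace.
        String cls = args.length > 0 ? args[0] : "java.lang.String";
        String method = args.length > 1 ? args[1] : "valueOf";
        System.out.println("Loaded from: " + origin(cls));
        for (String signature : overloads(cls, method)) {
            System.out.println(signature);
        }
    }
}
```

Run with the same classpath as the failing job, passing
org.apache.cassandra.hadoop.ConfigHelper and setRpcPort as arguments. It
prints which jar the class came from and which overloads exist there.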
