Re: IncompatibleClassChangeError

2013-09-29 Thread Pradeep Gollakota
I'm not entirely sure what the differences are... but according to Cloudera documentation, upgrading from CDH3 to CDH4 does involve a recompile. http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/4.2.0/CDH4-Release-Notes/cdh4ki_topic_1_6.html On Sun, Sep 29, 2013 at 8:29 PM, lei

Re: IncompatibleClassChangeError

2013-09-29 Thread lei liu
Yes, my job is compiled against CDH3u3, and I run it on CDH4.3.1, but I use the MR1 of CDH4.3.1 to run the job. What are the differences between the MR1 of CDH4 and the MR of CDH3? Thanks, LiuLei 2013/9/30 Pradeep Gollakota > I believe it's a difference between the version that your code was > compiled again

Re: IncompatibleClassChangeError

2013-09-29 Thread Pradeep Gollakota
I believe it's a difference between the version that your code was compiled against and the version that you're running against. Make sure that you're not packaging Hadoop jars into your jar, and make sure you're compiling against the correct version as well. On Sun, Sep 29, 2013 at 7:27 PM, lei l
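One quick way to check the second point is to print which Hadoop build is actually on the runtime classpath and compare it with the version the job jar was compiled against; a minimal sketch, assuming the Hadoop jars are available:

    import org.apache.hadoop.util.VersionInfo;

    // Prints the Hadoop version present at runtime, for comparison with
    // the version the job jar was compiled against.
    public class HadoopVersionCheck {
        public static void main(String[] args) {
            System.out.println("Runtime Hadoop version: " + VersionInfo.getVersion());
        }
    }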

IncompatibleClassChangeError

2013-09-29 Thread lei liu
I use CDH4.3.1 with MR1, and when I run a job I get the following error. Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.get
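Background on this particular message: in the Hadoop 1.x line JobContext is a concrete class, while in the Hadoop 2.x line it became an interface, so bytecode compiled against one form fails to link against the other. A small reflection check (a diagnostic sketch, not from the thread) shows which form is on the classpath:

    // Reports whether the JobContext on the classpath is the Hadoop 1
    // class form or the Hadoop 2 interface form.
    public class JobContextCheck {
        public static void main(String[] args) throws ClassNotFoundException {
            Class<?> jc = Class.forName("org.apache.hadoop.mapreduce.JobContext");
            System.out.println(jc.getName() + " is "
                    + (jc.isInterface() ? "an interface (Hadoop 2 line)"
                                        : "a class (Hadoop 1 line)"));
        }
    }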

Re: All datanodes are bad IOException when trying to implement multithreading serialization

2013-09-29 Thread yunming zhang
Thanks Sonal, Felix, I have looked into CombineFileInputFormat before. The problem I am trying to solve here is that I want to reduce the number of mappers running concurrently on a single node. Normally, on a machine with 8 GB of RAM and 8 cores, I need to run 8 JVMs (mappers) to exploit 8 core C
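For orientation: in MR1 the concurrent-mapper count per node is a TaskTracker setting rather than a job setting. A rough sketch of the relevant knobs (MR1 property names, illustrative values):

    import org.apache.hadoop.mapred.JobConf;

    public class MapperMemorySketch {
        public static JobConf configure() {
            // Per-job: give each task JVM enough heap for a 1-2 GB mapper.
            JobConf conf = new JobConf();
            conf.set("mapred.child.java.opts", "-Xmx2048m");
            // Per-node, daemon side (mapred-site.xml on each TaskTracker,
            // not settable from job code): mapred.tasktracker.map.tasks.maximum
            // caps how many mapper JVMs run concurrently on that node.
            return conf;
        }
    }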

Re: All datanodes are bad IOException when trying to implement multithreading serialization

2013-09-29 Thread Felix Chern
The number of mappers is usually the same as the number of files you feed in. To reduce the number, you can use CombineFileInputFormat. I recently wrote an article about it; you can take a look if it fits your needs. http://www.idryman.org/blog/2013/09/22/process-small-files-on-hadoop-using-co
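For the plain-text case, the idea can be sketched roughly as follows (new-API names; CombineTextInputFormat is the ready-made subclass in later Hadoop releases, and the split size is illustrative -- Felix's post walks through the full custom RecordReader wiring):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;

    public class CombineSmallFilesSketch {
        public static Job configure() throws Exception {
            Job job = Job.getInstance(new Configuration(), "combine-small-files");
            // Pack many small files into each split, instead of one file per mapper.
            job.setInputFormatClass(CombineTextInputFormat.class);
            // Cap each combined split at 256 MB.
            CombineTextInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);
            return job;
        }
    }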

Re: All datanodes are bad IOException when trying to implement multithreading serialization

2013-09-29 Thread yunming zhang
I am actually trying to reduce the number of mappers because my application takes up a lot of memory (on the order of 1-2 GB of RAM per mapper). I want to be able to use a few mappers but still maintain good CPU utilization through multithreading within a single mapper. MultithreadedMapper doesn't wo
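For reference, the stock approach being ruled out here is MultithreadedMapper, which runs several copies of an ordinary mapper on threads inside one task JVM. A minimal sketch of its usual wiring (MyRealMapper is a placeholder for the actual map logic):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

    public class MultithreadedSketch {
        // Placeholder for the real, CPU-heavy map logic.
        public static class MyRealMapper
                extends Mapper<LongWritable, Text, Text, LongWritable> { }

        public static Job configure(Configuration conf) throws Exception {
            Job job = Job.getInstance(conf, "multithreaded-mapper");
            job.setMapperClass(MultithreadedMapper.class);
            // Run the real mapper on 8 threads inside one task JVM.
            MultithreadedMapper.setMapperClass(job, MyRealMapper.class);
            MultithreadedMapper.setNumberOfThreads(job, 8);
            return job;
        }
    }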

Re: All datanodes are bad IOException when trying to implement multithreading serialization

2013-09-29 Thread Sonal Goyal
Wouldn't you rather just change your split size so that you can have more mappers work on your input? What else are you doing in the mappers? Sent from my iPad On Sep 30, 2013, at 2:22 AM, yunming zhang wrote: > Hi, > > I was playing with Hadoop code trying to have a single Mapper support rea
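A minimal sketch of that suggestion (new-API names assumed): shrinking the maximum split size makes FileInputFormat cut more splits, hence more mappers, with no custom threading.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class SplitSizeSketch {
        public static Job configure() throws Exception {
            Job job = Job.getInstance(new Configuration(), "smaller-splits");
            // e.g. 32 MB splits instead of one split per 128 MB block.
            FileInputFormat.setMaxInputSplitSize(job, 32L * 1024 * 1024);
            return job;
        }
    }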

Re: Can container requests be made paralelly from multiple threads

2013-09-29 Thread Jian He
Hi, If you adopt hadoop-2.1-beta, StartContainerRequest is changed to StartContainer(s)Request, meaning it can accept a list of container start requests and start those containers in one RPC call. Jian On Fri, Sep 27, 2013 at 1:06 PM, Omkar Joshi wrote: > My point is why you want multiple th
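A rough sketch of the batched call being described (hadoop-2.1-beta API names; the launch contexts and NM tokens are assumed to come from the usual ApplicationMaster allocation flow):

    import java.util.Arrays;
    import org.apache.hadoop.yarn.api.ContainerManagementProtocol;
    import org.apache.hadoop.yarn.api.protocolrecords.StartContainerRequest;
    import org.apache.hadoop.yarn.api.protocolrecords.StartContainersRequest;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Token;

    public class BatchStartSketch {
        public static void startBoth(ContainerManagementProtocol cm,
                                     ContainerLaunchContext ctx1, Token tok1,
                                     ContainerLaunchContext ctx2, Token tok2)
                throws Exception {
            StartContainerRequest r1 = StartContainerRequest.newInstance(ctx1, tok1);
            StartContainerRequest r2 = StartContainerRequest.newInstance(ctx2, tok2);
            // One RPC starts both containers.
            cm.startContainers(StartContainersRequest.newInstance(Arrays.asList(r1, r2)));
        }
    }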

All datanodes are bad IOException when trying to implement multithreading serialization

2013-09-29 Thread yunming zhang
Hi, I was playing with Hadoop code, trying to have a single Mapper support reading an input split using multiple threads. I am getting an "All datanodes are bad" IOException, and I am not sure what the issue is. The reason for this work is that I suspect my computation was slow because it takes too long
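One plausible reading of the failure (a guess, not from the thread): the streams under a RecordReader, and the DFS output stream behind context.write(), are single-threaded objects, so unsynchronized access from several mapper threads can corrupt the write pipeline and surface as exactly this error. Any sharing has to be serialized behind a lock, roughly like this (hypothetical illustration; this is also what MultithreadedMapper's internal wrappers do):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class SynchronizedEmit {
        private final Object contextLock = new Object();

        // All worker threads funnel writes through one lock; the task
        // context and the DFS stream behind it are not thread-safe.
        void emit(Mapper<LongWritable, Text, Text, LongWritable>.Context ctx,
                  Text key, LongWritable value)
                throws IOException, InterruptedException {
            synchronized (contextLock) {
                ctx.write(key, value);
            }
        }
    }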

Re: Calling the JobTracker from Reducer throws InvalidCredentials GSSException

2013-09-29 Thread Manish Verma
Hi Harsh, Awesome, after incorporating these changes in my job driver, it worked for me. It looks like I should use Oozie to launch my jobs so that I don't have to copy this code into my driver. Thanks for your help Manish On Sun, Sep 29, 2013 at 3:51 AM, Harsh J wrote: > Hm, I think I forgot the

Re: How to best decide mapper output/reducer input for a huge string?

2013-09-29 Thread Jens Scheidtmann
Dear Pavan, If it were working well, the runtime would be shorter. What makes you sure this is HBase or Hadoop related? What percentage of the time is spent in your algorithms? Use System.currentTimeMillis() and run your program on the first 100,000 records single-threaded, printing to stdout. See where time
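A minimal sketch of that measurement (the record iterator and process() stand in for the poster's own code):

    import java.util.Iterator;

    public class TimingSketch {
        // Time the core algorithm alone, single-threaded, on the first
        // 100,000 records, to see whether time goes to the algorithm or
        // to the Hadoop/HBase plumbing around it.
        static void timeFirstRecords(Iterator<String> records) {
            long start = System.currentTimeMillis();
            int n = 0;
            while (records.hasNext() && n++ < 100000) {
                process(records.next());   // the algorithm under test
            }
            System.out.println("Elapsed ms: " + (System.currentTimeMillis() - start));
        }

        static void process(String record) { /* placeholder */ }
    }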

Re: Calling the JobTracker from Reducer throws InvalidCredentials GSSException

2013-09-29 Thread Harsh J
Hm, I think I forgot the bit where Oozie also adds a DT for itself to use: https://github.com/apache/oozie/blob/release-4.0.0/core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java#L374. Doing that same additional thing in the job's driver works just fine for me. On Sun, Sep 29,
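Applied to a plain MR1 job driver, the extra step looks roughly like this (the renewer and token alias strings are illustrative, not fixed values):

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class TokenSketch {
        // Fetch a JobTracker delegation token up front and stash it in the
        // job's credentials, so tasks can call back to the JobTracker
        // without a Kerberos ticket of their own.
        static void addJobTrackerToken(JobConf conf) throws Exception {
            JobClient client = new JobClient(conf);
            conf.getCredentials().addToken(new Text("mr-token"),
                    client.getDelegationToken(new Text("mapred")));
        }
    }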

Re: Is there any way to partially process HDFS edits?

2013-09-29 Thread Jens Scheidtmann
Tom, I would file a JIRA if I were you and your Hadoop version is recent enough. It should be pretty easy to reproduce. Jens On Thursday, 26 September 2013, Tom Brown wrote: > They were created and deleted in quick succession. I thought that meant > the edits for both the create and delete w