Hive Read from Reducer : Is it advisable ?
Hi,

We have the reducers of a map-reduce job read a few files in HDFS through Hive. This works well when launching a few reducers (say 5). But when we launch more than that, the initial connection to HiveServer2 takes much longer (around 10 minutes). We have configured hive-site.xml to allow parallel execution.

1. Is it advisable to read HDFS data via Hive from reducers? If not, what are the best practices for this scenario?
2. Is there a way to improve Hive's performance under concurrent access?

Regards,
Malli
RE: Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError
Hi,

Thanks. Do I need to run the build with -Phadoop-1? As I am using Hadoop 2.4.0, I thought of running it with -Phadoop-2. Please advise.

-malli

From: Navis류승우 [mailto:navis@nexr.com]
Sent: Wednesday, June 04, 2014 11:44 PM
To: user@hive.apache.org
Subject: Re: Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError

It's fixed in HIVE-6432. I think you should rebuild your own hcatalog from source with profile -Phadoop-1.

2014-06-05 9:08 GMT+09:00 Sundaramoorthy, Malliyanathan <mailto:malliyanathan.sundaramoor...@citi.com>:

Hi,

I am using Hadoop 2.4.0 with Hive 0.13 and its bundled HCatalog package. I wrote a simple map-reduce job based on the example, and running the code below fails with "Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". I am not sure what mistake I am making, or whether there is a compatibility issue. Please help.

    boolean success = true;
    try {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();

        //Hive Table Details
        String dbName = args[0];
        String inputTableName = args[1];
        String outputTableName = args[2];

        //Job Input
        Job job = new Job(conf, "Scenarios");

        //Initialize Map/Reducer Input/Output
        HCatInputFormat.setInput(job, dbName, inputTableName);
        //HCatInputFormat.ssetInput(job,InputJobInfo.create(dbName, inputTableName, null));
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(MainRunner.class);
        job.setMapperClass(ScenarioMapper.class);
        job.setReducerClass(ScenarioReducer.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
        HCatSchema outSchema = HCatOutputFormat.getTableSchema(conf);
        System.err.println("INFO: output schema explicitly set for writing:" + outSchema);
        HCatOutputFormat.setSchema(job, outSchema);
        job.setOutputFormatClass(HCatOutputFormat.class);

14/06/02 18:52:57 INFO client.RMProxy: Connecting to ResourceManager at localhost/00.04.07.174:8040
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:104)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:84)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:73)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.run(MainRunner.java:79)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.main(MainRunner.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards,
Malli
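The error message in this thread fits a jar compiled against the Hadoop 1 API, where org.apache.hadoop.mapreduce.JobContext is a class, being run on Hadoop 2, where it is an interface. One quick way to see which variant the runtime classpath provides is a small reflection check. This is only a diagnostic sketch: the class name JobContextCheck and the helper describe are made up for illustration, and it uses nothing beyond the JDK, so it reports "missing" when no Hadoop jars are on the classpath.

```java
public class JobContextCheck {
    /**
     * Reports how a type name resolves on the current classpath:
     * "interface", "class", or "missing" if it cannot be loaded.
     */
    static String describe(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> c = Class.forName(className, false,
                    JobContextCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.hadoop.mapreduce.JobContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Running it with the same classpath as the job (for example via the hadoop launcher script) should show whether the cluster side exposes JobContext as an interface. If it does, an HCatalog jar built against the class-based API would be expected to fail in the way shown in the stack trace.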
Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError
Hi,

I am using Hadoop 2.4.0 with Hive 0.13 and its bundled HCatalog package. I wrote a simple map-reduce job based on the example, and running the code below fails with "Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". I am not sure what mistake I am making, or whether there is a compatibility issue. Please help.

    boolean success = true;
    try {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();

        //Hive Table Details
        String dbName = args[0];
        String inputTableName = args[1];
        String outputTableName = args[2];

        //Job Input
        Job job = new Job(conf, "Scenarios");

        //Initialize Map/Reducer Input/Output
        HCatInputFormat.setInput(job, dbName, inputTableName);
        //HCatInputFormat.ssetInput(job,InputJobInfo.create(dbName, inputTableName, null));
        job.setInputFormatClass(HCatInputFormat.class);
        job.setJarByClass(MainRunner.class);
        job.setMapperClass(ScenarioMapper.class);
        job.setReducerClass(ScenarioReducer.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
        HCatSchema outSchema = HCatOutputFormat.getTableSchema(conf);
        System.err.println("INFO: output schema explicitly set for writing:" + outSchema);
        HCatOutputFormat.setSchema(job, outSchema);
        job.setOutputFormatClass(HCatOutputFormat.class);

14/06/02 18:52:57 INFO client.RMProxy: Connecting to ResourceManager at localhost/00.04.07.174:8040
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:104)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:84)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:73)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.run(MainRunner.java:79)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.main(MainRunner.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards,
Malli