Hive Read from Reducer: Is it advisable?

2014-10-07 Thread Sundaramoorthy, Malliyanathan
Hi,
We are making Hive reads to a few files in HDFS from the reducer(s) of a
map-reduce job.

This works well when launching a few reducers (say 5), but when we launch more
than that, the initial connection to HiveServer2 takes a long time (around 10
minutes).
We have configured hive-site.xml to allow parallel execution.


1.  Is it advisable to read HDFS data via Hive from reducers, or what are the
best practices for this scenario?

2.  Is there a way to increase Hive's concurrent-access performance?
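
One pattern that usually helps with question 2 is to open a single HiveServer2 JDBC connection per reducer task in setup() and reuse it until cleanup(), rather than reconnecting per key or per record. The sketch below is illustrative only, not from this thread: the reducer class, the JDBC URL, the lookup table, and the credentials are all assumptions.

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer that opens one HiveServer2 connection for the
// lifetime of the task instead of one per reduce() call.
public class ScenarioReducer
        extends Reducer<IntWritable, IntWritable, IntWritable, Text> {

    private Connection conn; // opened once in setup(), closed in cleanup()

    @Override
    protected void setup(Context context) throws IOException {
        try {
            // Host, port, database, and credentials are illustrative.
            conn = DriverManager.getConnection(
                    "jdbc:hive2://hs2-host:10000/default", "user", "");
        } catch (SQLException e) {
            throw new IOException("Could not connect to HiveServer2", e);
        }
    }

    @Override
    protected void reduce(IntWritable key, Iterable<IntWritable> values,
                          Context context)
            throws IOException, InterruptedException {
        // Reuse the task-wide connection; only the Statement is per-call.
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT val FROM lookup_table WHERE id = " + key.get())) {
            while (rs.next()) {
                context.write(key, new Text(rs.getString(1)));
            }
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        try {
            if (conn != null) conn.close();
        } catch (SQLException e) {
            throw new IOException(e);
        }
    }
}
```

Even with connection reuse, each reducer still costs HiveServer2 one session; if the files are plain HDFS files, reading them directly with the FileSystem API from the reducer avoids HiveServer2 entirely.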



Regards,
Malli





RE: Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError

2014-06-05 Thread Sundaramoorthy, Malliyanathan
Hi,
Thanks. Do I need to build this with -Phadoop-1? Since I am using Hadoop
2.4.0, I thought of building it with -Phadoop-2 instead.
Please advise.

-malli


From: Navis류승우 [mailto:navis@nexr.com]
Sent: Wednesday, June 04, 2014 11:44 PM
To: user@hive.apache.org
Subject: Re: Hive 0.13/Hcatalog : Mapreduce Exception : 
java.lang.IncompatibleClassChangeError

It's fixed in HIVE-6432. I think you should rebuild your own hcatalog from 
source with profile -Phadoop-1.
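
For reference, a Hive 0.13 source rebuild with a Hadoop profile looks roughly like the following. Since the poster is on Hadoop 2.4.0, the Hadoop-2 profile would be the relevant one; the profile name and flags are assumptions for the 0.13 Maven build and should be verified against the pom.xml of the release actually being built.

```shell
# Rebuild Hive/HCatalog from source against Hadoop 2.
# The -Phadoop-2 and dist profiles are assumptions for the Hive 0.13
# Maven build; check the release's pom.xml before relying on them.
tar xzf apache-hive-0.13.0-src.tar.gz
cd apache-hive-0.13.0-src
mvn clean package -DskipTests -Phadoop-2,dist
```

The rebuilt HCatalog jars under the packaging output then replace the bundled ones on the job's classpath.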

2014-06-05 9:08 GMT+09:00 Sundaramoorthy, Malliyanathan 
<malliyanathan.sundaramoor...@citi.com>:
Hi,
I am using Hadoop 2.4.0 with Hive 0.13 and its bundled HCatalog package. I
wrote a simple map-reduce job from the example, and when running the code
below I get: Exception in thread "main"
java.lang.IncompatibleClassChangeError: Found interface
org.apache.hadoop.mapreduce.JobContext, but class was expected.
I am not sure what mistake I am making, or whether this is a compatibility
issue. Please help.

boolean success = true;
try {
    Configuration conf = getConf();
    args = new GenericOptionsParser(conf, args).getRemainingArgs();

    // Hive table details
    String dbName = args[0];
    String inputTableName = args[1];
    String outputTableName = args[2];

    // Job input
    Job job = new Job(conf, "Scenarios");

    // Initialize mapper/reducer input/output
    HCatInputFormat.setInput(job, dbName, inputTableName);
    //HCatInputFormat.setInput(job, InputJobInfo.create(dbName, inputTableName, null));
    job.setInputFormatClass(HCatInputFormat.class);
    job.setJarByClass(MainRunner.class);
    job.setMapperClass(ScenarioMapper.class);
    job.setReducerClass(ScenarioReducer.class);
    job.setMapOutputKeyClass(IntWritable.class);
    job.setMapOutputValueClass(IntWritable.class);

    job.setOutputKeyClass(WritableComparable.class);
    job.setOutputValueClass(DefaultHCatRecord.class);

    HCatOutputFormat.setOutput(job, OutputJobInfo.create(dbName, outputTableName, null));
    HCatSchema outSchema = HCatOutputFormat.getTableSchema(conf);
    System.err.println("INFO: output schema explicitly set for writing: " + outSchema);
    HCatOutputFormat.setSchema(job, outSchema);
    job.setOutputFormatClass(HCatOutputFormat.class);


14/06/02 18:52:57 INFO client.RMProxy: Connecting to ResourceManager at localhost/00.04.07.174:8040
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:104)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:84)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:73)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:458)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.run(MainRunner.java:79)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.citi.aqua.snu.hdp.clar.mra.service.MainRunner.main(MainRunner.java:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards,
Malli




Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError

2014-06-04 Thread Sundaramoorthy, Malliyanathan



Hive 0.13/Hcatalog : Mapreduce Exception : java.lang.IncompatibleClassChangeError

2014-06-03 Thread Sundaramoorthy, Malliyanathan