Hi,

I'm on CDH3u4 with Cloudera Manager, and I'm running a Hive query to repartition 
all our tables, reducing the number of partition columns from 5 to 2. The 
performance benefit of a smaller mapred.input.dir is significant, which I only 
realized as our tables grew, and the extra partition columns bought us little 
for our typical queries. After raising hive.exec.max.dynamic.partitions to cope 
with the enormous number of partitions in our larger tables, I got this 
exception:

org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
java.io.IOException: Exceeded max jobconf size: 5445900 limit: 5242880
        at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3771) 
        at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        …
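
For context, the repartitioning query looks roughly like this (the table and 
column names below are invented placeholders, not our real schema):

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;
    -- raised so the largest tables' partition counts fit under the cap
    set hive.exec.max.dynamic.partitions=100000;
    set hive.exec.max.dynamic.partitions.pernode=100000;

    -- copy the old 5-level layout into a 2-level layout, letting Hive
    -- create the (year, month) partitions dynamically
    INSERT OVERWRITE TABLE events_by_month PARTITION (year, month)
    SELECT id, payload, year, month
    FROM events_old;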

I solved this problem in a roundabout way, and the query then succeeded, but I 
don't understand why, and I'd like to get a better grasp on what is going on. 
I tried these things:

1) In my hive query file, I added a "set mapred.user.jobconf.limit=7000000;" 
before the query, but I saw the exact same exception.
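
Concretely, the query file started roughly like this (reusing the placeholder 
query from the sketch above):

    set mapred.user.jobconf.limit=7000000;

    INSERT OVERWRITE TABLE events_by_month PARTITION (year, month)
    SELECT id, payload, year, month
    FROM events_old;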

2) Since setting mapred.user.jobconf.limit from the CLI didn't seem to be 
working, I used the JobTracker safety valve in Cloudera Manager to add this:

<property>
    <name>mapred.user.jobconf.limit</name>
    <value>7000000</value>
</property>

and then I saved those changes, restarted the JobTracker, and reran the query. 
I saw the same exception.

Digging further, I used "set -v" in my hive query file to check the value of 
mapred.user.jobconf.limit, and I discovered:

a) hive -e "set mapred.user.jobconf.limit=7000000; set -v" | grep 
mapred.user.jobconf.limit showed the value as 7000000, so the CLI setting does 
seem to be observed.
b) hive -e "set -v" | grep mapred.user.jobconf.limit showed the value as 
5242880, which suggests that the safety valve isn't working (?).

3) Finally, I wondered if there was a hard-coded 5 MB maximum for 
mapred.user.jobconf.limit somewhere in the code, although I looked and saw 
nothing obvious. To test this, I used "set mapred.user.jobconf.limit=100" to 
set it to an absurdly small value, expecting the exception to report the limit 
as 100. Guess what? The query executed successfully, which makes absolutely no 
sense to me.
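
In other words, the test in (3) was just (same placeholder query as above):

    set mapred.user.jobconf.limit=100;

    INSERT OVERWRITE TABLE events_by_month PARTITION (year, month)
    SELECT id, payload, year, month
    FROM events_old;

I expected this to fail immediately with "limit: 100" in the exception; 
instead it ran to completion.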

Also, FYI: the size of mapred.input.dir for this query was 5392189 bytes.

Does anyone have ideas about:

1) why the safety valve setting wasn't observed,
2) why the CLI setting, which seemed to be observed, wasn't actually used, at 
least according to the limit reported in the exception, and
3) why setting mapred.user.jobconf.limit to an absurdly low number actually 
allowed the query to succeed?

Thanks,
Greg
