> From an offline conversation with folks at Yahoo, some issues have been
> observed when running with JDK 8 and 64-bit VMs. There seems to be longer GC
> cycles in 64-bit VMs. So if the tasks were on the boundary of GC activity then
> moving to 64 bit VM could push them over and slow them down – resulting in
> overall slowdown. Not sure if this is related to JDK-8 or not. Some other OOM
> issues were observed with JDK-8 (but not with JDK-7). Jonathan/Jason from
> Yahoo could provide more details.

Juho, do you have the Tez UI enabled on EMR? If yes, then you could compare the GC times of the slow and fast jobs to see if there was any slowdown for that reason. If you don't have the Tez UI, then you could grep the Application Master logs for VERTEX_FINISHED, look at the counters dumped in those log lines for GC counters, and compare them.

Bikas

From: Juho Autio
Sent: Thursday, December 10, 2015 12:07 AM
Subject: Re: Hive on Tez with Java 8 JRE poor performance

Tez version is 0.7.0. Installing it with the following AWS EMR step:
s3://support.elasticmapreduce/tez/bigtop/install-tez-with-ui.rb

Using EMR release label emr-4.0.0.

Installing Java 8 with https://gist.github.com/pstorch/c217d8324c4133a003c4 (as a bootstrap step). Currently it installs this version:

$ java -version
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)

Sorry, but I don't have logs or other specifics to provide at this point. I have already switched back to running with the mr engine. But it's a pretty basic Hive script with the tez engine enabled. With Java 8 it takes twice as long to run as with Java 7. My cluster has 40 m1.large nodes.

What I haven't done is configure the tez container size in any way; maybe I would need to, especially for Java 8?

I'm looking for a kind of "best practices" guide for running Tez with Java 8 instead of 7, if anyone has any experience with this migration?

Thanks for helping!

On Wed, Dec 9, 2015 at 7:47 AM, Tsuyoshi Ozawa wrote:

Hi,

Could you tell us the version of Tez you have been using?

- Tsuyoshi

On Wed, Dec 9, 2015 at 7:12 AM, Rajesh Balamohan wrote:
> Can you provide details on the kind of regression you are seeing with 1.8?
> Is it regressing for a specific task or for the overall job? If possible,
> sharing the application logs with 1.7 and 1.8 would be helpful for
> understanding it better.
>
> ~Rajesh.B
>
> On Tue, Dec 8, 2015 at 2:34 PM, Jean-Baptiste Note wrote:
>>
>> Hi Juho,
>>
>> Which version of Java 8 are you using? Some have performance problems with
>> Hadoop on top of issues with kerberos; you should be using a revision
>> strictly greater than u45.
>>
>> Kind regards,
>> JB

--
Juho Autio
Analytics Developer
Hatch
Rovio Entertainment Ltd
Mobile: + 358 (0)45 313 0122
www.rovio.com
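
The log comparison suggested earlier in this thread (grep the Application Master logs for VERTEX_FINISHED and compare the GC counters of the slow and fast runs) could be sketched roughly as below. This is only an illustration under stated assumptions, not a Tez tool: it assumes each VERTEX_FINISHED line carries a `vertexName=...` field and a `GC_TIME_MILLIS=<number>` counter on the same line, and the function names `gc_millis_per_vertex` and `compare` are hypothetical. The patterns would need adjusting to whatever the real log lines look like.

```python
import re
from typing import Dict, Iterable

# Assumed field layout of a VERTEX_FINISHED log line; adjust to the real logs.
VERTEX_NAME = re.compile(r"vertexName=([^,]+)")
GC_MILLIS = re.compile(r"GC_TIME_MILLIS=(\d+)")


def gc_millis_per_vertex(lines: Iterable[str]) -> Dict[str, int]:
    """Sum the GC_TIME_MILLIS counter per vertex over VERTEX_FINISHED lines."""
    totals: Dict[str, int] = {}
    for line in lines:
        if "VERTEX_FINISHED" not in line:
            continue
        name = VERTEX_NAME.search(line)
        gc = GC_MILLIS.search(line)
        if name is None or gc is None:
            continue  # line does not match the assumed layout
        key = name.group(1).strip()
        totals[key] = totals.get(key, 0) + int(gc.group(1))
    return totals


def compare(fast: Dict[str, int], slow: Dict[str, int]) -> Dict[str, int]:
    """Return slow-minus-fast GC milliseconds for vertices present in both runs."""
    return {vertex: slow[vertex] - fast[vertex] for vertex in fast if vertex in slow}
```

To use it, save the Application Master log of the Java 7 run and of the Java 8 run to two text files, pass each opened file to `gc_millis_per_vertex`, and feed the two results to `compare`. A large positive difference for a vertex would point at GC as a contributor to the slowdown.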
