[ https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160683#comment-14160683 ]
Mostafa Mokhtar commented on HIVE-8292:
---------------------------------------

[~vikram.dixit] A patch that addresses the regression is attached.

> Reading from partitioned bucketed tables has high overhead in
> MapOperator.cleanUpInputFileChangedOp
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8292
>                 URL: https://issues.apache.org/jira/browse/HIVE-8292
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>         Environment: cn105
>            Reporter: Mostafa Mokhtar
>            Assignee: Vikram Dixit K
>             Fix For: 0.14.0
>
>         Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch
>
>
> Reading from bucketed partitioned tables has significantly higher overhead
> compared to non-bucketed non-partitioned files.
> 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp
> 5% of the CPU in
> {code}
> Path onepath = normalizePath(onefile);
> {code}
> And
> 45% of the CPU in
> {code}
> onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
> {code}
> From the profiler
> {code}
> Stack Trace                                               Sample Count  Percentage(%)
> hive.ql.exec.tez.MapRecordSource.processRow(Object)       5,327         62.348
> hive.ql.exec.vector.VectorMapOperator.process(Writable)   5,326         62.336
> hive.ql.exec.Operator.cleanUpInputFileChanged()           4,851         56.777
> hive.ql.exec.MapOperator.cleanUpInputFileChangedOp()      4,849         56.753
> java.net.URI.relativize(URI)                              3,903         45.681
> java.net.URI.relativize(URI, URI)                         3,903         45.681
> java.net.URI.normalize(String)                            2,169         25.386
> java.net.URI.equal(String, String)                        526           6.156
> java.net.URI.equalIgnoringCase(String, String)            1             0.012
> java.lang.String.substring(int)                           1             0.012
> hive.ql.exec.MapOperator.normalizePath(String)            506           5.922
> org.apache.commons.logging.impl.Log4JLogger.info(Object)  32            0.375
> java.net.URI.equals(Object)                               12            0.14
> java.util.HashMap$KeySet.iterator()                       5             0.059
> java.util.HashMap.get(Object)                             4             0.047
> java.util.LinkedHashMap.get(Object)                       3             0.035
> hive.ql.exec.Operator.cleanUpInputFileChanged()           1             0.012
> hive.ql.exec.Operator.forward(Object, ObjectInspector)    473           5.536
> hive.ql.exec.mr.ExecMapperContext.inputFileChanged()      1             0.012
> {code}
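
For readers unfamiliar with the expression in the profile, the following is a minimal standalone sketch of what the relativize-based test decides, next to one possible cheaper string-prefix comparison. It uses only java.net.URI rather than Hadoop's Path. The class name, method names, and example hdfs:// locations are hypothetical, and the prefix variant is only an illustration of the idea. It is not the content of HIVE-8292.1.patch.

```java
import java.net.URI;

// Hypothetical illustration; none of these names come from Hive.
final class PathPrefixSketch {

    // Same shape as the profiled expression: relativize() returns its argument
    // unchanged when `file` is not located under `dir`, so a changed result
    // means `file` lies under `dir`.
    static boolean isUnderViaRelativize(URI dir, URI file) {
        return !dir.relativize(file).equals(file);
    }

    // Assumed cheaper variant: compare path strings that the caller has
    // already normalized once, instead of building a new URI per row.
    static boolean isUnderViaPrefix(String dirPath, String filePath) {
        if (!filePath.startsWith(dirPath)) {
            return false;
        }
        if (dirPath.endsWith("/") || filePath.length() == dirPath.length()) {
            return true;
        }
        // Require a separator so that ".../ds=1" does not match ".../ds=10/...".
        return filePath.charAt(dirPath.length()) == '/';
    }

    public static void main(String[] args) {
        URI dir = URI.create("hdfs://nn:8020/warehouse/t/ds=1");
        URI inside = URI.create("hdfs://nn:8020/warehouse/t/ds=1/000000_0");
        URI outside = URI.create("hdfs://nn:8020/warehouse/t/ds=10/000000_0");

        System.out.println(isUnderViaRelativize(dir, inside));
        System.out.println(isUnderViaRelativize(dir, outside));
        System.out.println(isUnderViaPrefix(dir.getPath(), inside.getPath()));
        System.out.println(isUnderViaPrefix(dir.getPath(), outside.getPath()));
    }
}
```

The string variant ignores scheme and authority, so it is only comparable to the URI check when both locations are known to be on the same file system.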