[ 
https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160683#comment-14160683
 ] 

Mostafa Mokhtar commented on HIVE-8292:
---------------------------------------

[~vikram.dixit]
I have attached a patch that addresses the regression.

> Reading from partitioned bucketed tables has high overhead in 
> MapOperator.cleanUpInputFileChangedOp
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8292
>                 URL: https://issues.apache.org/jira/browse/HIVE-8292
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>         Environment: cn105
>            Reporter: Mostafa Mokhtar
>            Assignee: Vikram Dixit K
>             Fix For: 0.14.0
>
>         Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch
>
>
> Reading from bucketed, partitioned tables has significantly higher overhead 
> than reading from non-bucketed, non-partitioned files.
> 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp:
> 5% of the CPU is spent in 
> {code}
>  Path onepath = normalizePath(onefile);
> {code}
> and 45% of the CPU is spent in 
> {code}
>  onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri());
> {code}
> From the profiler:
> {code}
> Stack Trace   Sample Count    Percentage(%)
> hive.ql.exec.tez.MapRecordSource.processRow(Object)   5,327   62.348
>    hive.ql.exec.vector.VectorMapOperator.process(Writable)    5,326   62.336
>       hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851   56.777
>          hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849   56.753
>                                  java.net.URI.relativize(URI) 3,903   45.681
>                                     java.net.URI.relativize(URI, URI) 3,903   45.681
>                                        java.net.URI.normalize(String) 2,169   25.386
>                                        java.net.URI.equal(String, String)     526     6.156
>                                        java.net.URI.equalIgnoringCase(String, String) 1       0.012
>                                        java.lang.String.substring(int)        1       0.012
>             hive.ql.exec.MapOperator.normalizePath(String)    506     5.922
>             org.apache.commons.logging.impl.Log4JLogger.info(Object)  32      0.375
>                                  java.net.URI.equals(Object)  12      0.14
>                                  java.util.HashMap$KeySet.iterator()  5       0.059
>                                  java.util.HashMap.get(Object)        4       0.047
>                                  java.util.LinkedHashMap.get(Object)  3       0.035
>          hive.ql.exec.Operator.cleanUpInputFileChanged()      1       0.012
>       hive.ql.exec.Operator.forward(Object, ObjectInspector)  473     5.536
>       hive.ql.exec.mr.ExecMapperContext.inputFileChanged()    1       0.012
> {code}
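
For readers following the profile above, here is a minimal Java sketch of
what the quoted relativize expression tests and of one cheaper way to ask the
same question. It is an illustration only and is not the attached patch. The
class and method names (PathPrefixCheck, isUnderViaRelativize,
isUnderViaPrefix) and the sample hdfs://nn/... URIs are hypothetical, and the
prefix test assumes both paths are already normalized and share the same
scheme and authority.

```java
import java.net.URI;

public class PathPrefixCheck {

    // Same idea as the expression quoted in the issue: URI.relativize returns
    // its argument unchanged when 'child' is not located under 'parent', so
    // "result differs from child" means "child is under parent".
    // Each call re-normalizes both paths, which matches the URI.normalize
    // samples in the profile.
    static boolean isUnderViaRelativize(URI parent, URI child) {
        return !parent.relativize(child).equals(child);
    }

    // Hypothetical cheaper check on path strings that are ASSUMED to be
    // normalized already and to belong to the same file system.
    static boolean isUnderViaPrefix(String parentPath, String childPath) {
        if (!childPath.startsWith(parentPath)) {
            return false;
        }
        // Require a path-segment boundary so that ".../ds=1" does not
        // match ".../ds=10/...".
        return childPath.length() == parentPath.length()
                || parentPath.endsWith("/")
                || childPath.charAt(parentPath.length()) == '/';
    }

    public static void main(String[] args) {
        URI parent = URI.create("hdfs://nn/warehouse/t/ds=1");
        URI inside = URI.create("hdfs://nn/warehouse/t/ds=1/000000_0");
        URI sibling = URI.create("hdfs://nn/warehouse/t/ds=10/000000_0");

        System.out.println(isUnderViaRelativize(parent, inside));
        System.out.println(isUnderViaRelativize(parent, sibling));
        System.out.println(isUnderViaPrefix(parent.getPath(), inside.getPath()));
        System.out.println(isUnderViaPrefix(parent.getPath(), sibling.getPath()));
    }
}
```

If a check like this runs once per row rather than once per input file, the
per-call normalization cost adds up quickly. Doing the comparison on
pre-normalized strings, or caching the result per input file, are two ways
to avoid repeating that work.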



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
