[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748416#comment-16748416 ]

Teddy Choi commented on HIVE-20419:

Revised and pushed to master. Thanks [~gopalv].

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20419.1.patch, HIVE-20419.2.patch, HIVE-20419.4.patch
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
> {code}
> HiveServer2-Background-Pool: Thread-6049 State: RUNNABLE CPU usage on sample: 621ms
> java.util.HashMap$TreeNode.find(int, Object, Class) HashMap.java:1869 <7 recursive calls>
> java.util.HashMap$TreeNode.putTreeVal(HashMap, HashMap$Node[], int, Object, Object) HashMap.java:1989
> java.util.HashMap.putVal(int, Object, Object, boolean, boolean) HashMap.java:637
> java.util.HashMap.put(Object, Object) HashMap.java:611
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.addVectorPartitionDesc(PartitionDesc, VectorPartitionDesc, Map) Vectorizer.java:1272
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.verifyAndSetVectorPartDesc(PartitionDesc, boolean, List, Set, Map, Set, ArrayList, Set) Vectorizer.java:1323
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateInputFormatAndSchemaEvolution(MapWork, String, TableScanOperator, Vectorizer$VectorTaskColumnInfo) Vectorizer.java:1654
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(MapWork, Vectorizer$VectorTaskColumnInfo, boolean) Vectorizer.java:1865
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(MapWork, boolean) Vectorizer.java:1109
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Node, Stack, Object[]) Vectorizer.java:961
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(Node, Stack, TaskGraphWalker$TaskGraphWalkerContext) TaskGraphWalker.java:111
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(Node) TaskGraphWalker.java:180
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(Collection, HashMap) TaskGraphWalker.java:125
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(PhysicalContext) Vectorizer.java:2442
> org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(List, ParseContext, Context) TezCompiler.java:717
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(ParseContext, List, HashSet, HashSet) TaskCompiler.java:258
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContextFactory) SemanticAnalyzer.java:12443
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(ASTNode) CalcitePlanner.java:358
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
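For readers unfamiliar with the failure mode in the issue description, here is a minimal sketch of what happens when an object's hashCode and equals change while it is a HashMap key. The `Desc` class and its `columnCount` field are hypothetical stand-ins, not Hive's VectorPartitionDesc; the sketch only illustrates the general java.util.HashMap behaviour.

```java
import java.util.HashMap;
import java.util.Map;

final class MutableKeyDemo {

  /** Hypothetical stand-in: hashCode and equals depend on a mutable field. */
  static final class Desc {
    private int columnCount;

    Desc(int columnCount) { this.columnCount = columnCount; }

    void setColumnCount(int columnCount) { this.columnCount = columnCount; }

    @Override public boolean equals(Object o) {
      return o instanceof Desc && ((Desc) o).columnCount == columnCount;
    }

    @Override public int hashCode() { return Integer.hashCode(columnCount); }
  }

  /** Reports whether a key can still be looked up after it was mutated inside the map. */
  public static boolean foundAfterMutation() {
    Map<Desc, Desc> map = new HashMap<>();
    Desc key = new Desc(1);
    map.put(key, key);
    key.setColumnCount(2);   // the map still files the entry under the old hash
    return map.containsKey(key);
  }

  /** Re-inserts the mutated key and reports how many entries the map then holds. */
  public static int sizeAfterReinsert() {
    Map<Desc, Desc> map = new HashMap<>();
    Desc key = new Desc(1);
    map.put(key, key);
    key.setColumnCount(2);
    map.put(key, key);       // the stale entry is not matched, so a new one is added
    return map.size();
  }

  public static void main(String[] args) {
    System.out.println("found after mutation: " + foundAfterMutation());
    System.out.println("size after re-insert: " + sizeAfterReinsert());
  }
}
```

The map records the hash at insertion time, so the lookup with the new hash misses the stale entry and the same object ends up stored twice.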
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748357#comment-16748357 ]

Gopal V commented on HIVE-20419:

LGTM - +1

[~teddy.choi]: can you fix, before commit?

{code}
+dataTypeInfos = new TypeInfo[0];
{code}

instead use an EMPTY_TYPEINFO_ARRAY singleton.

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20419.1.patch, HIVE-20419.2.patch
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
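The review request above asks for a shared empty-array constant instead of a fresh `new TypeInfo[0]` per object. A minimal sketch of that pattern follows. Only the names `EMPTY_TYPEINFO_ARRAY`, `TypeInfo` and `dataTypeInfos` come from the comment; the `TypeInfo` class here is an empty stand-in so the example compiles on its own, and `PartitionDescSketch` is a hypothetical holder. Where the constant actually lives in Hive is not shown in this thread.

```java
final class EmptyArraySingletonDemo {

  /** Empty stand-in for Hive's TypeInfo, present only to keep the sketch self-contained. */
  static final class TypeInfo { }

  /** Hypothetical holder class illustrating the pattern. */
  static final class PartitionDescSketch {
    // A zero-length array has no elements to modify, so one instance can be shared safely.
    private static final TypeInfo[] EMPTY_TYPEINFO_ARRAY = new TypeInfo[0];

    private final TypeInfo[] dataTypeInfos;

    PartitionDescSketch(TypeInfo[] dataTypeInfos) {
      // Reuse the shared constant for "no types"; copy non-empty input defensively.
      this.dataTypeInfos = (dataTypeInfos == null || dataTypeInfos.length == 0)
          ? EMPTY_TYPEINFO_ARRAY
          : dataTypeInfos.clone();
    }

    TypeInfo[] getDataTypeInfos() { return dataTypeInfos; }
  }

  public static void main(String[] args) {
    PartitionDescSketch a = new PartitionDescSketch(null);
    PartitionDescSketch b = new PartitionDescSketch(new TypeInfo[0]);
    System.out.println("shared instance: " + (a.getDataTypeInfos() == b.getDataTypeInfos()));
  }
}
```

Sharing the constant avoids one small allocation per descriptor, which adds up when a plan carries one descriptor per partition.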
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748322#comment-16748322 ]

Hive QA commented on HIVE-20419:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12955712/HIVE-20419.2.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

SUCCESS: +1 due to 15697 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15724/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15724/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15724/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12955712 - PreCommit-HIVE-Build

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20419.1.patch, HIVE-20419.2.patch
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748294#comment-16748294 ]

Hive QA commented on HIVE-20419:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 8m 4s | master passed |
| +1 | compile | 0m 59s | master passed |
| +1 | checkstyle | 0m 40s | master passed |
| 0 | findbugs | 3m 31s | ql in master has 2310 extant Findbugs warnings. |
| +1 | javadoc | 0m 52s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 21s | the patch passed |
| +1 | compile | 0m 58s | the patch passed |
| +1 | javac | 0m 58s | the patch passed |
| -1 | checkstyle | 0m 42s | ql: The patch generated 1 new + 616 unchanged - 1 fixed = 617 total (was 617) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 45s | the patch passed |
| +1 | javadoc | 0m 53s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 12s | The patch does not generate ASF License warnings. |
| | | 22m 23s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15724/dev-support/hive-personality.sh |
| git revision | master / 34c8ca4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15724/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15724/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20419.1.patch, HIVE-20419.2.patch
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748076#comment-16748076 ]

Hive QA commented on HIVE-20419:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 7m 46s | master passed |
| +1 | compile | 1m 0s | master passed |
| +1 | checkstyle | 0m 40s | master passed |
| 0 | findbugs | 3m 31s | ql in master has 2310 extant Findbugs warnings. |
| +1 | javadoc | 0m 52s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 21s | the patch passed |
| +1 | compile | 1m 1s | the patch passed |
| +1 | javac | 1m 1s | the patch passed |
| -1 | checkstyle | 0m 39s | ql: The patch generated 1 new + 616 unchanged - 1 fixed = 617 total (was 617) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 49s | the patch passed |
| +1 | javadoc | 0m 52s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 22m 15s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-15720/dev-support/hive-personality.sh |
| git revision | master / 34c8ca4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-15720/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-15720/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20419.1.patch
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747968#comment-16747968 ]

ASF GitHub Bot commented on HIVE-20419:

GitHub user pudidic opened a pull request:

    https://github.com/apache/hive/pull/518

    HIVE-20419: Vectorization: Prevent mutation of VectorPartitionDesc af…

    …ter being used in a hashmap key (Teddy Choi)

    Change-Id: Id2219d74cd09db8efc6c464acef27aa0bb95fe2b

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pudidic/hive HIVE-20419

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #518

commit 45df1940daa816148cec44e282119aa62b65fb26
Author: Teddy Choi
Date: 2019-01-21T08:31:11Z

    HIVE-20419: Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key (Teddy Choi)

    Change-Id: Id2219d74cd09db8efc6c464acef27aa0bb95fe2b

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664417#comment-16664417 ]

Gopal V commented on HIVE-20419:

No, right now it just duplicates the desc for each partition and makes the plan object bigger than it should be.

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Priority: Major
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
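The duplication described in the comment above can be modelled with a small de-duplication pool. This is a simplified sketch and not Hive's Vectorizer code: `Desc`, `intern` and both `poolSize...` methods are hypothetical names chosen for illustration. It compares registering a descriptor before it is fully populated with registering it afterwards.

```java
import java.util.HashMap;
import java.util.Map;

final class DescDedupDemo {

  /** Hypothetical descriptor whose hashCode and equals depend on a mutable field. */
  static final class Desc {
    private int columnCount;

    Desc(int columnCount) { this.columnCount = columnCount; }

    void setColumnCount(int columnCount) { this.columnCount = columnCount; }

    @Override public boolean equals(Object o) {
      return o instanceof Desc && ((Desc) o).columnCount == columnCount;
    }

    @Override public int hashCode() { return Integer.hashCode(columnCount); }
  }

  /** Returns the pooled instance equal to desc, registering desc if none exists yet. */
  static Desc intern(Map<Desc, Desc> pool, Desc desc) {
    Desc existing = pool.putIfAbsent(desc, desc);
    return existing != null ? existing : desc;
  }

  /** Problematic order: register first, finish populating afterwards. */
  public static int poolSizeMutatingAfterInsert(int partitions) {
    Map<Desc, Desc> pool = new HashMap<>();
    for (int i = 0; i < partitions; i++) {
      Desc d = intern(pool, new Desc(0)); // stored under the hash of the unpopulated state
      d.setColumnCount(5);                // earlier entries no longer equal a fresh Desc(0)
    }
    return pool.size();
  }

  /** Safer order: populate completely, then register and never mutate again. */
  public static int poolSizeBuildingBeforeInsert(int partitions) {
    Map<Desc, Desc> pool = new HashMap<>();
    for (int i = 0; i < partitions; i++) {
      Desc d = new Desc(0);
      d.setColumnCount(5);
      intern(pool, d);
    }
    return pool.size();
  }

  public static void main(String[] args) {
    System.out.println("mutate after insert: " + poolSizeMutatingAfterInsert(100));
    System.out.println("build before insert: " + poolSizeBuildingBeforeInsert(100));
  }
}
```

In the first variant every entry is stored under the same hash but none compares equal to the next probe, so the pool keeps one descriptor per partition and all of them pile up in a single bucket. In the second variant the pool holds a single shared descriptor.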
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664409#comment-16664409 ]

Sergey Shelukhin commented on HIVE-20419:

Hmm. Cannot this also cause incorrect results? If the key is not found later.

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Priority: Major
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.
[jira] [Commented] (HIVE-20419) Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
[ https://issues.apache.org/jira/browse/HIVE-20419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664410#comment-16664410 ]

Sergey Shelukhin commented on HIVE-20419:

cc [~teddy.choi]

> Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key
>
> Key: HIVE-20419
> URL: https://issues.apache.org/jira/browse/HIVE-20419
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Gopal V
> Priority: Major
>
> This is going into the loop because the VectorPartitionDesc is modified after
> it is used in the HashMap key - resulting in a hashcode & equals modification
> after it has been placed in the hashmap.