[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548192#comment-13548192 ]

Hudson commented on HIVE-3693:

Integrated in Hive-trunk-hadoop2 #54 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
HIVE-3693 : Performance regression introduced by HIVE-3483 (Thejas Nair via Ashutosh Chauhan) (Revision 1415098)

Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1415098
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java

> Performance regression introduced by HIVE-3483
>
> Key: HIVE-3693
> URL: https://issues.apache.org/jira/browse/HIVE-3693
> Project: Hive
> Issue Type: Bug
> Reporter: Gang Tim Liu
> Assignee: Thejas M Nair
> Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3693.1.patch
>
> https://issues.apache.org/jira/browse/HIVE-3483 introduced a performance regression in the client side during split computation.
> The client side spends a lot more time in the split computation phase. The problem is checkFilterPathContains method.
> While investigating, can you create a config to disable it by default?
> thanks

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506608#comment-13506608 ]

Hudson commented on HIVE-3693:

Integrated in Hive-trunk-h0.21 #1825 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1825/])
HIVE-3693 : Performance regression introduced by HIVE-3483 (Thejas Nair via Ashutosh Chauhan) (Revision 1415098)

Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1415098
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506158#comment-13506158 ]

Ashutosh Chauhan commented on HIVE-3693:

+1. Will commit if tests pass.

> Performance regression introduced by HIVE-3483
>
> Key: HIVE-3693
> URL: https://issues.apache.org/jira/browse/HIVE-3693
> Project: Hive
> Issue Type: Bug
> Reporter: Gang Tim Liu
> Assignee: Thejas M Nair
> Priority: Minor
> Attachments: HIVE-3693.1.patch
>
> https://issues.apache.org/jira/browse/HIVE-3483 introduced a performance regression in the client side during split computation.
> The client side spends a lot more time in the split computation phase. The problem is checkFilterPathContains method.
> While investigating, can you create a config to disable it by default?
> thanks
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493777#comment-13493777 ]

Gang Tim Liu commented on HIVE-3693:

Yes, Namit.

> Performance regression introduced by HIVE-3483
>
> Key: HIVE-3693
> URL: https://issues.apache.org/jira/browse/HIVE-3693
> Project: Hive
> Issue Type: Bug
> Reporter: Gang Tim Liu
> Priority: Critical
>
> https://issues.apache.org/jira/browse/HIVE-3483 introduced a performance regression in the client side during split computation.
> The client side spends a lot more time in the split computation phase. The problem is checkFilterPathContains method.
> While investigating, can you create a config to disable it by default?
> thanks
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493775#comment-13493775 ]

Namit Jain commented on HIVE-3693:

I agree. Tim, can you file a patch which undoes HIVE-3483? We can think about how to fix HIVE-3483 subsequently.
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493770#comment-13493770 ]

Gang Tim Liu commented on HIVE-3693:

OK. Maybe we revert HIVE-3483 first? Then, find another solution for HIVE-3483. Thanks a lot.
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493725#comment-13493725 ]

Thejas M Nair commented on HIVE-3693:

More change is required than just adding a Shell.WINDOWS check in one place, because the code has been changed to use Path instead of string representations of paths. This was done because Path handles issues such as some string representations having paths that start with "/C:" (the result of Path.toUri().getPath()), while others have "C:" (Path.toString()).

But once Path comparisons were introduced, I saw issues because the path given in CombineFilter.accept(path) didn't have a scheme while the paths in CombineFilter.filterPaths did, hence the change to make the given path fully qualified in checkFilterPathContains.

Though using Path makes the code more extensible (e.g. for a query against files on different file systems, the scheme should be considered), I think a more holistic change is required to use it (HIVE-3616).

As a quick fix, I think the code can be changed back to using strings instead of Path, and the way the path in accept(Path path) is converted to a string can be changed (replace Path.toString() with Path.toUri().toString()).
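To make the string-based idea above concrete, here is a minimal sketch in plain Java. It is not the code from HIVE-3693.1.patch. It uses java.net.URI as a stand-in for Hadoop's Path so that it has no Hadoop dependency, the names StringPathFilterSketch, addPath and normalize are hypothetical, and it compares only the path component of each URI (ignoring scheme and authority), which is a simplifying assumption rather than the exact conversion proposed in the comment.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a string-based combine filter; not Hive's class.
final class StringPathFilterSketch {
    // Registered directories, stored once as normalized strings.
    private final Set<String> filterPaths = new HashSet<>();

    // Register a directory given as a URI string, with or without a scheme.
    void addPath(String uriString) {
        filterPaths.add(normalize(uriString));
    }

    // Keep only the URI path component and drop trailing slashes, so that
    // "file:///C:/warehouse/t/" and "/C:/warehouse/t" compare equal.
    // Assumes hierarchical URIs; opaque URIs normalize to the empty string.
    static String normalize(String uriString) {
        String p = URI.create(uriString).getPath();
        if (p == null) {
            return "";
        }
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    // Accept a file if it, or any ancestor directory, was registered.
    // Each step is a hash lookup on a string; no file system is consulted.
    boolean accept(String uriString) {
        String p = normalize(uriString);
        while (!p.isEmpty()) {
            if (filterPaths.contains(p)) {
                return true;
            }
            int slash = p.lastIndexOf('/');
            if (slash < 0) {
                break; // relative path with no parent left
            }
            // Step up to the parent; the parent of "/x" is the root "/".
            p = (slash == 0 && p.length() > 1) ? "/" : p.substring(0, slash);
        }
        return false;
    }
}
```

Because both sides are reduced to the same string form when they enter the filter, a scheme-less path passed to accept can still match a registered path that carried a scheme. The trade-off, as noted in the comment above, is that two file systems with identical directory layouts would not be told apart.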
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493721#comment-13493721 ]

Gang Tim Liu commented on HIVE-3693:

A simpler and quicker way is to check Shell.WINDOWS before calling the method. Could we do this before the simpler patch you refer to? It would also be consistent with the other Windows-related patches. Thanks a lot.
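The guard suggested above could look roughly like the following sketch. It is only an illustration of the idea: the WINDOWS flag is computed from the os.name system property so that the example has no Hadoop dependency and is meant to play the role of Shell.WINDOWS, while the class WindowsGuardSketch, the method toComparable and its drive-letter handling are hypothetical and not taken from Hive.

```java
// Hypothetical sketch of an OS guard around Windows-only path handling.
final class WindowsGuardSketch {
    // Stand-in for an "are we on Windows" flag, derived from os.name.
    static final boolean WINDOWS =
            System.getProperty("os.name").startsWith("Windows");

    // Return the string used for comparisons. On non-Windows platforms the
    // input is returned untouched, so the common case pays nothing extra.
    // The flag is a parameter so both branches can be exercised anywhere.
    static String toComparable(String path, boolean windows) {
        if (!windows) {
            return path; // fast path: skip all Windows-specific handling
        }
        // Turn "/C:/dir" into "C:/dir" so both spellings compare equal.
        if (path.length() >= 3
                && path.charAt(0) == '/'
                && Character.isLetter(path.charAt(1))
                && path.charAt(2) == ':') {
            return path.substring(1);
        }
        return path;
    }

    // Convenience overload that uses the platform flag.
    static String toComparable(String path) {
        return toComparable(path, WINDOWS);
    }
}
```

A guard like this only limits where the extra work happens. It does not address code that was already switched from strings to Path objects.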
[jira] [Commented] (HIVE-3693) Performance regression introduced by HIVE-3483
[ https://issues.apache.org/jira/browse/HIVE-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493716#comment-13493716 ]

Thejas M Nair commented on HIVE-3693:

Sorry about that! I see it can cause performance issues when there is a large number of partitions. I am working on a simpler patch that does not try to deal with so many issues.