[ https://issues.apache.org/jira/browse/HIVE-22964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17054325#comment-17054325 ]
Peter Vary commented on HIVE-22964:
-----------------------------------

Hi [~aditya-shah],

* I have found this for renaming the configuration key: [https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/conf/Configuration.html#addDeprecation-java.lang.String-java.lang.String-java.lang.String-] We should check that it works as advertised, and then we can go ahead and rename the configuration key.
* HIVE-13120: The last comment states:
{quote}Since the ORCInputformat is cached in `FetchOperator.java`, the UGI in `Context.threadpool` thread will be userA always.
{quote}
This suggests to me that the problem was that we cached the ORCInputFormat. Do we have any such problem here?
* MMPathInfo: We might just use two synchronizedList instances, or some "Concurrent" implementation, as the {{finalPaths}} and {{pathsWithFileOriginals}} parameters of the processPathsForMmRead method, and get away without additional objects. Or did you see serious performance degradation there because of the synchronization?

Thanks for taking care of this!

Peter

> MM table split computation is very slow
> ---------------------------------------
>
>                 Key: HIVE-22964
>                 URL: https://issues.apache.org/jira/browse/HIVE-22964
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Aditya Shah
>            Assignee: Aditya Shah
>            Priority: Major
>         Attachments: HIVE-22964.patch
>
>
> Since for MM table we process the paths prior to inputFormat.getSplits() we
> end up doing listing on the whole table at once. This could be optimized.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
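
For reference, the synchronizedList idea from the MMPathInfo point in the comment above could look roughly like the sketch below. It is only an illustration, not the Hive code: the class name, {{processDir}}, {{collect}} and the "delta_" check are hypothetical stand-ins, and only the two list names are taken from the comment. Several worker threads append to two shared lists that were wrapped with {{Collections.synchronizedList}}, so no extra holder object is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SynchronizedPathLists {

    // Hypothetical stand-in for the per-directory work. This is NOT Hive's
    // processPathsForMmRead; it only shows two shared output lists being
    // filled from several threads.
    static void processDir(String dir, List<String> finalPaths,
                           List<String> pathsWithFileOriginals) {
        if (dir.contains("delta_")) {
            finalPaths.add(dir);              // add() is synchronized by the wrapper
        } else {
            pathsWithFileOriginals.add(dir);
        }
    }

    // Processes every directory on a fixed-size pool and returns
    // [finalPaths, pathsWithFileOriginals].
    static List<List<String>> collect(List<String> dirs, int threads) throws Exception {
        List<String> finalPaths = Collections.synchronizedList(new ArrayList<>());
        List<String> pathsWithFileOriginals = Collections.synchronizedList(new ArrayList<>());

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String dir : dirs) {
                futures.add(pool.submit(
                        () -> processDir(dir, finalPaths, pathsWithFileOriginals)));
            }
            for (Future<?> f : futures) {
                f.get();                      // wait for completion, surface failures
            }
        } finally {
            pool.shutdown();
        }
        return Arrays.asList(finalPaths, pathsWithFileOriginals);
    }

    public static void main(String[] args) throws Exception {
        List<String> dirs = Arrays.asList("delta_1", "base_1", "delta_2", "orig_file_dir");
        List<List<String>> result = collect(dirs, 4);
        System.out.println("finalPaths size: " + result.get(0).size());
        System.out.println("pathsWithFileOriginals size: " + result.get(1).size());
    }
}
```

Two caveats with this approach: the element order in the lists depends on thread scheduling, and iterating over a synchronizedList still requires holding the list's lock manually. Whether the lock contention is measurable for large tables would need to be checked, as asked in the comment.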