[jira] [Commented] (HIVE-11525) Bucket pruning
[ https://issues.apache.org/jira/browse/HIVE-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708472#comment-14708472 ]

Sergey Shelukhin commented on HIVE-11525:
-----------------------------------------

I only cloned it for convenience. The idea is precisely to have two separate JIRAs for two separate optimizations, since they don't have to be done at the same time.

Bucket pruning
--------------

                 Key: HIVE-11525
                 URL: https://issues.apache.org/jira/browse/HIVE-11525
             Project: Hive
          Issue Type: Improvement
          Components: Logical Optimizer
    Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.0
            Reporter: Maciek Kocon
            Assignee: Takuya Fukudome
              Labels: gsoc2015

Logically and functionally, bucketing and partitioning are quite similar: both provide a mechanism to segregate and separate the table's data based on its content. Thanks to that, significant further optimisations such as [partition] PRUNING or [bucket] MAP JOIN are possible.

The difference is imposed by design: PARTITIONing is open/explicit while BUCKETing is discrete/implicit. Partitioning is a very common, if not standard, feature in current RDBMSs, while BUCKETing is Hive-specific. In a way, BUCKETing could also be called hashing, or simply IMPLICIT PARTITIONING.

Even though these are recognised as two separate features in Hive, nothing should prevent leveraging the same existing query/join optimisations across the two.

BUCKET pruning: enable the equivalent of partition PRUNING for queries on BUCKETED tables. The simplest example is for queries like

SELECT … FROM x WHERE colA=123123

to read only the relevant bucket file rather than all file-buckets that belong to the table.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
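For context, the mapping the ticket relies on is simple to state: Hive places a row into bucket (hash & Integer.MAX_VALUE) % numBuckets, and for an integer column the hash is the value itself, so an equality predicate pins the scan to exactly one bucket file. A minimal sketch (class and method names here are hypothetical illustrations, not Hive APIs):

```java
// Hypothetical sketch of bucket pruning for an integer bucketing column.
// A row with colA = v lands in bucket (v & Integer.MAX_VALUE) % numBuckets,
// so WHERE colA = <constant> only ever matches rows in that one bucket file.
public class BucketPruneSketch {
    static int bucketFor(int colValue, int numBuckets) {
        // Mask the sign bit before the modulo so negative values map cleanly.
        return (colValue & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // For a table CLUSTERED BY (colA) INTO 32 BUCKETS, the query
        // SELECT ... FROM x WHERE colA = 123123 only needs one of 32 files.
        System.out.println(bucketFor(123123, 32));
    }
}
```

The optimizer would then restrict the input paths of the table scan to that single file-bucket instead of listing all of them.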
[jira] [Updated] (HIVE-11617) Explain plan for multiple lateral views is very slow
[ https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-11617:
----------------------------
    Attachment: HIVE-11617.patch

Explain plan for multiple lateral views is very slow
----------------------------------------------------

                 Key: HIVE-11617
                 URL: https://issues.apache.org/jira/browse/HIVE-11617
             Project: Hive
          Issue Type: Bug
          Components: Logical Optimizer
            Reporter: Aihua Xu
            Assignee: Aihua Xu
         Attachments: HIVE-11617.patch

The following EXPLAIN job will be very slow, or will never finish, if many lateral views are involved. High CPU usage is also noticeable.

{noformat}
EXPLAIN
SELECT * FROM (
  SELECT * FROM table1
) x
LATERAL VIEW json_tuple(...) x1
LATERAL VIEW json_tuple(...) x2
...
{noformat}

From jstack, the job is busy with a preorder tree traversal:

{noformat}
at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
at java.util.regex.Matcher.reset(Matcher.java:308)
at java.util.regex.Matcher.init(Matcher.java:228)
at java.util.regex.Pattern.matcher(Pattern.java:1088)
at org.apache.hadoop.hive.ql.lib.RuleRegExp.cost(RuleRegExp.java:67)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:72)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
[... the PreOrderWalker.walk(PreOrderWalker.java:61) frame repeats dozens more times ...]
{noformat}
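One plausible reading of the deeply recursive stack above: with many lateral views the operator graph contains branches that reconverge (each LATERAL VIEW joins the base rows with the UDTF output), and a preorder walk that does not track visited nodes re-enters every shared subtree once per path, so the work doubles at each level. The sketch below (a hypothetical Node type, not Hive's actual PreOrderWalker) contrasts the naive walk with a visited-set walk on such a "stacked diamond" graph:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model: each "lateral view" level is a diamond whose two branches
// reconverge on the level below, so the whole plan is a DAG, not a tree.
public class WalkSketch {
    static final class Node {
        final List<Node> children = new ArrayList<>();
    }

    // Naive preorder walk: re-enters shared subtrees once per incoming path.
    static int naiveWalk(Node n) {
        int visits = 1;
        for (Node c : n.children) visits += naiveWalk(c);
        return visits;
    }

    // Deduplicated walk: each node is dispatched exactly once.
    static int dedupWalk(Node n, Set<Node> seen) {
        if (!seen.add(n)) return 0;
        int visits = 1;
        for (Node c : n.children) visits += dedupWalk(c, seen);
        return visits;
    }

    // Build `levels` stacked diamonds: top -> {left, right} -> shared bottom.
    static Node diamondChain(int levels) {
        Node bottom = new Node();
        for (int i = 0; i < levels; i++) {
            Node left = new Node(), right = new Node(), top = new Node();
            left.children.add(bottom);
            right.children.add(bottom);
            top.children.add(left);
            top.children.add(right);
            bottom = top;
        }
        return bottom;
    }

    public static void main(String[] args) {
        // Even 3 levels already shows the blow-up: 29 visits for 10 nodes.
        System.out.println(naiveWalk(diamondChain(3)));
        // 20 levels is 61 nodes; a dedup walk visits each exactly once,
        // while the naive walk would need on the order of 2^20 visits.
        System.out.println(dedupWalk(diamondChain(20), new HashSet<>()));
    }
}
```

The same shape explains why the regex-based rule dispatch (RuleRegExp.cost) shows up so hot: it runs once per node visit, and the visit count, not the pattern cost, is what explodes.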
[jira] [Updated] (HIVE-11617) Explain plan for multiple lateral views is very slow
[ https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-11617:
----------------------------
    Attachment:     (was: HIVE-11617.patch)
[jira] [Commented] (HIVE-11617) Explain plan for multiple lateral views is very slow
[ https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708416#comment-14708416 ]

Aihua Xu commented on HIVE-11617:
---------------------------------

It seems my change causes the plan not to output some necessary info. I will fix that.
[jira] [Updated] (HIVE-11617) Explain plan for multiple lateral views is very slow
[ https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-11617:
----------------------------
    Attachment: HIVE-11617.patch
[jira] [Updated] (HIVE-11617) Explain plan for multiple lateral views is very slow
[ https://issues.apache.org/jira/browse/HIVE-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-11617:
----------------------------
    Attachment:     (was: HIVE-11617.patch)
[jira] [Commented] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator
[ https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708591#comment-14708591 ]

Hive QA commented on HIVE-11623:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751932/HIVE-11623.01.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9377 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby_empty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_subq_not_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_udf_udaf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_windowing
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5046/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5046/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5046/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751932 - PreCommit-HIVE-TRUNK-Build

CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator
-------------------------------------------------------------------------------------------------

                 Key: HIVE-11623
                 URL: https://issues.apache.org/jira/browse/HIVE-11623
             Project: Hive
          Issue Type: Sub-task
          Components: CBO
            Reporter: Pengcheng Xiong
            Assignee: Pengcheng Xiong
         Attachments: HIVE-11623.01.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708627#comment-14708627 ]

Navis commented on HIVE-10890:
------------------------------

[~nemon] It's named a selector, but you can also implement a more sophisticated strategy in it.

Provide implementable engine selector
-------------------------------------

                 Key: HIVE-10890
                 URL: https://issues.apache.org/jira/browse/HIVE-10890
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Navis
            Assignee: Navis
            Priority: Trivial

Hive now supports three kinds of execution engines. It would be good to have an automatic engine selector, so the engine does not have to be set explicitly for execution.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
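To make the idea concrete, a pluggable selector could look roughly like the sketch below. This is purely illustrative: the interface, names, and strategy here are hypothetical, and the actual shape proposed in HIVE-10890 lives in the attached patch and may differ.

```java
// Hypothetical sketch of an "implementable engine selector": instead of a
// fixed hive.execution.engine setting, a user-supplied strategy picks an
// engine per query from simple query characteristics.
public class EngineSelectorSketch {
    enum Engine { MR, TEZ, SPARK }

    // The pluggable point: implementations can be as simple or as
    // sophisticated a strategy as they like.
    interface EngineSelector {
        Engine select(long estimatedInputBytes, int numJoins);
    }

    // Example strategy: small ad-hoc queries go to Tez for low latency,
    // heavy multi-join queries to Spark, everything else to MapReduce.
    static final EngineSelector SIZE_BASED = (bytes, joins) -> {
        if (bytes < (1L << 30) && joins <= 2) return Engine.TEZ;
        return joins > 5 ? Engine.SPARK : Engine.MR;
    };

    public static void main(String[] args) {
        System.out.println(SIZE_BASED.select(1L << 20, 1)); // small query
        System.out.println(SIZE_BASED.select(1L << 40, 8)); // heavy query
    }
}
```

The point of making the selector an interface rather than a config flag is exactly Navis's comment: a "selector" in name, but any strategy in implementation.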
[jira] [Updated] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator
[ https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-11623:
-----------------------------------
    Attachment: HIVE-11623.01.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-10890:
-------------------------
    Attachment: HIVE-10890.1.patch.txt

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10890) Provide implementable engine selector
[ https://issues.apache.org/jira/browse/HIVE-10890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708642#comment-14708642 ]

Hive QA commented on HIVE-10890:
--------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12751943/HIVE-10890.1.patch.txt

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5047/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5047/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5047/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5047/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at e2d148b HIVE-11586: ObjectInspectorFactory.getReflectionObjectInspector is not thread-safe (Jimmy, reviewed by Szehon, Xuefu)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at e2d148b HIVE-11586: ObjectInspectorFactory.getReflectionObjectInspector is not thread-safe (Jimmy, reviewed by Szehon, Xuefu)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12751943 - PreCommit-HIVE-TRUNK-Build

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11521) Loop optimization for SIMD in logical operators
[ https://issues.apache.org/jira/browse/HIVE-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708756#comment-14708756 ]

Ashutosh Chauhan commented on HIVE-11521:
-----------------------------------------

+1

Loop optimization for SIMD in logical operators
-----------------------------------------------

                 Key: HIVE-11521
                 URL: https://issues.apache.org/jira/browse/HIVE-11521
             Project: Hive
          Issue Type: Sub-task
            Reporter: Teddy Choi
            Assignee: Teddy Choi
            Priority: Minor
         Attachments: HIVE-11521.patch

The JVM is quite strict about the code shape that may be executed with SIMD instructions. Take a loop in ColOrCol.java for example:

{code}
for (int i = 0; i != n; i++) {
  outputVector[i] = vector1[0] | vector2[i];
}
{code}

The vector1[0] reference prevents the JVM from executing this part of the code with vectorized instructions; we need to assign vector1[0] to a variable outside the loop and use that variable in the loop. This issue covers the AND, OR, and NOT logical operators.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
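The hoisting described in the issue can be sketched outside Hive as follows (a hypothetical standalone harness; the real change lands in Hive's vectorized logical-operator expressions such as ColOrCol). Both loops compute identical results; the second simply moves the invariant array load out of the loop, giving the body a pure element-wise shape that HotSpot's auto-vectorizer is far more likely to accept:

```java
import java.util.Arrays;

public class OrLoopSketch {
    // Shape from the ticket: vector1[0] is re-loaded from the array on
    // every iteration, which blocks auto-vectorization of the loop.
    static void orRepeatingSlow(long[] out, long[] v1, long[] v2, int n) {
        for (int i = 0; i != n; i++) {
            out[i] = v1[0] | v2[i];
        }
    }

    // Proposed shape: hoist the invariant value into a local once, leaving
    // a simple element-wise OR the JIT can compile to SIMD instructions.
    static void orRepeatingFast(long[] out, long[] v1, long[] v2, int n) {
        long repeated = v1[0];               // hoisted outside the loop
        for (int i = 0; i != n; i++) {
            out[i] = repeated | v2[i];
        }
    }

    public static void main(String[] args) {
        long[] v1 = {0b1010};
        long[] v2 = {0b0001, 0b0100, 0b1111};
        long[] a = new long[3], b = new long[3];
        orRepeatingSlow(a, v1, v2, 3);
        orRepeatingFast(b, v1, v2, 3);
        // Same results either way; only the generated machine code differs.
        System.out.println(Arrays.equals(a, b));
    }
}
```

The transformation is behavior-preserving by construction, which is why it is safe to apply uniformly across the AND, OR, and NOT vectorized operators.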
[jira] [Commented] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708795#comment-14708795 ] Ashutosh Chauhan commented on HIVE-11341: - [~hsubramaniyan] Can you briefly describe what you are trying to achieve in this patch? Avoid expensive resizing of ASTNode tree - Key: HIVE-11341 URL: https://issues.apache.org/jira/browse/HIVE-11341 Project: Hive Issue Type: Bug Components: Hive, Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11341.1.patch, HIVE-11341.2.patch, HIVE-11341.3.patch, HIVE-11341.4.patch, HIVE-11341.5.patch, HIVE-11341.6.patch, HIVE-11341.7.patch, HIVE-11341.8.patch, HIVE-11341.9.patch {code}
Stack Trace | Sample Count | Percentage(%)
parse.BaseSemanticAnalyzer.analyze(ASTNode, Context) 1,605 90
parse.CalcitePlanner.analyzeInternal(ASTNode) 1,605 90
parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) 1,605 90
parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) 1,604 90
parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) 1,604 90
parse.SemanticAnalyzer.genPlan(QB) 1,604 90
parse.SemanticAnalyzer.genPlan(QB, boolean) 1,604 90
parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map) 1,604 90
parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean) 1,603 90
parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean) 1,603 90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean) 1,603 90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) 1,603 90
parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx) 1,603 90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx) 1,603 90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory) 1,603 90
lib.DefaultGraphWalker.startWalking(Collection, HashMap) 1,579 89
lib.DefaultGraphWalker.walk(Node) 1,571 89
java.util.ArrayList.removeAll(Collection) 1,433 81
java.util.ArrayList.batchRemove(Collection, boolean) 1,433 81
java.util.ArrayList.contains(Object) 1,228 69
java.util.ArrayList.indexOf(Object) 1,228 69
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
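The profile above bottoms out in ArrayList.removeAll. The following is not the HIVE-11341 patch, but an illustration of why that call dominates: membership checks in an ArrayList are linear, so pruning dispatched nodes costs O(n*m), while a HashSet makes each check O(1). All names here (pruneWithList, pruneWithSet, toWalk, dispatched) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalkerPruneSketch {
    // Pattern from the trace: ArrayList.removeAll -> contains -> indexOf, O(n*m)
    static List<String> pruneWithList(List<String> toWalk, List<String> dispatched) {
        List<String> remaining = new ArrayList<>(toWalk);
        remaining.removeAll(dispatched); // linear scan of 'dispatched' per element
        return remaining;
    }

    // Same result with O(1) hash lookups, O(n) overall
    static List<String> pruneWithSet(List<String> toWalk, Set<String> dispatched) {
        List<String> remaining = new ArrayList<>();
        for (String node : toWalk) {
            if (!dispatched.contains(node)) { // constant-time membership check
                remaining.add(node);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<String> toWalk = List.of("a", "b", "c");
        System.out.println(pruneWithSet(toWalk, new HashSet<>(List.of("b")))); // [a, c]
    }
}
```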
[jira] [Updated] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8319: Attachment: HIVE-8319.4.patch.txt Add configuration for custom services in hiveserver2 Key: HIVE-8319 URL: https://issues.apache.org/jira/browse/HIVE-8319 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt NO PRECOMMIT TESTS Register services to hiveserver2, for example, {noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11366) Avoid right leaning tree hashCode depth during ExprNodeDescEqualityWrapper HashMaps
[ https://issues.apache.org/jira/browse/HIVE-11366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708793#comment-14708793 ] Ashutosh Chauhan commented on HIVE-11366: - +1 Avoid right leaning tree hashCode depth during ExprNodeDescEqualityWrapper HashMaps --- Key: HIVE-11366 URL: https://issues.apache.org/jira/browse/HIVE-11366 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.1.1 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-11366.1.patch For a large sequence of AND clauses, the precedence order results in a deep unbalanced tree, e.g. (AND A (AND B (AND C (AND D E)))), which means computing the hashCode of the top-level expression traverses deep into the tree. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
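To make the shape of the problem concrete, here is a hypothetical expression class (not Hive's ExprNodeDesc) showing how a right-leaning chain of ANDs grows recursion depth linearly with the number of conjuncts, while flattening the same conjuncts under one n-ary AND keeps the depth constant:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class AndTreeSketch {
    static final class Expr {
        final String op;
        final List<Expr> children;
        Expr(String op, List<Expr> children) { this.op = op; this.children = children; }
        // Recurses into every child, so cost tracks tree depth
        @Override public int hashCode() { return Objects.hash(op, children); }
        static int depth(Expr e) {
            int d = 0;
            for (Expr c : e.children) d = Math.max(d, depth(c));
            return d + 1;
        }
    }

    // (AND leaf0 (AND leaf1 (AND leaf2 ...))) -- depth grows with n
    static Expr rightLeaning(int n) {
        Expr e = new Expr("leaf" + (n - 1), List.of());
        for (int i = n - 2; i >= 0; i--) {
            e = new Expr("AND", List.of(new Expr("leaf" + i, List.of()), e));
        }
        return e;
    }

    // (AND leaf0 leaf1 leaf2 ...) -- depth stays constant
    static Expr flattened(int n) {
        List<Expr> leaves = new ArrayList<>();
        for (int i = 0; i < n; i++) leaves.add(new Expr("leaf" + i, List.of()));
        return new Expr("AND", leaves);
    }

    public static void main(String[] args) {
        System.out.println(Expr.depth(rightLeaning(100)) + " vs " + Expr.depth(flattened(100))); // 100 vs 2
    }
}
```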
[jira] [Updated] (HIVE-11450) Resources are not cleaned up properly at multiple places
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-11450: --- Attachment: HIVE-11450.4.patch Resources are not cleaned up properly at multiple places Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.2.patch, HIVE-11450.3.patch, HIVE-11450.4.patch, HIVE-11450.patch I noticed that various resources aren't properly cleaned up in various classes. To be specific, * Some streams aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}} * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}} * {{Statement}} and {{ResultSet}} aren't properly cleaned up in {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator
[ https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708823#comment-14708823 ] Hive QA commented on HIVE-11623: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751963/HIVE-11623.02.patch {color:green}SUCCESS:{color} +1 9377 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5048/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5048/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5048/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12751963 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator - Key: HIVE-11623 URL: https://issues.apache.org/jira/browse/HIVE-11623 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11623.01.patch, HIVE-11623.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Jia reassigned HIVE-11624: - Assignee: Ke Jia Beeline-cli: support hive.cli.print.header in new CLI - Key: HIVE-11624 URL: https://issues.apache.org/jira/browse/HIVE-11624 Project: Hive Issue Type: Bug Reporter: Ke Jia Assignee: Ke Jia In the old CLI, hive.cli.print.header from the Hive configuration controls whether headers are printed when executing a script. We need to support this configuration using the Beeline functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708669#comment-14708669 ] Navis commented on HIVE-8319: - [~thejas] Are you still interested in getting this into Hive? Add configuration for custom services in hiveserver2 Key: HIVE-8319 URL: https://issues.apache.org/jira/browse/HIVE-8319 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8319.1.patch.txt, HIVE-8319.2.patch.txt, HIVE-8319.3.patch.txt, HIVE-8319.4.patch.txt NO PRECOMMIT TESTS Register services to hiveserver2, for example, {noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Jia updated HIVE-11624: -- Assignee: (was: Ke Jia) Beeline-cli: support hive.cli.print.header in new CLI - Key: HIVE-11624 URL: https://issues.apache.org/jira/browse/HIVE-11624 Project: Hive Issue Type: Bug Reporter: Ke Jia In the old CLI, hive.cli.print.header from the Hive configuration controls whether headers are printed when executing a script. We need to support this configuration using the Beeline functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11623) CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator
[ https://issues.apache.org/jira/browse/HIVE-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11623: --- Attachment: HIVE-11623.02.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path): fix the tableAlias for PTF operator - Key: HIVE-11623 URL: https://issues.apache.org/jira/browse/HIVE-11623 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11623.01.patch, HIVE-11623.02.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11616) DelegationTokenSecretManager reuses the same ObjectStore, which has a concurrency issue
[ https://issues.apache.org/jira/browse/HIVE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Fu updated HIVE-11616: --- Assignee: (was: Cody Fu) DelegationTokenSecretManager reuses the same ObjectStore, which has a concurrency issue -- Key: HIVE-11616 URL: https://issues.apache.org/jira/browse/HIVE-11616 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.1 Reporter: wangwenli Original Estimate: 12h Remaining Estimate: 12h Sometimes the metastore log shows the exception below. After analysis, we found that when HiveMetaStore starts, the DelegationTokenSecretManager keeps using the same ObjectStore, see saslServer.startDelegationTokenSecretManager(conf, *baseHandler.getMS()*, ServerMode.METASTORE); this leads to the concurrency issue. 2015-08-18 20:59:10,520 | ERROR | pool-6-thread-200 | Error occurred during processing of message. | org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:296) org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
Transaction has already started at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:154) at org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:88) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:112) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:565) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:596) at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589) at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244) at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283) at org.apache.thrift.transport.HiveTSaslServerTransport.open(HiveTSaslServerTransport.java:133) at org.apache.thrift.transport.HiveTSaslServerTransport$Factory.getTransport(HiveTSaslServerTransport.java:261) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1652) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started at org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) at org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) at org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:420) at org.apache.hadoop.hive.metastore.ObjectStore.getToken(ObjectStore.java:6455) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) at com.sun.proxy.$Proxy4.getToken(Unknown Source) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
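The failure mode can be sketched as follows. This is a simplified model, not Hive's DelegationTokenSecretManager or ObjectStore: a store whose transactions are not re-entrant throws exactly this "Transaction has already started" error when two callers share one instance, and keeping one instance per thread avoids the shared state.

```java
public class TokenStoreSketch {
    // Stand-in for a non-re-entrant transactional store (like ObjectStore here)
    public static class SingleTxnStore {
        private boolean txnOpen = false;
        public synchronized void openTransaction() {
            if (txnOpen) throw new IllegalStateException("Transaction has already started");
            txnOpen = true;
        }
        public synchronized void commitTransaction() { txnOpen = false; }
    }

    // One store per handler thread instead of one shared instance
    static final ThreadLocal<SingleTxnStore> STORE =
        ThreadLocal.withInitial(SingleTxnStore::new);

    static void getTokenSafely() {
        SingleTxnStore s = STORE.get(); // never shared across threads
        s.openTransaction();
        try {
            // ... read the delegation token ...
        } finally {
            s.commitTransaction();
        }
    }

    public static void main(String[] args) {
        SingleTxnStore shared = new SingleTxnStore();
        shared.openTransaction(); // first caller
        try {
            shared.openTransaction(); // second caller on the shared instance
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Transaction has already started
        }
    }
}
```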
[jira] [Assigned] (HIVE-11616) DelegationTokenSecretManager reuses the same ObjectStore, which has a concurrency issue
[ https://issues.apache.org/jira/browse/HIVE-11616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cody Fu reassigned HIVE-11616: -- Assignee: Cody Fu DelegationTokenSecretManager reuses the same ObjectStore, which has a concurrency issue -- Key: HIVE-11616 URL: https://issues.apache.org/jira/browse/HIVE-11616 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.1 Reporter: wangwenli Assignee: Cody Fu Original Estimate: 12h Remaining Estimate: 12h Sometimes the metastore log shows the exception below. After analysis, we found that when HiveMetaStore starts, the DelegationTokenSecretManager keeps using the same ObjectStore, see saslServer.startDelegationTokenSecretManager(conf, *baseHandler.getMS()*, ServerMode.METASTORE); this leads to the concurrency issue. 2015-08-18 20:59:10,520 | ERROR | pool-6-thread-200 | Error occurred during processing of message. | org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:296) org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: org.datanucleus.transaction.NucleusTransactionException: Invalid state. 
Transaction has already started at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:154) at org.apache.hadoop.hive.thrift.DBTokenStore.getToken(DBTokenStore.java:88) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:112) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:565) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:596) at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589) at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244) at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283) at org.apache.thrift.transport.HiveTSaslServerTransport.open(HiveTSaslServerTransport.java:133) at org.apache.thrift.transport.HiveTSaslServerTransport$Factory.getTransport(HiveTSaslServerTransport.java:261) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1652) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.datanucleus.transaction.NucleusTransactionException: Invalid state. Transaction has already started at org.datanucleus.transaction.TransactionManager.begin(TransactionManager.java:47) at org.datanucleus.TransactionImpl.begin(TransactionImpl.java:131) at org.datanucleus.api.jdo.JDOTransaction.internalBegin(JDOTransaction.java:88) at org.datanucleus.api.jdo.JDOTransaction.begin(JDOTransaction.java:80) at org.apache.hadoop.hive.metastore.ObjectStore.openTransaction(ObjectStore.java:420) at org.apache.hadoop.hive.metastore.ObjectStore.getToken(ObjectStore.java:6455) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) at com.sun.proxy.$Proxy4.getToken(Unknown Source) at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
[jira] [Commented] (HIVE-11469) InstanceCache does not have proper implementation of equals or hashcode
[ https://issues.apache.org/jira/browse/HIVE-11469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708749#comment-14708749 ] Ashutosh Chauhan commented on HIVE-11469: - +1 [~swarnim] Can you also update title of this jira to reflect changes in the patch? InstanceCache does not have proper implementation of equals or hashcode --- Key: HIVE-11469 URL: https://issues.apache.org/jira/browse/HIVE-11469 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-11469.1.patch.txt With HIVE-11288, we started using InstanceCache as a key. However it doesn't seem like the class actually implements the equals or hashcode methods which can potentially lead to inaccurate results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
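As an illustration of the risk (this is not Hive's InstanceCache, just a minimal model): an object used as a HashMap key without value-based equals()/hashCode() falls back to identity semantics, so a logically equal key misses the cached entry.

```java
import java.util.HashMap;
import java.util.Map;

public class KeySketch {
    // Inherits Object's identity-based equals/hashCode
    public static final class IdKey {
        final String name;
        public IdKey(String name) { this.name = name; }
    }

    // Value-based equality: two keys with the same name are interchangeable
    public static final class ValueKey {
        final String name;
        public ValueKey(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof ValueKey && ((ValueKey) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    public static void main(String[] args) {
        Map<IdKey, String> identityKeyed = new HashMap<>();
        identityKeyed.put(new IdKey("a"), "cached");
        System.out.println(identityKeyed.containsKey(new IdKey("a"))); // false: lookup misses

        Map<ValueKey, String> valueKeyed = new HashMap<>();
        valueKeyed.put(new ValueKey("a"), "cached");
        System.out.println(valueKeyed.containsKey(new ValueKey("a"))); // true: lookup hits
    }
}
```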
[jira] [Commented] (HIVE-11472) ORC StringDirectTreeReader is thrashing the GC due to byte[] allocation per row
[ https://issues.apache.org/jira/browse/HIVE-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708781#comment-14708781 ] Ashutosh Chauhan commented on HIVE-11472: - +1 Patch looks good. I wonder whether we should just bump the hadoop version for this (we are already at 2.6.0) instead of shimming. ORC StringDirectTreeReader is thrashing the GC due to byte[] allocation per row --- Key: HIVE-11472 URL: https://issues.apache.org/jira/browse/HIVE-11472 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Labels: Performance Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11472.1.patch, HIVE-11472.2.patch For every row x column {code}
int len = (int) lengths.next();
int offset = 0;
byte[] bytes = new byte[len];
while (len > 0) {
  int written = stream.read(bytes, offset, len);
  if (written < 0) {
    throw new EOFException("Can't finish byte read from " + stream);
  }
{code} https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/TreeReaderFactory.java#L1552 This is not a big issue until it misses the GC TLAB. From hadoop-2.6.x (HADOOP-10855) you can read into a Text directly. Possibly we can create a different TreeReader from the factory for 2.6.x that uses a DataInputStream per stream and prevents an allocation in the inner loop. {code} int len = (int) lengths.next(); result.readWithKnownLength(datastream, len); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
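A sketch of the allocation fix being discussed, using only the JDK. The real patch reads into a hadoop-2.6 Text via readWithKnownLength; the grow-once scratch buffer below is an assumption that illustrates the same reuse idea without the per-row byte[] allocation.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class OrcReadSketch {
    static byte[] scratch = new byte[0]; // reused across rows
    static int used;                     // valid length inside scratch

    // Old pattern: one byte[] allocation per row (GC pressure)
    static byte[] readAllocating(InputStream in, int len) {
        byte[] bytes = new byte[len];
        fill(in, bytes, len);
        return bytes;
    }

    // New pattern: buffer grows once, then is reused
    static void readReusing(InputStream in, int len) {
        if (scratch.length < len) scratch = new byte[len]; // amortized growth
        fill(in, scratch, len);
        used = len;
    }

    private static void fill(InputStream in, byte[] dst, int len) {
        int off = 0;
        try {
            while (off < len) {
                int n = in.read(dst, off, len - off);
                if (n < 0) throw new EOFException("Can't finish byte read");
                off += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        readReusing(new ByteArrayInputStream("hello".getBytes()), 5);
        System.out.println(new String(scratch, 0, used)); // hello
    }
}
```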
[jira] [Commented] (HIVE-11241) Database prefix does not work properly if table has same name
[ https://issues.apache.org/jira/browse/HIVE-11241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708808#comment-14708808 ] Ashutosh Chauhan commented on HIVE-11241: - I am also of the opinion that we should fix this in master by disambiguating the grammar. The current design of guessing what the user wants in a describe statement is confusing. Database prefix does not work properly if table has same name - Key: HIVE-11241 URL: https://issues.apache.org/jira/browse/HIVE-11241 Project: Hive Issue Type: Bug Components: Database/Schema Reporter: Johndee Burks Assignee: Ferdinand Xu Attachments: HIVE-11241.patch If you do the following it will fail: {code}
0: jdbc:hive2://cdh54-1.test.com:1/default> create database test4;
No rows affected (0.881 seconds)
0: jdbc:hive2://cdh54-1.test.com:1/default> use test4;
No rows affected (0.1 seconds)
0: jdbc:hive2://cdh54-1.test.com:1/default> create table test4 (c1 char(200));
No rows affected (0.306 seconds)
0: jdbc:hive2://cdh54-1.test.com:1/default> desc test4.test4;
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. cannot find field test4 from [0:c1] (state=08S01,code=1)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11450) Resources are not cleaned up properly at multiple places
[ https://issues.apache.org/jira/browse/HIVE-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708811#comment-14708811 ] Nezih Yigitbasi commented on HIVE-11450: [~ashutoshc] Rebased and uploaded a new one, can you please retry? Resources are not cleaned up properly at multiple places Key: HIVE-11450 URL: https://issues.apache.org/jira/browse/HIVE-11450 Project: Hive Issue Type: Bug Components: JDBC Reporter: Nezih Yigitbasi Assignee: Nezih Yigitbasi Attachments: HIVE-11450.2.patch, HIVE-11450.3.patch, HIVE-11450.4.patch, HIVE-11450.patch I noticed that various resources aren't properly cleaned up in various classes. To be specific, * Some streams aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/BeeLine.java}} and {{beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java}} * {{Statement}}, {{ResultSet}}, and {{Connection}} aren't properly cleaned up in {{beeline/src/java/org/apache/hive/beeline/HiveSchemaTool.java}} * {{Statement}} and {{ResultSet}} aren't properly cleaned up in {{jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]
[ https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11624: Summary: Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch] (was: Beeline-cli: support hive.cli.print.header in new CLI) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch] - Key: HIVE-11624 URL: https://issues.apache.org/jira/browse/HIVE-11624 Project: Hive Issue Type: Bug Reporter: Ke Jia Assignee: Ke Jia In the old CLI, hive.cli.print.header from the Hive configuration controls whether headers are printed when executing a script. We need to support this configuration using the Beeline functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch]
[ https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11624: Issue Type: Sub-task (was: Bug) Parent: HIVE-10511 Beeline-cli: support hive.cli.print.header in new CLI[beeline-cli branch] - Key: HIVE-11624 URL: https://issues.apache.org/jira/browse/HIVE-11624 Project: Hive Issue Type: Sub-task Reporter: Ke Jia Assignee: Ke Jia In the old CLI, hive.cli.print.header from the Hive configuration controls whether headers are printed when executing a script. We need to support this configuration using the Beeline functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11624) Beeline-cli: support hive.cli.print.header in new CLI
[ https://issues.apache.org/jira/browse/HIVE-11624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11624: Assignee: Ke Jia Beeline-cli: support hive.cli.print.header in new CLI - Key: HIVE-11624 URL: https://issues.apache.org/jira/browse/HIVE-11624 Project: Hive Issue Type: Bug Reporter: Ke Jia Assignee: Ke Jia In the old CLI, hive.cli.print.header from the Hive configuration controls whether headers are printed when executing a script. We need to support this configuration using the Beeline functionality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11544) LazyInteger should avoid throwing NumberFormatException
[ https://issues.apache.org/jira/browse/HIVE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708754#comment-14708754 ] Ashutosh Chauhan commented on HIVE-11544: - [~gopalv] Won't this extra check slow down the common case where bytes are indeed valid? LazyInteger should avoid throwing NumberFormatException --- Key: HIVE-11544 URL: https://issues.apache.org/jira/browse/HIVE-11544 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.14.0, 1.2.0, 1.3.0, 2.0.0 Reporter: William Slacum Assignee: Gopal V Priority: Minor Labels: Performance Attachments: HIVE-11544.1.patch {{LazyInteger#parseInt}} will throw a {{NumberFormatException}} under these conditions:
# bytes are null
# radix is invalid
# length is 0
# the string is '+' or '-'
# {{LazyInteger#parse}} throws a {{NumberFormatException}}
Most of the time, such as in {{LazyInteger#init}} and {{LazyByte#init}}, the exception is caught, swallowed, and {{isNull}} is set to {{true}}. This is generally a bad workflow, as exception creation is a performance bottleneck, and repeating it for many rows in a query can have a drastic performance consequence. It would be better if this method returned an {{OptionalInteger}}, which would provide similar functionality with a higher throughput rate. I've tested against 0.14.0, and saw that the logic is unchanged in 1.2.0, so I've marked those as affected. Any version in between would also suffer from this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
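A sketch of the exception-free parse the description proposes. The name tryParseInt and the use of the JDK's java.util.OptionalInt (rather than the "OptionalInteger" named above) are assumptions, and overflow checking is omitted for brevity; the point is that every invalid-input condition returns empty instead of paying for a thrown NumberFormatException.

```java
import java.util.OptionalInt;

public class LazyParseSketch {
    static OptionalInt tryParseInt(byte[] bytes, int start, int length) {
        if (bytes == null || length <= 0) return OptionalInt.empty(); // null or empty input
        int i = start, end = start + length;
        boolean negative = false;
        if (bytes[i] == '-' || bytes[i] == '+') {
            negative = bytes[i] == '-';
            if (++i == end) return OptionalInt.empty(); // bare '+' or '-'
        }
        int value = 0;
        for (; i < end; i++) {
            int d = bytes[i] - '0';
            if (d < 0 || d > 9) return OptionalInt.empty(); // non-digit byte
            value = value * 10 + d; // overflow unchecked in this sketch
        }
        return OptionalInt.of(negative ? -value : value);
    }

    public static void main(String[] args) {
        System.out.println(tryParseInt("123".getBytes(), 0, 3).getAsInt()); // 123
        System.out.println(tryParseInt("+".getBytes(), 0, 1).isPresent());  // false
    }
}
```

A caller like LazyInteger#init could then set isNull from isPresent() on the hot path, with no stack-trace construction for malformed rows.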