[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203332#comment-16203332 ] Rui Li commented on HIVE-15104: --- [~xuefuz], we need to locate the jar on Hive side, before w

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201695#comment-16201695 ] Rui Li commented on HIVE-15104: --- One correction: the {{NoClassDefFoundError}} is for {{com.

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16201405#comment-16201405 ] Rui Li commented on HIVE-15104: --- Hi [~xuefuz], sorry for taking so long to update. I tried o

[jira] [Commented] (HIVE-16395) ConcurrentModificationException on config object in HoS

2017-10-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198558#comment-16198558 ] Rui Li commented on HIVE-16395: --- Hi [~asherman], sorry for the late response, just returned

[jira] [Commented] (HIVE-15860) RemoteSparkJobMonitor may hang when RemoteDriver exits abnormally

2017-10-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198434#comment-16198434 ] Rui Li commented on HIVE-15860: --- Hi [~stakiar], I agree it's good to make QUEUED/SENT fail f

[jira] [Commented] (HIVE-13843) Re-enable the HoS tests disabled in HIVE-13402

2017-10-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187310#comment-16187310 ] Rui Li commented on HIVE-13843: --- [~stakiar], sorry about the delay. The patch LGTM, +1. I th

[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183483#comment-16183483 ] Rui Li commented on HIVE-17545: --- [~stakiar] I think so. We actually have {{SplitSparkWorkRes

[jira] [Commented] (HIVE-13843) Re-enable the HoS tests disabled in HIVE-13402

2017-09-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183480#comment-16183480 ] Rui Li commented on HIVE-13843: --- Thanks [~stakiar] for offering the help. > Re-enable the H

[jira] [Commented] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-26 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181921#comment-16181921 ] Rui Li commented on HIVE-17586: --- [~xuefuz], the {{allowCoreThreadTimeOut}} controls whether

[jira] [Commented] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-26 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181898#comment-16181898 ] Rui Li commented on HIVE-17586: --- Hi [~xuefuz], we're setting {{allowCoreThreadTimeOut(true)}

[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180152#comment-16180152 ] Rui Li commented on HIVE-17545: --- [~kellyzly], if you apply two different transformations to

[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180111#comment-16180111 ] Rui Li commented on HIVE-17545: --- Hi [~stakiar], since we have a switch to turn off combing e

[jira] [Updated] (HIVE-17554) Occurr java.lang.ArithmeticException: / by zero at hplsql component

2017-09-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17554: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Commented] (HIVE-17554) Occurr java.lang.ArithmeticException: / by zero at hplsql component

2017-09-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174142#comment-16174142 ] Rui Li commented on HIVE-17554: --- +1 > Occurr java.lang.ArithmeticException: / by zero at hp

[jira] [Commented] (HIVE-17554) Occurr java.lang.ArithmeticException: / by zero at hplsql component

2017-09-19 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172653#comment-16172653 ] Rui Li commented on HIVE-17554: --- Hi [~linzhangbing], I think it'd be better to use a double

[jira] [Commented] (HIVE-17549) Use SHA-256 for RowContainer to improve security

2017-09-19 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172650#comment-16172650 ] Rui Li commented on HIVE-17549: --- Thanks for the explanations [~txhsj]. It seems row containe

[jira] [Commented] (HIVE-17474) Poor Performance about subquery like DS/query70 on HoS

2017-09-19 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171238#comment-16171238 ] Rui Li commented on HIVE-17474: --- Thanks [~kellyzly] for the update. Good to know we can get

[jira] [Commented] (HIVE-17549) Use SHA-256 for RowContainer to improve security

2017-09-19 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171233#comment-16171233 ] Rui Li commented on HIVE-17549: --- Hi [~txhsj], could you explain how the hash is used here an

[jira] [Commented] (HIVE-17542) Make HoS CombineEquivalentWorkResolver Configurable

2017-09-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171000#comment-16171000 ] Rui Li commented on HIVE-17542: --- Thanks [~stakiar] for the work. +1 > Make HoS CombineEquiv

[jira] [Commented] (HIVE-17474) Poor Performance about subquery like DS/query70 on HoS

2017-09-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164235#comment-16164235 ] Rui Li commented on HIVE-17474: --- [~kellyzly], skew is detected by simply counting the size o

[jira] [Commented] (HIVE-17474) Poor Performance about subquery like DS/query70 on HoS

2017-09-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16164056#comment-16164056 ] Rui Li commented on HIVE-17474: --- [~kellyzly], I think CommonMergeJoinOperator is specific to

[jira] [Commented] (HIVE-17414) HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver

2017-09-04 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16152286#comment-16152286 ] Rui Li commented on HIVE-17414: --- +1, thanks for the update. And I suppose you meant the 4th

[jira] [Commented] (HIVE-17414) HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver

2017-09-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150357#comment-16150357 ] Rui Li commented on HIVE-17414: --- +1 > HoS DPP + Vectorization generates invalid explain pla

[jira] [Commented] (HIVE-17405) HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT

2017-09-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150356#comment-16150356 ] Rui Li commented on HIVE-17405: --- +1 > HoS DPP ConstantPropagate should use ConstantPropagat

[jira] [Commented] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-09-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150347#comment-16150347 ] Rui Li commented on HIVE-17383: --- [~kellyzly], I mean the failures with age 1. Can you reprod

[jira] [Commented] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-09-01 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150314#comment-16150314 ] Rui Li commented on HIVE-17383: --- The failures can't be reproduced locally. [~mmccline], [~as

[jira] [Commented] (HIVE-17405) HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149926#comment-16149926 ] Rui Li commented on HIVE-17405: --- [~kellyzly], I see, thanks for the explanations. Then let's

[jira] [Commented] (HIVE-17414) HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149919#comment-16149919 ] Rui Li commented on HIVE-17414: --- Thanks [~kellyzly] for the update. I don't think the commen

[jira] [Commented] (HIVE-17405) HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149916#comment-16149916 ] Rui Li commented on HIVE-17405: --- Hi [~stakiar], could you explain why you move the constant

[jira] [Updated] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17383: -- Status: Patch Available (was: Open) > ArrayIndexOutOfBoundsException in VectorGroupByOperator > ---

[jira] [Updated] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17383: -- Attachment: HIVE-17383.1.patch Patch v1 uses the key index (instead of the vector expression's column index) fo

[jira] [Commented] (HIVE-17414) HoS DPP + Vectorization generates invalid explain plan due to CombineEquivalentWorkResolver

2017-08-31 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148579#comment-16148579 ] Rui Li commented on HIVE-17414: --- [~kellyzly], I think another way is to call SparkUtilities#

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-30 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148485#comment-16148485 ] Rui Li commented on HIVE-15104: --- [~xuefuz], I'll try if that's feasible. Do you think it's O

[jira] [Assigned] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-08-30 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li reassigned HIVE-17383: - Assignee: Rui Li > ArrayIndexOutOfBoundsException in VectorGroupByOperator >

[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2017-08-30 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148344#comment-16148344 ] Rui Li commented on HIVE-17193: --- Yes I think so. We don't consider DPP when we combine map w

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-30 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148340#comment-16148340 ] Rui Li commented on HIVE-15104: --- [~xuefuz], my previous [comment|https://issues.apache.org/

[jira] [Commented] (HIVE-17405) HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT

2017-08-29 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146616#comment-16146616 ] Rui Li commented on HIVE-17405: --- [~stakiar], I think the root cause is the vector GBY has so

[jira] [Commented] (HIVE-17405) HoS DPP ConstantPropagate should use ConstantPropagateOption.SHORTCUT

2017-08-29 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146478#comment-16146478 ] Rui Li commented on HIVE-17405: --- It seems this also fixes spark_vectorized_dynamic_partition

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-28 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143455#comment-16143455 ] Rui Li commented on HIVE-16823: --- [~kellyzly], the v1 patch doesn't fix the root cause of the

[jira] [Commented] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-08-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143449#comment-16143449 ] Rui Li commented on HIVE-17383: --- [~kellyzly], I don't see the works shown as vectorized in y

[jira] [Commented] (HIVE-17383) ArrayIndexOutOfBoundsException in VectorGroupByOperator

2017-08-27 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143413#comment-16143413 ] Rui Li commented on HIVE-17383: --- [~kellyzly], I can reproduce the issue with latest master.

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-24 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.5.patch > Hive on Spark generate more shuffle data than hive on mr >

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-24 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139902#comment-16139902 ] Rui Li commented on HIVE-16823: --- I have a simpler query to reproduce the issue and it happen

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-24 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139686#comment-16139686 ] Rui Li commented on HIVE-16823: --- [~kellyzly], the operator tree is different between spark a

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-23 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139582#comment-16139582 ] Rui Li commented on HIVE-16823: --- [~kellyzly] I think that's because when CBO is on, Constant

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-23 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139547#comment-16139547 ] Rui Li commented on HIVE-16823: --- Here's what I found so far. When we create vector GBY, the

[jira] [Commented] (HIVE-16823) "ArrayIndexOutOfBoundsException" in spark_vectorized_dynamic_partition_pruning.q

2017-08-23 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138093#comment-16138093 ] Rui Li commented on HIVE-16823: --- [~kellyzly], thanks for working on this. I think the root c

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-21 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.5.patch Run tests with the switch on. > Hive on Spark generate more shuffle data than hi

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134639#comment-16134639 ] Rui Li commented on HIVE-15104: --- Thanks [~xuefuz] and take your time. I guess we can also ru

[jira] [Updated] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17292: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16132057#comment-16132057 ] Rui Li commented on HIVE-17292: --- +1 > Change TestMiniSparkOnYarnCliDriver test configuratio

[jira] [Updated] (HIVE-16948) Invalid explain when running dynamic partition pruning query in Hive On Spark

2017-08-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-16948: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Commented] (HIVE-16948) Invalid explain when running dynamic partition pruning query in Hive On Spark

2017-08-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131834#comment-16131834 ] Rui Li commented on HIVE-16948: --- +1. Thanks for the update [~kellyzly] > Invalid explain wh

[jira] [Updated] (HIVE-17347) TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning_mapjoin_only] is failing every time

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17347: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Updated] (HIVE-17346) TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning] is failing every time

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17346: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Comment Edited] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131618#comment-16131618 ] Rui Li edited comment on HIVE-17292 at 8/18/17 2:22 AM: Patch LGTM

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131618#comment-16131618 ] Rui Li commented on HIVE-17292: --- Patch LGTM. Let's wait a little bit to get HIVE-17346 and HI

[jira] [Commented] (HIVE-17347) TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning_mapjoin_only] is failing every time

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131614#comment-16131614 ] Rui Li commented on HIVE-17347: --- +1 > TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_

[jira] [Commented] (HIVE-17346) TestMiniSparkOnYarnCliDriver[spark_dynamic_partition_pruning] is failing every time

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16131613#comment-16131613 ] Rui Li commented on HIVE-17346: --- Thanks [~pvary] for working on this. +1 > TestMiniSparkOnY

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Commented] (HIVE-16948) Invalid explain when running dynamic partition pruning query in Hive On Spark

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16130024#comment-16130024 ] Rui Li commented on HIVE-16948: --- [~kellyzly], sorry about the delay. I've left some comments

[jira] [Commented] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129982#comment-16129982 ] Rui Li commented on HIVE-17321: --- [~kellyzly], yes they're all related to orc tables. > HoS:

[jira] [Commented] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129891#comment-16129891 ] Rui Li commented on HIVE-17321: --- Latest failures are not related. Changes to the golden file

[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129871#comment-16129871 ] Rui Li commented on HIVE-15104: --- Hi [~xuefuz], with HIVE-17114 and HIVE-17321 the benchmark

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129807#comment-16129807 ] Rui Li commented on HIVE-17292: --- Hi [~pvary], as Xuefu agrees, let's only fix the yarn tests

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Attachment: HIVE-17321.2.patch Update golden files > HoS: analyze ORC table doesn't compute raw data size when

[jira] [Commented] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128407#comment-16128407 ] Rui Li commented on HIVE-17321: --- [~kellyzly], w/o the patch, analyze table w/o noscan/partia

[jira] [Commented] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128332#comment-16128332 ] Rui Li commented on HIVE-17321: --- [~kellyzly], the problem is if you run analyze table w/o no

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128310#comment-16128310 ] Rui Li commented on HIVE-17292: --- I'm not sure if it's worth the efforts to update the golden

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Status: Patch Available (was: Open) > HoS: analyze ORC table doesn't compute raw data size when noscan/partials

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Attachment: HIVE-17321.1.patch > HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan >

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Priority: Minor (was: Major) > HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan >

[jira] [Updated] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17321: -- Description: Need to implement HIVE-9560 for Spark. > HoS: analyze ORC table doesn't compute raw data size when

[jira] [Assigned] (HIVE-17321) HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan is not specified

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li reassigned HIVE-17321: - > HoS: analyze ORC table doesn't compute raw data size when noscan/partialscan > is not specified >

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126973#comment-16126973 ] Rui Li commented on HIVE-17287: --- [~kellyzly], group by w/ rollup and group by w/o rollup are

[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126738#comment-16126738 ] Rui Li commented on HIVE-17291: --- [~pvary], bq. Logic suggests, that in this case we will req

[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-14 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125508#comment-16125508 ] Rui Li commented on HIVE-17291: --- [~pvary], thanks so much for working in the middle of the n

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-14 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125371#comment-16125371 ] Rui Li commented on HIVE-17287: --- [~kellyzly], disabling {{spark.shuffle.reduceLocality.enabl

[jira] [Comment Edited] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125157#comment-16125157 ] Rui Li edited comment on HIVE-17292 at 8/14/17 2:40 AM: [~pvary],

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125157#comment-16125157 ] Rui Li commented on HIVE-17292: --- [~pvary], thanks for the explanation. Would you mind set th

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125156#comment-16125156 ] Rui Li commented on HIVE-17287: --- The groupByKey shuffle uses unbounded memory. You can set

[jira] [Commented] (HIVE-17291) Set the number of executors based on config if client does not provide information

2017-08-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125137#comment-16125137 ] Rui Li commented on HIVE-17291: --- Is this just to avoid unstable test output? If so, it seems

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-13 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125130#comment-16125130 ] Rui Li commented on HIVE-17287: --- Have you tried {{hive.spark.use.groupby.shuffle}}? I think

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123544#comment-16123544 ] Rui Li commented on HIVE-17292: --- I mean we did set {{RM_SCHEDULER_MINIMUM_ALLOCATION_MB}} to

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123204#comment-16123204 ] Rui Li commented on HIVE-17287: --- [~kellyzly], I mean you can check each of the group keys to

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123036#comment-16123036 ] Rui Li commented on HIVE-17287: --- OK that seems a skewed shuffle to me. You can run some stat

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122994#comment-16122994 ] Rui Li commented on HIVE-17287: --- To determine whether a shuffle is skewed, you need to look

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122984#comment-16122984 ] Rui Li commented on HIVE-17287: --- That config is enabled by default in 2.0: https://github.co

[jira] [Commented] (HIVE-17292) Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores

2017-08-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122933#comment-16122933 ] Rui Li commented on HIVE-17292: --- {{spark_vectorized_dynamic_partition_pruning}} doesn't work

[jira] [Commented] (HIVE-17287) HoS can not deal with skewed data group by

2017-08-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122929#comment-16122929 ] Rui Li commented on HIVE-17287: --- Hi [~kellyzly], I'm trying to understand how the group by i

[jira] [Updated] (HIVE-16945) Add method to compare Operators

2017-08-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-16945: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Test failures

[jira] [Commented] (HIVE-17247) HoS DPP: UDFs on the partition column side does not evaluate correctly

2017-08-10 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121180#comment-16121180 ] Rui Li commented on HIVE-17247: --- I think {{spark_dynamic_partition_pruning_mapjoin_only}} ne

[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-09 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121020#comment-16121020 ] Rui Li commented on HIVE-17270: --- Hi [~pvary], I don't understand why 1 NM means we can only

[jira] [Updated] (HIVE-16945) Add method to compare Operators

2017-08-09 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-16945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-16945: -- Attachment: HIVE-16945.3.patch [~jcamachorodriguez], nice catch. Update to address the comment. > Add method to

[jira] [Commented] (HIVE-17270) Qtest results show wrong number of executors

2017-08-08 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119244#comment-16119244 ] Rui Li commented on HIVE-17270: --- When automatically deciding numReducers, it should be no le

[jira] [Commented] (HIVE-17247) HoS DPP: UDFs on the partition column side does not evaluate correctly

2017-08-06 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116037#comment-16116037 ] Rui Li commented on HIVE-17247: --- +1 > HoS DPP: UDFs on the partition column side does not e

[jira] [Updated] (HIVE-17177) move TestSuite.java to the right position

2017-08-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17177: -- Affects Version/s: (was: 3.0.0) > move TestSuite.java to the right position > --

[jira] [Updated] (HIVE-17177) move TestSuite.java to the right position

2017-08-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17177: -- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks Saijin. > move Test

[jira] [Updated] (HIVE-17176) Add ASF header for LlapAllocatorBuffer.java

2017-08-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17176: -- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks Saijin. > Add ASF h

[jira] [Updated] (HIVE-17176) Add ASF header for LlapAllocatorBuffer.java

2017-08-02 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-17176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-17176: -- Affects Version/s: (was: 3.0.0) > Add ASF header for LlapAllocatorBuffer.java >
