[jira] [Commented] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309260#comment-16309260
 ] 

Hive QA commented on HIVE-18361:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
17s{color} | {color:red} common: The patch generated 2 new + 931 unchanged - 0 
fixed = 933 total (was 931) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
29s{color} | {color:red} ql: The patch generated 23 new + 45 unchanged - 0 
fixed = 68 total (was 45) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8413/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8413/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8413/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Extend shared work optimizer to reuse computation beyond work boundaries
> -------------------------------------------------------------------------
>
> Key: HIVE-18361
> URL: https://issues.apache.org/jira/browse/HIVE-18361
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Attachments: HIVE-18361.patch
>
>
> Follow-up of the work in HIVE-16867.
> HIVE-16867 introduced an optimization that identifies scans on input tables 
> that can be merged and reuses the computation that is done in the work 
> containing those scans. In particular, we traverse both parts of the plan 
> upstream and reuse the operators if possible.
> Currently, the optimizer will not go beyond the output edge(s) of that work. 
> This extension removes that limitation.
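
To picture the base optimization (a toy sketch under simplified assumptions, not Hive's actual SharedWorkOptimizer; Op and mergeChains are made-up names): starting from two equivalent scans, operators are reused while the two chains stay equivalent, and the remainder of the duplicate branch is hung under the last shared operator.

{code}
import java.util.ArrayList;
import java.util.List;

// Toy operator node; "desc" stands in for the operator's full signature
// (table, filter, projected columns, ...).
class Op {
  final String desc;
  final List<Op> children = new ArrayList<>();
  Op(String desc) { this.desc = desc; }
}

class ToySharedWork {
  // Walk both chains from two equivalent scans, reusing operators while
  // they match; at the first divergence, attach the duplicate's remaining
  // consumers to the last shared operator so its computation runs once.
  static void mergeChains(Op keep, Op dup) {
    while (keep.children.size() == 1 && dup.children.size() == 1
        && keep.children.get(0).desc.equals(dup.children.get(0).desc)) {
      keep = keep.children.get(0);
      dup = dup.children.get(0);
    }
    keep.children.addAll(dup.children);
    dup.children.clear();
  }
}
{code}

The extension in this issue lifts the restriction that this reuse stops at the output edge(s) of the work containing the scans.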



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator

2018-01-02 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17896:
--
Attachment: HIVE-17896.5.patch

This fifth patch fixes the TestDanglingQOuts failure. I also ran the 
auto_sortmerge_join_2.q and lateral_view_ppd.q tests and they passed; their 
failures look unrelated to this patch.

> TopNKey: Create a standalone vectorizable TopNKey operator
> --
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Teddy Choi
> Attachments: HIVE-17896.1.patch, HIVE-17896.3.patch, 
> HIVE-17896.4.patch, HIVE-17896.5.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the 
> group-by operator buffers up all the rows before discarding 99% of them in 
> the TopN Hash within the ReduceSink Operator.
> The RS TopN operator is very restrictive as it only supports filtering on 
> the shuffle keys, but it is better to do this before breaking the vectors 
> into rows and losing the isRepeating properties.
> Adding a TopN Key operator in the physical operator tree allows the 
> following to happen:
> GBY->RS(Top=1)
> can become 
> TNK(1)->GBY->RS(Top=1)
> so that the TopNKey can remove rows before they are buffered into the GBY 
> and consume memory.
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the 
> GBY is on keys "a,b,c" and the TopNKey is on just "a".
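
To make the operator concrete, here is a minimal sketch (not the proposed Hive operator; the class and its names are made up for illustration): a top-n-key filter only has to track the N best distinct keys seen so far, and can drop any row whose key provably cannot survive the final Top-N, before the GBY buffers it.

{code}
import java.util.TreeSet;

class TopNKeyFilter<K extends Comparable<K>> {
  private final int n;
  private final TreeSet<K> topKeys = new TreeSet<>();

  TopNKeyFilter(int n) { this.n = n; }

  // Forward a row only if its key is among the N smallest distinct keys
  // seen so far; ties with the N-th key are kept, since they belong to a
  // group that is still alive.
  boolean accept(K key) {
    if (topKeys.size() < n) {
      topKeys.add(key);
      return true;
    }
    if (key.compareTo(topKeys.last()) > 0) {
      return false;             // worse than the current N-th key: drop early
    }
    topKeys.add(key);           // duplicate keys are no-ops in the TreeSet
    if (topKeys.size() > n) {
      topKeys.pollLast();       // evict the old N-th key
    }
    return true;
  }
}
{code}

The GBY and RS(Top=N) still enforce the final result; the filter only sheds rows early so they never consume group-by memory.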



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17929) Use sessionId for HoS Remote Driver Client id

2018-01-02 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309247#comment-16309247
 ] 

Rui Li commented on HIVE-17929:
---

+1

> Use sessionId for HoS Remote Driver Client id
> -
>
> Key: HIVE-17929
> URL: https://issues.apache.org/jira/browse/HIVE-17929
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17929.1.patch, HIVE-17929.2.patch, 
> HIVE-17929.3.patch
>
>
> Each {{SparkClientImpl}} creates a client connection using a client id. The 
> client id is created via {{UUID.randomUUID()}}.
> Since each HoS session has a single client connection we should just use the 
> sessionId instead (which is also a UUID). This should help simplify the code 
> and some of the client logging.
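
A minimal before/after sketch of the proposed simplification (names are hypothetical, not the actual SparkClientImpl code):

{code}
import java.util.UUID;

class ClientIdExample {
  // Before: every client connection mints an id unrelated to anything else.
  static String randomClientId() {
    return UUID.randomUUID().toString();
  }

  // After: the HoS session id (already a UUID) doubles as the client id, so
  // session-side and client-side log lines correlate directly.
  static String clientIdFor(UUID sessionId) {
    return sessionId.toString();
  }
}
{code}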



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18360) NPE in TezSessionState

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309243#comment-16309243
 ] 

Hive QA commented on HIVE-18360:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904315/HIVE-18360.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11146 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)


[jira] [Commented] (HIVE-18360) NPE in TezSessionState

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309209#comment-16309209
 ] 

Hive QA commented on HIVE-18360:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8412/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> NPE in TezSessionState
> --
>
> Key: HIVE-18360
> URL: https://issues.apache.org/jira/browse/HIVE-18360
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18360.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309200#comment-16309200
 ] 

Hive QA commented on HIVE-17486:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904307/HIVE-17486.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 67 failed/errored test(s), 10834 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestLocalSparkCliDriver.org.apache.hadoop.hive.cli.TestLocalSparkCliDriver
 (batchId=247)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=176)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
 (batchId=179)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_create_no_grant]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_publisher_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=105)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=107)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=108)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=109)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=110)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=112)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=114)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=117)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=121)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=123)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=125)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=126)

[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309171#comment-16309171
 ] 

Hive QA commented on HIVE-17486:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 11s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-8411/patches/PreCommit-HIVE-Build-8411.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8411/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enable SharedWorkOptimizer in tez on HOS
> -------------------------------------------------------------------------
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, HIVE-17486.5.patch, 
> explain.28.share.false, explain.28.share.true, scanshare.after.svg, 
> scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans for Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level. Hive on Spark caches the result of a spark work if that spark 
> work is used by more than 1 child spark work. Once SharedWorkOptimizer is 
> enabled in the HoS physical plan, identical table scans are merged into 1 
> table scan, and its result is used by more than 1 child spark work. Thus the 
> cache mechanism saves us from doing the same computation twice.
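
A toy sketch of that caching idea (illustrative only, not HoS internals; the class and method names are made up): a work's output is computed once and every child work that consumes it receives the cached value.

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class WorkResultCache<K, V> {
  private final Map<K, V> cache = new HashMap<>();

  // Children requesting the same parent work's output share one computation.
  V resultOf(K workId, Supplier<V> compute) {
    return cache.computeIfAbsent(workId, k -> compute.get());
  }
}
{code}

With the identical table scans merged into a single work, both child spark works resolve the same workId, so the scan runs once.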



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18221) test acid default

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309163#comment-16309163
 ] 

Hive QA commented on HIVE-18221:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904306/HIVE-18221.20.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 775 failed/errored test(s), 6901 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=1)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=2)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=4)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=40)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=50)

[jira] [Commented] (HIVE-18079) Statistics: Allow HyperLogLog to be merged to the lowest-common-denominator bit-size

2018-01-02 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309158#comment-16309158
 ] 

Gopal V commented on HIVE-18079:


With a clean run, I will disable CBO rotating that join to make the tez bucket 
mapjoin test useful again.

> Statistics: Allow HyperLogLog to be merged to the lowest-common-denominator 
> bit-size
> -------------------------------------------------------------------------
>
> Key: HIVE-18079
> URL: https://issues.apache.org/jira/browse/HIVE-18079
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-18079.1.patch, HIVE-18079.2.patch, 
> HIVE-18079.4.patch, HIVE-18079.5.patch
>
>
> HyperLogLog can merge a 14 bit HLL into a 10 bit HLL bitset, because of its 
> mathematical hash distribution & construction.
> Allow the squashing of a 14 bit HLL -> 10 bit HLL without needing a second 
> scan over the data-set.
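
For reference, a sketch of the register squashing under the usual HLL layout (the top p bits of the hash pick a register, which stores rho = leading zeros of the remaining bits + 1). This illustrates the math only and is not the patch's code:

{code}
class HllSquash {
  // Squash a p=14 register array into p=10: the 4 dropped index bits move
  // into the hash suffix, so each new register takes the max over its 16
  // old registers with rho adjusted accordingly.
  static byte[] squash(byte[] reg14) {          // reg14.length == 1 << 14
    final int deltaP = 4;                       // 14 -> 10
    byte[] reg10 = new byte[1 << 10];
    for (int j = 0; j < reg10.length; j++) {
      int best = 0;
      for (int k = 0; k < (1 << deltaP); k++) {
        int rho = reg14[(j << deltaP) | k];
        if (rho == 0) continue;                 // register never touched
        int newRho = (k == 0)
            ? rho + deltaP                      // suffix gains deltaP zero bits
            : Integer.numberOfLeadingZeros(k) - (31 - deltaP); // zeros in k, +1
        best = Math.max(best, newRho);
      }
      reg10[j] = (byte) best;
    }
    return reg10;
  }
}
{code}

Because the adjustment only works in the shrinking direction, two sketches can always be merged at the lowest common bit-size but never grown back up.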



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18079) Statistics: Allow HyperLogLog to be merged to the lowest-common-denominator bit-size

2018-01-02 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-18079:
---
Attachment: HIVE-18079.5.patch

Rebase + explain one change to a test.

{code}
2018-01-02T20:25:52,726 DEBUG [36da1c38-207d-4b70-bb09-ff320a0062a3 main] 
calcite.sql2rel: Plan after trimming unused fields
HiveProject(key=[$0], key1=[$3])
  HiveJoin(condition=[=($1, $4)], joinType=[inner], algorithm=[none], cost=[not 
available])
HiveJoin(condition=[=($0, $2)], joinType=[inner], algorithm=[none], 
cost=[not available])
  HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
HiveProject(key=[$0], value=[$1])
  HiveTableScan(table=[[default.tab_part]], table:alias=[a])
  HiveFilter(condition=[IS NOT NULL($0)])
HiveProject(key=[$0])
  HiveTableScan(table=[[default.tab_part]], table:alias=[c])
HiveFilter(condition=[IS NOT NULL($1)])
  HiveProject(key=[$0], value=[$1])
HiveTableScan(table=[[default.tab_part]], table:alias=[b])
{code}

became 

{code}
2018-01-02T20:25:52,925 DEBUG [36da1c38-207d-4b70-bb09-ff320a0062a3 main] 
translator.PlanModifierForASTConv: Original plan for PlanModifier
 HiveProject(key=[$0], key1=[$2])
  HiveJoin(condition=[=($0, $4)], joinType=[inner], algorithm=[none], cost=[not 
available])
HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], 
cost=[not available])
  HiveProject(key=[$0], value=[$1])
HiveFilter(condition=[AND(IS NOT NULL($0), IS NOT NULL($1))])
  HiveTableScan(table=[[default.tab_part]], table:alias=[a])
  HiveProject(key=[$0], value=[$1])
HiveFilter(condition=[IS NOT NULL($1)])
  HiveTableScan(table=[[default.tab_part]], table:alias=[b])
HiveProject(key=[$0])
  HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[default.tab_part]], table:alias=[c])
{code}

> Statistics: Allow HyperLogLog to be merged to the lowest-common-denominator 
> bit-size
> -------------------------------------------------------------------------
>
> Key: HIVE-18079
> URL: https://issues.apache.org/jira/browse/HIVE-18079
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-18079.1.patch, HIVE-18079.2.patch, 
> HIVE-18079.4.patch, HIVE-18079.5.patch
>
>
> HyperLogLog can merge a 14 bit HLL into a 10 bit HLL bitset, because of its 
> mathematical hash distribution & construction.
> Allow the squashing of a 14 bit HLL -> 10 bit HLL without needing a second 
> scan over the data-set.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18190:
--
Attachment: HIVE-18190.07.patch

> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch, HIVE-18190.06.patch, 
> HIVE-18190.07.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> see more discussion at https://reviews.apache.org/r/64131/
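
A hedged sketch of what that check could look like (illustrative; the helper and the exact field-name test are assumptions, not the patch): ACID ORC files carry the transactional metadata columns at the top level of the schema, with the user row nested under "row", so a file whose schema lacks them is an 'original' file.

{code}
import java.util.List;

class AcidSchemaCheck {
  // Hive ACID ORC layout: operation, originalTransaction, bucket, rowId,
  // currentTransaction, and the nested user row under "row".
  static boolean looksLikeAcidSchema(List<String> topLevelFieldNames) {
    return topLevelFieldNames.size() == 6
        && topLevelFieldNames.get(0).equalsIgnoreCase("operation")
        && topLevelFieldNames.get(5).equalsIgnoreCase("row");
  }
}
{code}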



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18221) test acid default

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309135#comment-16309135
 ] 

Hive QA commented on HIVE-18221:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
4s{color} | {color:green} The patch has no ill-formed XML file. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
51s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| modules | C: . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8410/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch, 
> HIVE-18221.08.patch, HIVE-18221.09.patch, HIVE-18221.10.patch, 
> HIVE-18221.11.patch, HIVE-18221.12.patch, HIVE-18221.13.patch, 
> HIVE-18221.14.patch, HIVE-18221.16.patch, HIVE-18221.18.patch, 
> HIVE-18221.19.patch, HIVE-18221.20.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309132#comment-16309132
 ] 

Hive QA commented on HIVE-18052:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  9m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
49s{color} | {color:red} ql: The patch generated 6 new + 1645 unchanged - 2 
fixed = 1651 total (was 1647) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
1s{color} | {color:red} root: The patch generated 6 new + 2764 unchanged - 2 
fixed = 2770 total (was 2766) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  3m 
43s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
2s{color} | {color:red} hcatalog-unit in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
1s{color} | {color:red} hive-minikdc in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
2s{color} | {color:red} hive-unit in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue}  0m  
3s{color} | {color:blue} ASF License check generated no output? {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 63m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  
xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/diff-checkstyle-root.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/whitespace-eol.txt 
|
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/patch-javadoc-root.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/patch-javadoc-itests_hcatalog-unit.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/patch-javadoc-itests_hive-minikdc.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8408/yetus/patch-javadoc-itests_hive-unit.txt
 |
| modules | C: common standalone-metastore ql service hcatalog/core 
hcatalog/hcatalog-pig-adapter hcatalog/server-extensions 
hcatalog/webhcat/java-client hcatalog/streaming . itests/hcatalog-unit 

[jira] [Commented] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309131#comment-16309131
 ] 

Hive QA commented on HIVE-18190:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904302/HIVE-18190.06.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8409/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8409/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8409/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-03 04:27:33.748
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8409/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-03 04:27:33.753
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5b0d993 HIVE-18294 - add switch to make acid table the default 
(Eugene Koifman, reviewed by Alan Gates)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5b0d993 HIVE-18294 - add switch to make acid table the default 
(Eugene Koifman, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-03 04:27:40.266
+ rm -rf ../yetus
rm: cannot remove '../yetus': Directory not empty
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904302 - PreCommit-HIVE-Build

> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch, HIVE-18190.06.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> see more discussion at https://reviews.apache.org/r/64131/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18052) Run p-tests on mm tables

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309130#comment-16309130
 ] 

Hive QA commented on HIVE-18052:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904304/HIVE-18052.13.patch

{color:green}SUCCESS:{color} +1 due to 94 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1090 failed/errored test(s), 10204 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)


[jira] [Updated] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries

2018-01-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18361:
---
Attachment: HIVE-18361.patch

> Extend shared work optimizer to reuse computation beyond work boundaries
> -------------------------------------------------------------------------
>
> Key: HIVE-18361
> URL: https://issues.apache.org/jira/browse/HIVE-18361
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
> Attachments: HIVE-18361.patch
>
>
> Follow-up of the work in HIVE-16867.
> HIVE-16867 introduced an optimization that identifies scans on input tables 
> that can be merged and reuses the computation that is done in the work 
> containing those scans. In particular, we traverse both parts of the plan 
> upstream and reuse the operators if possible.
> Currently, the optimizer will not go beyond the output edge(s) of that work. 
> This extension removes that limitation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries

2018-01-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-18361:
---
Status: Patch Available  (was: In Progress)

> Extend shared work optimizer to reuse computation beyond work boundaries
> -------------------------------------------------------------------------
>
> Key: HIVE-18361
> URL: https://issues.apache.org/jira/browse/HIVE-18361
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
>
> Follow-up of the work in HIVE-16867.
> HIVE-16867 introduced an optimization that identifies scans on input tables 
> that can be merged and reuses the computation that is done in the work 
> containing those scans. In particular, we traverse both parts of the plan 
> upstream and reuse the operators if possible.
> Currently, the optimizer will not go beyond the output edge(s) of that work. 
> This extension removes that limitation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Work started] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries

2018-01-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-18361 started by Jesus Camacho Rodriguez.
--
> Extend shared work optimizer to reuse computation beyond work boundaries
> -------------------------------------------------------------------------
>
> Key: HIVE-18361
> URL: https://issues.apache.org/jira/browse/HIVE-18361
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
>
> Follow-up of the work in HIVE-16867.
> HIVE-16867 introduced an optimization that identifies scans on input tables 
> that can be merged and reuses the computation that is done in the work 
> containing those scans. In particular, we traverse both parts of the plan 
> upstream and reuse the operators if possible.
> Currently, the optimizer will not go beyond the output edge(s) of that work. 
> This extension removes that limitation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18361) Extend shared work optimizer to reuse computation beyond work boundaries

2018-01-02 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-18361:
--


> Extend shared work optimizer to reuse computation beyond work boundaries
> -------------------------------------------------------------------------
>
> Key: HIVE-18361
> URL: https://issues.apache.org/jira/browse/HIVE-18361
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>  Labels: TODOC3.0
>
> Follow-up of the work in HIVE-16867.
> HIVE-16867 introduced an optimization that identifies scans on input tables 
> that can be merged and reuses the computation that is done in the work 
> containing those scans. In particular, we traverse both parts of the plan 
> upstream and reuse the operators if possible.
> Currently, the optimizer will not go beyond the output edge(s) of that work. 
> This extension removes that limitation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309089#comment-16309089
 ] 

Hive QA commented on HIVE-17396:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904294/HIVE-17396.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11146 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)


[jira] [Updated] (HIVE-18360) NPE in TezSessionState

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18360:

Attachment: HIVE-18360.patch

A tiny patch. [~prasanth_j] [~jdere] can you take a look?

> NPE in TezSessionState
> --
>
> Key: HIVE-18360
> URL: https://issues.apache.org/jira/browse/HIVE-18360
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18360.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18360) NPE in TezSessionState

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18360:

Status: Patch Available  (was: Open)

> NPE in TezSessionState
> --
>
> Key: HIVE-18360
> URL: https://issues.apache.org/jira/browse/HIVE-18360
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18360.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18360) NPE in TezSessionState

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18360:
---


> NPE in TezSessionState
> --
>
> Key: HIVE-18360
> URL: https://issues.apache.org/jira/browse/HIVE-18360
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepesh Khandelwal
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18269:

Attachment: HIVE-18269.01.patch

Attaching the patch. I have also attached a patch that I've tested on a 
cluster; it limits the queue without replacing the linked list.

The new patch needs some extensive cluster testing.
[~prasanth_j] [~gopalv] can you take a look?

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.01.patch, HIVE-18269.1.patch, 
> HIVE-18269.bad.patch, Screen Shot 2017-12-13 at 1.15.16 AM.png
>
>
> The pendingData linked list in the Llap IO elevator (LlapRecordReader.java) 
> may grow indefinitely when Llap IO is faster than the processing pipeline. 
> Since we don't have backpressure to slow down the IO, this can lead to 
> indefinite growth of pending data, severe GC pressure, and eventually OOM.
> This specific instance of LLAP was running on HDFS on top of an EBS volume 
> backed by SSD. The query that triggered this issue was ANALYZE STATISTICS 
> .. FOR COLUMNS, which also gathers bitvectors. Fast IO and slow processing case.
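
A minimal sketch of the backpressure idea (simplified and hypothetical; per the comment above, the tested patch limits the queue without replacing the linked list): a bounded handoff makes the IO thread block once the consumer falls behind, capping pendingData instead of letting it grow.

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedHandoff<T> {
  private final BlockingQueue<T> pendingData;

  BoundedHandoff(int capacity) {
    pendingData = new ArrayBlockingQueue<>(capacity);
  }

  // IO (producer) side: blocks when the queue is full -- the backpressure
  // that the unbounded linked list lacks.
  void offerFromIo(T batch) throws InterruptedException {
    pendingData.put(batch);
  }

  // Processing (consumer) side: blocks when the queue is empty.
  T nextBatch() throws InterruptedException {
    return pendingData.take();
  }
}
{code}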



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18269:

Attachment: HIVE-18269.bad.patch

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.1.patch, HIVE-18269.bad.patch, Screen Shot 
> 2017-12-13 at 1.15.16 AM.png
>
>
> The pendingData linked list in the Llap IO elevator (LlapRecordReader.java) 
> may grow indefinitely when Llap IO is faster than the processing pipeline. 
> Since we don't have backpressure to slow down the IO, this can lead to 
> indefinite growth of pending data, severe GC pressure, and eventually OOM.
> This specific instance of LLAP was running on HDFS on top of an EBS volume 
> backed by SSD. The query that triggered this issue was ANALYZE STATISTICS 
> .. FOR COLUMNS, which also gathers bitvectors. Fast IO and slow processing case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309041#comment-16309041
 ] 

Hive QA commented on HIVE-17396:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} ql: The patch generated 2 new + 21 unchanged - 2 fixed 
= 23 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8407/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8407/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support DPP with map joins where the source and target belong in the same 
> stage
> ---
>
> Key: HIVE-17396
> URL: https://issues.apache.org/jira/browse/HIVE-17396
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Attachments: HIVE-17396.1.patch, HIVE-17396.1.patch, 
> HIVE-17396.1.patch, HIVE-17396.2.patch
>
>
> When the target of a partition pruning sink operator is not the same as 
> the target of the hash table sink operator, both source and target get 
> scheduled within the same spark job, and that can result in a File Not Found 
> Exception.  HIVE-17225 has a fix to disable DPP in that scenario.  This JIRA 
> is to support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO 

[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309017#comment-16309017
 ] 

Hive QA commented on HIVE-18214:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904278/HIVE-18214.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_15]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=177)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8406/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8406/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8406/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904278 - PreCommit-HIVE-Build

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to shut 
> down, so when the next test starts running it creates another 
> {{JavaSparkContext}}, which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.
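
A minimal sketch of the kind of guard that would close such a race, assuming 
the public {{SparkContext#isStopped}} API; the helper below is illustrative 
and not necessarily what HIVE-18214.1.patch does:
{code:java}
import java.util.concurrent.TimeUnit;

import org.apache.spark.api.java.JavaSparkContext;

final class ContextGuard {
  /**
   * Block until the previous test's context is fully stopped, so the next
   * test can safely create its own JavaSparkContext in the same JVM.
   */
  static void awaitShutdown(JavaSparkContext sc, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!sc.sc().isStopped()) {
      if (System.currentTimeMillis() > deadline) {
        throw new IllegalStateException("SparkContext did not stop in time");
      }
      TimeUnit.MILLISECONDS.sleep(100);
    }
  }
}
{code}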



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Issue Comment Deleted] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread liyunzhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang updated HIVE-17486:
--
Comment: was deleted

(was: [~stakiar]:
The original purpose of changing M->R to M->M->R was to let 
CombineEquivalentWorkResolver combine identical Maps, for example:
logical plan
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[1] -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
{code}  
physical plan
{code}  
Map1:TS[0]
Map2:TS[1]
Map3:FIL[52]-SEL[2]-GBY[3]-RS[4]
Map4:FIL[53]-SEL[9]-GBY[10]-RS[11]
Reducer1:GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
Reducer2:GBY[12]-RS[43]
{code}
{{CombineEquivalentWorkResolver}} combines identical Maps; in the case above, 
Map2 will be removed because TS\[0\] is the same as TS\[1\].

But after finishing the code, I found it is not necessary to combine TS\[0\] 
and TS\[1\] this way: {{MapInput}} is responsible for the TS, so I only need 
to generate the same MapInput for TS\[0\] and TS\[1\]. See HIVE-17486.5.patch 
for details.
)

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, HIVE-17486.5.patch, 
> explain.28.share.false, explain.28.share.true, scanshare.after.svg, 
> scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez: given a query plan, the goal is 
> to identify scans on input tables that can be merged so the data is read only 
> once, with the optimization carried out at the physical level. Hive on Spark 
> caches the result of a spark work if that work is used by more than one child 
> spark work. After SharedWorkOptimizer is enabled in the HoS physical plan, 
> identical table scans are merged into a single table scan whose result is 
> used by more than one child spark work, so the cache mechanism spares us from 
> repeating the same computation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread liyunzhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang updated HIVE-17486:
--
Attachment: HIVE-17486.5.patch

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, HIVE-17486.5.patch, 
> explain.28.share.false, explain.28.share.true, scanshare.after.svg, 
> scanshare.before.svg
>
>
> in HIVE-16602, Implement shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. Optimization will be carried out at the 
> physical level.  In Hive on Spark, it caches the result of spark work if the 
> spark work is used by more than 1 child spark work. After sharedWorkOptimizer 
> is enabled in physical plan in HoS, the identical table scans are merged to 1 
> table scan. This result of table scan will be used by more 1 child spark 
> work. Thus we need not do the same computation because of cache mechanism.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308995#comment-16308995
 ] 

liyunzhang edited comment on HIVE-17486 at 1/3/18 1:40 AM:
---

[~stakiar]:
The original purpose of changing M->R to M->M->R was to let 
CombineEquivalentWorkResolver combine identical Maps, for example:
logical plan
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[1] -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
{code}  
physical plan
{code}  
Map1:TS[0]
Map2:TS[1]
Map3:FIL[52]-SEL[2]-GBY[3]-RS[4]
Map4:FIL[53]-SEL[9]-GBY[10]-RS[11]
Reducer1:GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
Reducer2:GBY[12]-RS[43]
{code}
{{CombineEquivalentWorkResolver}} combines identical Maps; in the case above, 
Map2 will be removed because TS\[0\] is the same as TS\[1\].

But after finishing the code, I found it is not necessary to combine TS\[0\] 
and TS\[1\] this way: {{MapInput}} is responsible for the TS, so I only need 
to generate the same MapInput for TS\[0\] and TS\[1\]. See HIVE-17486.5.patch 
for details.



was (Author: kellyzly):
[~stakiar]:
The original purpose of changing M->R to M->M->R was to let 
CombineEquivalentWorkResolver combine identical Maps, for example:
logical plan
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[1] -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
{code}  
physical plan
{code}  
Map1:TS[0]
Map2:TS[1]
Map3:FIL[52]-SEL[2]-GBY[3]-RS[4]
Map4:FIL[53]-SEL[9]-GBY[10]-RS[11]
Reducer1:GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
Reducer2:GBY[12]-RS[43]
{code}
{{CombineEquivalentWorkResolver}} combines identical Maps; in the case above, 
Map2 will be removed because TS\[0\] is the same as TS\[1\].

But after finishing the code, I found it is not necessary to combine TS\[0\] 
and TS\[1\] this way: {{MapInput}} is responsible for the TS, so I only need 
to generate the same MapInput for TS\[0\] and TS\[1\]. See HIVE-17486.5.patch 
for details.


> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez: given a query plan, the goal is 
> to identify scans on input tables that can be merged so the data is read only 
> once, with the optimization carried out at the physical level. Hive on Spark 
> caches the result of a spark work if that work is used by more than one child 
> spark work. After SharedWorkOptimizer is enabled in the HoS physical plan, 
> identical table scans are merged into a single table scan whose result is 
> used by more than one child spark work, so the cache mechanism spares us from 
> repeating the same computation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308994#comment-16308994
 ] 

liyunzhang commented on HIVE-17486:
---

[~stakiar]:
The original purpose of changing M->R to M->M->R was to let 
CombineEquivalentWorkResolver combine identical Maps, for example:
logical plan
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[1] -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
{code}  
physical plan
{code}  
Map1:TS[0]
Map2:TS[1]
Map3:FIL[52]-SEL[2]-GBY[3]-RS[4]
Map4:FIL[53]-SEL[9]-GBY[10]-RS[11]
Reducer1:GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
Reducer2:GBY[12]-RS[43]
{code}
{{CombineEquivalentWorkResolver}} combines identical Maps; in the case above, 
Map2 will be removed because TS\[0\] is the same as TS\[1\].

But after finishing the code, I found it is not necessary to combine TS\[0\] 
and TS\[1\] this way: {{MapInput}} is responsible for the TS, so I only need 
to generate the same MapInput for TS\[0\] and TS\[1\]. See HIVE-17486.5.patch 
for details.


> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez: given a query plan, the goal is 
> to identify scans on input tables that can be merged so the data is read only 
> once, with the optimization carried out at the physical level. Hive on Spark 
> caches the result of a spark work if that work is used by more than one child 
> spark work. After SharedWorkOptimizer is enabled in the HoS physical plan, 
> identical table scans are merged into a single table scan whose result is 
> used by more than one child spark work, so the cache mechanism spares us from 
> repeating the same computation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Attachment: HIVE-18221.20.patch

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch, 
> HIVE-18221.08.patch, HIVE-18221.09.patch, HIVE-18221.10.patch, 
> HIVE-18221.11.patch, HIVE-18221.12.patch, HIVE-18221.13.patch, 
> HIVE-18221.14.patch, HIVE-18221.16.patch, HIVE-18221.18.patch, 
> HIVE-18221.19.patch, HIVE-18221.20.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308995#comment-16308995
 ] 

liyunzhang commented on HIVE-17486:
---

[~stakiar]:
The original purpose of changing M->R to M->M->R was to let 
CombineEquivalentWorkResolver combine identical Maps, for example:
logical plan
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[1] -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
{code}  
physical plan
{code}  
Map1:TS[0]
Map2:TS[1]
Map3:FIL[52]-SEL[2]-GBY[3]-RS[4]
Map4:FIL[53]-SEL[9]-GBY[10]-RS[11]
Reducer1:GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
Reducer2:GBY[12]-RS[43]
{code}
{{CombineEquivalentWorkResolver}} combines identical Maps; in the case above, 
Map2 will be removed because TS\[0\] is the same as TS\[1\].

But after finishing the code, I found it is not necessary to combine TS\[0\] 
and TS\[1\] this way: {{MapInput}} is responsible for the TS, so I only need 
to generate the same MapInput for TS\[0\] and TS\[1\]. See HIVE-17486.5.patch 
for details.
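
The cache mechanism this relies on can be illustrated with plain Spark (a 
sketch using the Java API, not Hive's actual {{MapInput}} code): one cached 
RDD stands in for the shared table scan and feeds two child works.
{code:java}
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SharedScanSketch {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("shared-scan-sketch").setMaster("local[2]"));
    // One "table scan" (the MapInput analogue), cached because two children use it.
    JavaRDD<Integer> scan = sc.parallelize(Arrays.asList(1, 2, 3, 4)).cache();
    // Two child works reuse the cached result instead of re-reading the input.
    long evens = scan.filter(x -> x % 2 == 0).count();
    long total = scan.count();
    System.out.println(evens + " of " + total + " rows are even");
    sc.stop();
  }
}
{code}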


> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez: given a query plan, the goal is 
> to identify scans on input tables that can be merged so the data is read only 
> once, with the optimization carried out at the physical level. Hive on Spark 
> caches the result of a spark work if that work is used by more than one child 
> spark work. After SharedWorkOptimizer is enabled in the HoS physical plan, 
> identical table scans are merged into a single table scan whose result is 
> used by more than one child spark work, so the cache mechanism spares us from 
> repeating the same computation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18052) Run p-tests on mm tables

2018-01-02 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-18052:
--
Attachment: HIVE-18052.13.patch

> Run p-tests on mm tables
> 
>
> Key: HIVE-18052
> URL: https://issues.apache.org/jira/browse/HIVE-18052
> Project: Hive
>  Issue Type: Task
>Reporter: Steve Yeom
>Assignee: Steve Yeom
> Attachments: HIVE-18052.1.patch, HIVE-18052.10.patch, 
> HIVE-18052.11.patch, HIVE-18052.12.patch, HIVE-18052.13.patch, 
> HIVE-18052.2.patch, HIVE-18052.3.patch, HIVE-18052.4.patch, 
> HIVE-18052.5.patch, HIVE-18052.6.patch, HIVE-18052.7.patch, 
> HIVE-18052.8.patch, HIVE-18052.9.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18190:
--
Attachment: HIVE-18190.06.patch

> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch, HIVE-18190.06.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> See more discussion at https://reviews.apache.org/r/64131/
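
A minimal sketch of such a check, assuming the standard ORC reader API and the 
usual ACID struct layout (first metadata column named "operation"); the actual 
patch may differ:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;

public class AcidSchemaCheck {
  /** True if the file's top-level schema carries the ACID metadata columns. */
  static boolean hasAcidColumns(Path file, Configuration conf) throws IOException {
    Reader reader = OrcFile.createReader(file, OrcFile.readerOptions(conf));
    TypeDescription schema = reader.getSchema();
    // An ACID file is a struct whose first field is "operation"; an 'original'
    // file written before the table was ACID has the user columns at top level.
    return schema.getCategory() == TypeDescription.Category.STRUCT
        && !schema.getFieldNames().isEmpty()
        && "operation".equals(schema.getFieldNames().get(0));
  }
}
{code}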



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308944#comment-16308944
 ] 

Hive QA commented on HIVE-18214:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} spark-client: The patch generated 0 new + 38 
unchanged - 1 fixed = 38 total (was 39) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  8m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| modules | C: spark-client U: spark-client |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8406/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to shut 
> down, so when the next test starts running it creates another 
> {{JavaSparkContext}}, which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18221) test acid default

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308934#comment-16308934
 ] 

Hive QA commented on HIVE-18221:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904277/HIVE-18221.19.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8405/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8405/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8405/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-03 00:45:32.248
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8405/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-03 00:45:32.252
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5b0d993 HIVE-18294 - add switch to make acid table the default 
(Eugene Koifman, reviewed by Alan Gates)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5b0d993 HIVE-18294 - add switch to make acid table the default 
(Eugene Koifman, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-03 00:45:37.401
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-8405/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Going to apply patch with: git apply -p0
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc2023378580845206144.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) on project hive-hcatalog: Failed to resolve 
dependencies for one or more projects in the reactor. Reason: No versions are 
present in the repository for the artifact with a range [1.3.1,2.3]
[ERROR] net.minidev:json-smart:jar:null
[ERROR] 
[ERROR] from the specified remote repositories:
[ERROR] datanucleus (http://www.datanucleus.org/downloads/maven2, 
releases=true, snapshots=false),
[ERROR] glassfish-repository 
(http://maven.glassfish.org/content/groups/glassfish, releases=false, 
snapshots=false),
[ERROR] glassfish-repo-archive 
(http://maven.glassfish.org/content/groups/glassfish, releases=false, 
snapshots=false),
[ERROR] sonatype-snapshot 
(https://oss.sonatype.org/content/repositories/snapshots, releases=false, 
snapshots=false),
[ERROR] apache.snapshots (https://repository.apache.org/snapshots, 
releases=false, 

[jira] [Commented] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308932#comment-16308932
 ] 

Hive QA commented on HIVE-18190:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904273/HIVE-18190.05.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestAcidUtils.testParsing (batchId=266)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8404/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8404/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8404/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904273 - PreCommit-HIVE-Build

> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> See more discussion at https://reviews.apache.org/r/64131/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18358) from_unixtime returns wrong year for Dec 31 timestamps with format 'YYYY'

2018-01-02 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-18358:
-

Assignee: Andrew Sherman

> from_unixtime returns wrong year for Dec 31 timestamps with format 'YYYY'
> -
>
> Key: HIVE-18358
> URL: https://issues.apache.org/jira/browse/HIVE-18358
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
> Environment: AWS EMR with Hive 2.1.0-amzn-0
>Reporter: Nick Orka
>Assignee: Andrew Sherman
>  Labels: timezone
>
> If you use capital Ys as the year format in from_unixtime(), it returns the 
> next year for Dec 31 only. All other days work as intended.
> Here is reproduction code:
> {code:sql}
> hive> select from_unixtime(1514754599, 'YYYY-MM-dd HH-mm-ss'), 
> from_unixtime(1514754599, 'yyyy-MM-dd HH-mm-ss');
> OK
> 2018-12-31 21-09-59   2017-12-31 21-09-59
> Time taken: 0.025 seconds, Fetched: 1 row(s)
> hive>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17482) External LLAP client: acquire locks for tables queried directly by LLAP

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17482:
--
Component/s: Transactions

> External LLAP client: acquire locks for tables queried directly by LLAP
> ---
>
> Key: HIVE-17482
> URL: https://issues.apache.org/jira/browse/HIVE-17482
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 3.0.0
>
> Attachments: HIVE-17482.1.patch, HIVE-17482.2.patch, 
> HIVE-17482.3.patch, HIVE-17482.4.patch, HIVE-17482.5.patch, HIVE-17482.6.patch
>
>
> When using the LLAP external client with simple queries (filter/project of 
> single table), the appropriate locks should be taken on the table being read 
> like they are for normal Hive queries. This is important in the case of 
> transactional tables being queried, since the compactor relies on the 
> presence of table locks to determine whether it can safely delete old 
> versions of compacted files without affecting currently running queries.
> This does not have to happen in the complex query case, since a query is used 
> (with the appropriate locking mechanisms) to create/populate the temp table 
> holding the results of the complex query.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18358) from_unixtime returns wrong year for Dec 31 timestamps with format 'YYYY'

2018-01-02 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308920#comment-16308920
 ] 

Andrew Sherman commented on HIVE-18358:
---

The code that deals with {{YYYY}} is just 
[SimpleDateFormat|https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html].
 I think {{YYYY}} is not the same as {{yyyy}}: it has a special meaning as 
'week year'. 
[https://stackoverflow.com/questions/15916958/simpledateformat-producing-wrong-date-time-when-parsing-yyyy-mm-dd-hhmm]
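
A short sketch of the difference with plain {{SimpleDateFormat}} (no Hive 
involved): for 2017-12-31 the week year is already 2018, because that Sunday 
falls in the first week of 2018 in the default US locale.
{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class WeekYearDemo {
  public static void main(String[] args) {
    Date ts = new Date(1514754599L * 1000);  // 2017-12-31 21:09:59 UTC
    SimpleDateFormat weekYear = new SimpleDateFormat("YYYY-MM-dd HH-mm-ss");
    SimpleDateFormat calYear = new SimpleDateFormat("yyyy-MM-dd HH-mm-ss");
    weekYear.setTimeZone(TimeZone.getTimeZone("UTC"));
    calYear.setTimeZone(TimeZone.getTimeZone("UTC"));
    System.out.println(weekYear.format(ts));  // 2018-12-31 21-09-59 ('Y' = week year)
    System.out.println(calYear.format(ts));   // 2017-12-31 21-09-59 ('y' = calendar year)
  }
}
{code}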

> from_unixtime returns wrong year for Dec 31 timestamps with format 'YYYY'
> -
>
> Key: HIVE-18358
> URL: https://issues.apache.org/jira/browse/HIVE-18358
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
> Environment: AWS EMR with Hive 2.1.0-amzn-0
>Reporter: Nick Orka
>  Labels: timezone
>
> If you use capital Ys as the year format in from_unixtime(), it returns the 
> next year for Dec 31 only. All other days work as intended.
> Here is reproduction code:
> {code:sql}
> hive> select from_unixtime(1514754599, 'YYYY-MM-dd HH-mm-ss'), 
> from_unixtime(1514754599, 'yyyy-MM-dd HH-mm-ss');
> OK
> 2018-12-31 21-09-59   2017-12-31 21-09-59
> Time taken: 0.025 seconds, Fetched: 1 row(s)
> hive>
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308892#comment-16308892
 ] 

Hive QA commented on HIVE-18190:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
35s{color} | {color:red} ql: The patch generated 1 new + 464 unchanged - 0 
fixed = 465 total (was 464) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8404/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8404/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> See more discussion at https://reviews.apache.org/r/64131/



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18326) LLAP Tez scheduler - only preempt tasks if there's a dependency between them

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308886#comment-16308886
 ] 

Hive QA commented on HIVE-18326:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904271/HIVE-18326.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[archive_partspec2]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_alter_table_exchange_partition_fail]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[ctas_noemptyfolder]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_create_no_grant]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_drop_other]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_publisher_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[subquery_notin_implicit_gby]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[truncate_bucketed_column]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8403/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8403/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8403/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904271 - PreCommit-HIVE-Build

> LLAP Tez scheduler - only preempt tasks if there's a dependency between them
> 
>
> Key: HIVE-18326
> URL: https://issues.apache.org/jira/browse/HIVE-18326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18326.01.patch, HIVE-18326.02.patch, 
> HIVE-18326.patch
>
>
> It is currently possible for e.g. two sides of a union (or a join for that 
> matter) to have slightly different priorities. We don't want to preempt 
> running tasks on one side in favor of the other side in such cases.
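
A sketch of the intended guard; the dependency map and names below are 
hypothetical stand-ins for the Tez scheduler's vertex metadata, not the actual 
patch:
{code:java}
import java.util.Collections;
import java.util.Map;
import java.util.Set;

final class PreemptionCheck {
  /** Hypothetical input: vertex -> the vertices it transitively depends on. */
  private final Map<String, Set<String>> ancestors;

  PreemptionCheck(Map<String, Set<String>> ancestors) {
    this.ancestors = ancestors;
  }

  /** Preempt a running task only if it actually depends on the waiting one. */
  boolean shouldPreempt(String waitingVertex, int waitingPriority,
      String runningVertex, int runningPriority) {
    // Lower value = higher priority in Tez.
    boolean higherPriority = waitingPriority < runningPriority;
    boolean dependency = ancestors
        .getOrDefault(runningVertex, Collections.<String>emptySet())
        .contains(waitingVertex);
    // Two union branches have no ancestor relation, so neither preempts the other.
    return higherPriority && dependency;
  }
}
{code}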



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18359) Extend grouping set limits from int to long

2018-01-02 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-18359:



> Extend grouping set limits from int to long
> ---
>
> Key: HIVE-18359
> URL: https://issues.apache.org/jira/browse/HIVE-18359
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Grouping sets are broken for >32 columns because an int is used for the bitmap 
> (and for the GROUPING__ID virtual column). This assumption breaks grouping 
> sets/rollups/cube when the number of participating aggregation columns is >32. 
> The easier fix would be to extend it to long for now. The correct fix would be 
> to use BitSets everywhere, but that would require the GROUPING__ID column type 
> to be binary, which would make predicates on GROUPING__ID difficult to deal with. 
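
The arithmetic behind the limit, as a small standalone sketch (not Hive code): 
a Java int shift wraps at 32 bits, so the bitmap must move to long to cover up 
to 64 grouping columns.
{code:java}
public class GroupingIdSketch {
  /** Build the grouping-ID bitmap with a long, one bit per grouped column. */
  static long groupingId(boolean[] grouped) {
    long id = 0L;
    for (int col = 0; col < grouped.length; col++) {
      if (grouped[col]) {
        id |= 1L << col;  // with an int mask, "1 << col" wraps for col >= 32
      }
    }
    return id;
  }

  public static void main(String[] args) {
    boolean[] cols = new boolean[40];
    cols[35] = true;
    System.out.println(groupingId(cols));  // 34359738368 = 2^35, correct with long
    System.out.println(1 << 35);           // 8: int arithmetic wraps 35 to 35 % 32 = 3
  }
}
{code}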



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17396:
---
Attachment: HIVE-17396.2.patch

Fix issues reported by Hive QA.

> Support DPP with map joins where the source and target belong in the same 
> stage
> ---
>
> Key: HIVE-17396
> URL: https://issues.apache.org/jira/browse/HIVE-17396
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Attachments: HIVE-17396.1.patch, HIVE-17396.1.patch, 
> HIVE-17396.1.patch, HIVE-17396.2.patch
>
>
> When the target of a partition pruning sink operator is not the same as the 
> target of the hash table sink operator, both source and target get scheduled 
> within the same spark job, and that can result in a File Not Found Exception.  
> HIVE-17225 has a fix to disable DPP in that scenario.  This JIRA is to 
> support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 2) VALUES (2);
> INSERT INTO table reg_table VALUES (1), (2), (3), (4), (5), (6);
> EXPLAIN SELECT *
> FROM   part_table1 pt1,
>part_table2 pt2,
>reg_table rt
> WHERE  rt.col = pt1.part1_col
> ANDpt2.part2_col = pt1.part1_col;
> Plan:
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Spark
> #### A masked pattern was here ####
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: pt1
>   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: col (type: int), part1_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
>   keys:
> 0 _col1 (type: int)
> 1 _col1 (type: int)
> 2 _col0 (type: int)
> Select Operator
>   expressions: _col1 (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> keys: _col0 (type: int)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Spark Partition Pruning Sink Operator
>   Target column: part2_col (int)
>   partition key expr: part2_col
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   target work: Map 2
> Local Work:
>   Map Reduce Local Work
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: pt2
>   Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: col (type: int), part2_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 2 Data size: 2 Basic stats: 
> COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
>   keys:
> 0 _col1 (type: int)
> 1 _col1 (type: int)
> 2 _col0 (type: int)
> Local Work:
>   Map Reduce Local Work
>   Stage: Stage-1
> Spark
> #### A masked pattern was here ####
>   Vertices:
> Map 3 
> Map Operator Tree:
> TableScan
>   alias: rt
>   Statistics: Num rows: 6 Data size: 6 Basic stats: 

[jira] [Commented] (HIVE-18255) spark-client jar should be prefixed with hive-

2018-01-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308847#comment-16308847
 ] 

Sahil Takiar commented on HIVE-18255:
-

Test failures look unrelated.

> spark-client jar should be prefixed with hive-
> --
>
> Key: HIVE-18255
> URL: https://issues.apache.org/jira/browse/HIVE-18255
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18255.1.patch, HIVE-18255.2.patch, 
> HIVE-18255.3.patch
>
>
> Other Hive jars are prefixed with "hive-" except for the spark-client jar. 
> Fixing this to make sure the jar name is consistent across all Hive jars.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308835#comment-16308835
 ] 

Sahil Takiar commented on HIVE-16484:
-

Attaching updated patch. Basically, just rebased the old patch and resolved all 
conflicts.

I'll keep this JIRA focused on migrating to {{SparkLauncher}} and work on 
integrating with {{InProcessLauncher}} in a sub-task.

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch, HIVE-16484.8.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS application --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
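
A minimal sketch of the {{SparkLauncher}} flow described above; the API calls 
are real, while the resource path and master are illustrative values only:
{code:java}
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LauncherSketch {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/app.jar")  // illustrative path
        .setMainClass("org.apache.hive.spark.client.RemoteDriver")
        .setMaster("yarn")
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            System.out.println("state: " + h.getState());  // job listener hook
          }

          @Override
          public void infoChanged(SparkAppHandle h) {
          }
        });
    // The handle exposes the app's state directly; there is no
    // bin/spark-submit process output to parse.
    System.out.println("app id: " + handle.getAppId());
  }
}
{code}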



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16484:

Attachment: HIVE-16484.8.patch

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch, HIVE-16484.8.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS application --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18326) LLAP Tez scheduler - only preempt tasks if there's a dependency between them

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308820#comment-16308820
 ] 

Hive QA commented on HIVE-18326:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m  
9s{color} | {color:red} llap-tez: The patch generated 6 new + 146 unchanged - 0 
fixed = 152 total (was 146) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8403/yetus/diff-checkstyle-llap-tez.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8403/yetus/whitespace-eol.txt 
|
| modules | C: common llap-tez U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8403/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> LLAP Tez scheduler - only preempt tasks if there's a dependency between them
> 
>
> Key: HIVE-18326
> URL: https://issues.apache.org/jira/browse/HIVE-18326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18326.01.patch, HIVE-18326.02.patch, 
> HIVE-18326.patch
>
>
> It is currently possible for e.g. two sides of a union (or a join for that 
> matter) to have slightly different priorities. We don't want to preempt 
> running tasks on one side in favor of the other side in such cases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18255) spark-client jar should be prefixed with hive-

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308805#comment-16308805
 ] 

Hive QA commented on HIVE-18255:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904253/HIVE-18255.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_aggregator_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8402/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8402/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8402/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904253 - PreCommit-HIVE-Build

> spark-client jar should be prefixed with hive-
> --
>
> Key: HIVE-18255
> URL: https://issues.apache.org/jira/browse/HIVE-18255
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18255.1.patch, HIVE-18255.2.patch, 
> HIVE-18255.3.patch
>
>
> Other Hive jars are prefixed with "hive-" except for the spark-client jar. 
> Fixing this to make sure the jar name is consistent across all Hive jars.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18255) spark-client jar should be prefixed with hive-

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308732#comment-16308732
 ] 

Hive QA commented on HIVE-18255:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| modules | C: spark-client ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8402/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> spark-client jar should be prefixed with hive-
> --
>
> Key: HIVE-18255
> URL: https://issues.apache.org/jira/browse/HIVE-18255
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18255.1.patch, HIVE-18255.2.patch, 
> HIVE-18255.3.patch
>
>
> Other Hive jars are prefixed with "hive-" except for the spark-client jar. 
> Fixing this to make sure the jar name is consistent across all Hive jars.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18238) Driver execution may not have configuration changing sideeffects

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308707#comment-16308707
 ] 

Hive QA commented on HIVE-18238:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904252/HIVE-18238.02.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 11147 tests 
executed
*Failed tests:*
{noformat}
TestNegativeCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=92)


[jira] [Updated] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18214:

Status: Patch Available  (was: Open)

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shut down, so when the next test starts running it creates another 
> {{JavaSparkContext}}, which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.
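A minimal sketch of one way to close this kind of race (hypothetical helper, not the HIVE-18214 patch itself): serialize the context lifecycle so a new {{JavaSparkContext}} is only created after the previous one has been stopped.

{code}
// Hypothetical sketch: guard SparkContext creation so tests never hold two
// live contexts in one JVM. stop() tears down the previous context before
// the next test constructs a new one.
import org.apache.spark.api.java.JavaSparkContext;

public final class SparkContextGuard {
  private static JavaSparkContext current;

  public static synchronized JavaSparkContext create(String master, String appName) {
    if (current != null) {
      current.stop();   // tear down the previous context first
      current = null;
    }
    current = new JavaSparkContext(master, appName);
    return current;
  }
}
{code}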



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18214) Flaky test: TestSparkClient

2018-01-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18214:

Attachment: HIVE-18214.1.patch

> Flaky test: TestSparkClient
> ---
>
> Key: HIVE-18214
> URL: https://issues.apache.org/jira/browse/HIVE-18214
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18214.1.patch
>
>
> Looks like there is a race condition in {{TestSparkClient#runTest}}. The test 
> creates a {{RemoteDriver}} in memory, which creates a {{JavaSparkContext}}. A 
> new {{JavaSparkContext}} is created for each test that is run. There is a 
> race condition where the {{RemoteDriver}} isn't given enough time to 
> shut down, so when the next test starts running it creates another 
> {{JavaSparkContext}}, which causes an exception like 
> {{org.apache.spark.SparkException: Only one SparkContext may be running in 
> this JVM (see SPARK-2243)}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Attachment: HIVE-18221.19.patch

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch, 
> HIVE-18221.08.patch, HIVE-18221.09.patch, HIVE-18221.10.patch, 
> HIVE-18221.11.patch, HIVE-18221.12.patch, HIVE-18221.13.patch, 
> HIVE-18221.14.patch, HIVE-18221.16.patch, HIVE-18221.18.patch, 
> HIVE-18221.19.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18238) Driver execution may not have configuration changing sideeffects

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308653#comment-16308653
 ] 

Hive QA commented on HIVE-18238:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
30s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
34s{color} | {color:red} ql: The patch generated 1 new + 192 unchanged - 0 
fixed = 193 total (was 192) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} hcatalog/core: The patch generated 0 new + 33 
unchanged - 1 fixed = 33 total (was 34) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch hcatalog-pig-adapter passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} The patch server-extensions passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch hive-unit passed checkstyle {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8401/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql hcatalog/core hcatalog/hcatalog-pig-adapter 
hcatalog/server-extensions itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8401/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Driver execution may not have configuration changing sideeffects 
> -
>
> Key: HIVE-18238
> URL: https://issues.apache.org/jira/browse/HIVE-18238
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18238.01wip01.patch, HIVE-18238.02.patch
>
>
> {{Driver}} executes sql statements which use "hiveconf" settings;
> but the {{Driver}} itself may *not* change the configuration...
> I've found an example which shows how hazardous this 

[jira] [Commented] (HIVE-18336) add Safe Mode

2018-01-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308654#comment-16308654
 ] 

Eugene Koifman commented on HIVE-18336:
---

It would be useful to add optional validation to acid processing.
For example, HIVE-18190 checks whether the contents of a delta/ were written 
by the Acid code path, i.e. have ROW_IDs embedded.  Perhaps in safe mode it 
should check each file rather than choosing one.

Another example: on read of a delta/, check that it has a full complement of 
bucket files if running with MR, that no "extra" files are present, and that 
all files have the same schema.
Acid tables are meant to be written by Hive, but I've seen multiple times where 
users forget this and try to move/copy files, resulting in strange issues.

Maybe "safe mode" is not the right name/approach.  Perhaps this should be a 
separate tool that can analyze an acid table and report potential issues.
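A rough illustration of the first delta/ check suggested above (hypothetical helper; it assumes the delta holds only bucket_NNNNN files and ignores side files):

{code}
// Hypothetical validation sketch: a delta/ written for MR should contain a
// full complement of bucket files (bucket_00000 .. bucket_{n-1}) and nothing else.
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class DeltaDirValidator {
  public static void validate(FileSystem fs, Path deltaDir, int numBuckets)
      throws java.io.IOException {
    boolean[] seen = new boolean[numBuckets];
    for (FileStatus f : fs.listStatus(deltaDir)) {
      String name = f.getPath().getName();
      if (!name.startsWith("bucket_")) {     // an "extra" file someone copied in
        throw new IllegalStateException("Unexpected file in delta: " + name);
      }
      seen[Integer.parseInt(name.substring("bucket_".length()))] = true;
    }
    for (int i = 0; i < numBuckets; i++) {
      if (!seen[i]) {
        throw new IllegalStateException("Missing bucket file " + i + " in " + deltaDir);
      }
    }
  }
}
{code}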


> add Safe Mode
> -
>
> Key: HIVE-18336
> URL: https://issues.apache.org/jira/browse/HIVE-18336
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18356) Fixing license headers in checkstyle

2018-01-02 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308631#comment-16308631
 ] 

Andrew Sherman commented on HIVE-18356:
---

+1 LGTM. Thanks for doing this [~pvary]

> Fixing license headers in checkstyle
> 
>
> Key: HIVE-18356
> URL: https://issues.apache.org/jira/browse/HIVE-18356
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-18356.patch
>
>
> The checkstyle header contains the following ASF header:
> {code}
> /**
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
> [..]
> {code}
> Even if we are undecided what to do with the already existing headers 
> (HIVE-17952), the new ones should use the proper one with 1 '*' in the first 
> line:
> {code}
> /*
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
> [..]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17952) Fix license headers to avoid dangling javadoc warnings

2018-01-02 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-17952:
-

Assignee: Andrew Sherman

> Fix license headers to avoid dangling javadoc warnings
> --
>
> Key: HIVE-17952
> URL: https://issues.apache.org/jira/browse/HIVE-17952
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Andrew Sherman
>Priority: Trivial
>
> All license headers start with "/**", which is assumed to be javadoc, and the 
> IDE warns about dangling javadoc pointing to license headers.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18096) add a user-friendly show plan command

2018-01-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308626#comment-16308626
 ] 

Sergey Shelukhin commented on HIVE-18096:
-

[~harishjp] ping?

> add a user-friendly show plan command
> -
>
> Key: HIVE-18096
> URL: https://issues.apache.org/jira/browse/HIVE-18096
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Harish Jaiprakash
>
> For admin to be able to get an overview of a resource plan.
> We need to try to do this using sysdb. 
> If that is not possible to do in a nice way, we'd do a text-based one like 
> query explain, or desc extended table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18273) add LLAP-level counters for WM

2018-01-02 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308622#comment-16308622
 ] 

Sergey Shelukhin commented on HIVE-18273:
-

Test failures appear to be the same as in other JIRAs. [~prasanth_j] can you 
take a look? thnx

> add LLAP-level counters for WM
> --
>
> Key: HIVE-18273
> URL: https://issues.apache.org/jira/browse/HIVE-18273
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18273.01.patch, HIVE-18273.02.patch, 
> HIVE-18273.patch
>
>
> On query fragment level (like IO counters)
> time queued as guaranteed;
> time running as guaranteed;
> time running as speculative.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18190) Consider looking at ORC file schema rather than using _metadata_acid file

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18190:
--
Attachment: HIVE-18190.05.patch

> Consider looking at ORC file schema rather than using _metadata_acid file
> -
>
> Key: HIVE-18190
> URL: https://issues.apache.org/jira/browse/HIVE-18190
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18190.01.patch, HIVE-18190.02.patch, 
> HIVE-18190.04.patch, HIVE-18190.05.patch
>
>
> See if it's possible to just look at the schema of the file in base_ or 
> delta_ to see if it has Acid metadata columns.  If not, it's an 'original' 
> file and needs ROW_IDs generated.
> see more discussion at https://reviews.apache.org/r/64131/
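A sketch of the check described above using the public ORC reader API (field names follow the acid row layout; treat this as illustrative, not the patch):

{code}
// Illustrative check: inspect the ORC schema to see whether the file carries
// the acid wrapper columns (operation, originalTransaction, bucket, rowId,
// currentTransaction, row) or is an "original" file needing generated ROW_IDs.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;

public final class AcidSchemaCheck {
  public static boolean hasAcidColumns(Path file, Configuration conf)
      throws java.io.IOException {
    Reader reader = OrcFile.createReader(file, OrcFile.readerOptions(conf));
    TypeDescription schema = reader.getSchema();
    return schema.getFieldNames().contains("originalTransaction")
        && schema.getFieldNames().contains("row");
  }
}
{code}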



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308614#comment-16308614
 ] 

Hive QA commented on HIVE-17396:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904251/HIVE-17396.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_constprog_dpp]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=176)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_2]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_4]
 (batchId=178)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=176)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8400/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8400/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8400/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904251 - PreCommit-HIVE-Build

> Support DPP with map joins where the source and target belong in the same 
> stage
> ---
>
> Key: HIVE-17396
> URL: https://issues.apache.org/jira/browse/HIVE-17396
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Attachments: HIVE-17396.1.patch, HIVE-17396.1.patch, 
> HIVE-17396.1.patch
>
>
> When the target of a partition pruning sink operator is not the same as 
> the target of the hash table sink operator, both source and target get scheduled 
> within the same spark job, and that can result in File Not Found Exception.  
> HIVE-17225 has a fix to disable DPP in that scenario.  This JIRA is to 
> support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 

[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2018-01-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308594#comment-16308594
 ] 

Aihua Xu commented on HIVE-16826:
-

[~belugabehr] That's great. +1.

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}
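For illustration, a sketch of the buffered-writer pattern points 2 and 4 describe (assuming Super CSV's {{CsvListWriter}} and commons-io's {{StringBuilderWriter}}, per the class names above; not the actual patch):

{code}
// Sketch: format one row through an unsynchronized StringBuilderWriter
// instead of Java's synchronized StringWriter.
import org.apache.commons.io.output.StringBuilderWriter;
import org.supercsv.io.CsvListWriter;
import org.supercsv.prefs.CsvPreference;

public final class CsvRowFormatter {
  public static String formatRow(String[] fields) throws java.io.IOException {
    StringBuilderWriter buf = new StringBuilderWriter(128); // no lock contention
    try (CsvListWriter csv =
        new CsvListWriter(buf, CsvPreference.STANDARD_PREFERENCE)) {
      csv.write(fields); // quotes/escapes and separates the values
    }
    return buf.toString();
  }
}
{code}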



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18269) LLAP: Fast llap io with slow processing pipeline can lead to OOM

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18269:

Status: Open  (was: Patch Available)

> LLAP: Fast llap io with slow processing pipeline can lead to OOM
> 
>
> Key: HIVE-18269
> URL: https://issues.apache.org/jira/browse/HIVE-18269
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18269.1.patch, Screen Shot 2017-12-13 at 1.15.16 
> AM.png
>
>
> The pendingData linked list in the Llap IO elevator (LlapRecordReader.java) may 
> grow indefinitely when Llap IO is faster than the processing pipeline. Since we 
> don't have backpressure to slow down the IO, this can lead to indefinite growth 
> of pending data, severe GC pressure, and eventually OOM.
> This specific instance of LLAP was running on HDFS on top of an EBS volume 
> backed by SSD. The query that triggered this issue was ANALYZE STATISTICS 
> .. FOR COLUMNS, which also gathers bitvectors: a fast IO, slow processing case.
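A generic sketch of the missing backpressure (not the LlapRecordReader code): a bounded queue makes the fast producer block once the consumer falls behind.

{code}
// Generic backpressure sketch: a bounded queue caps pending data, so a fast
// IO producer blocks on put() instead of growing an unbounded pending list.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public final class BoundedPipeline<T> {
  private final BlockingQueue<T> pending;

  public BoundedPipeline(int maxPending) {
    this.pending = new ArrayBlockingQueue<>(maxPending);
  }

  public void produce(T batch) throws InterruptedException {
    pending.put(batch);    // IO thread blocks when maxPending batches are queued
  }

  public T consume() throws InterruptedException {
    return pending.take(); // processing thread blocks while the queue is empty
  }
}
{code}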



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18326) LLAP Tez scheduler - only preempt tasks if there's a dependency between them

2018-01-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18326:

Attachment: HIVE-18326.02.patch

Same patch. 

> LLAP Tez scheduler - only preempt tasks if there's a dependency between them
> ----------------------------------------------------------------------------
>
> Key: HIVE-18326
> URL: https://issues.apache.org/jira/browse/HIVE-18326
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18326.01.patch, HIVE-18326.02.patch, 
> HIVE-18326.patch
>
>
> It is currently possible for e.g. two sides of a union (or a join for that 
> matter) to have slightly different priorities. We don't want to preempt 
> running tasks on one side in favor of the other side in such cases.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15393) Update Guava version

2018-01-02 Thread slim bouguerra (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308573#comment-16308573
 ] 

slim bouguerra commented on HIVE-15393:
---

[~kgyrtkirk] Thanks. This was working on my laptop and I did not notice that 
it was failing; I will investigate.

> Update Guava version
> 
>
> Key: HIVE-15393
> URL: https://issues.apache.org/jira/browse/HIVE-15393
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15393.2.patch, HIVE-15393.3.patch, 
> HIVE-15393.5.patch, HIVE-15393.6.patch, HIVE-15393.7.patch, 
> HIVE-15393.8.patch, HIVE-15393.9.patch, HIVE-15393.patch
>
>
> The Druid code base uses a newer version of guava (16.0.1) that is not 
> compatible with the current version used by Hive.
> FYI, the Hadoop project is moving to Guava 18; not sure if it is better to 
> move to guava 18 or even 19.
> https://issues.apache.org/jira/browse/HADOOP-10101



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15393) Update Guava version

2018-01-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308552#comment-16308552
 ] 

Zoltan Haindrich commented on HIVE-15393:
-

[~bslim] it seems to me that {{TestDruidRecordWriter#testWrite}} is broken by 
this patch. I suspect that the tests are not being run against the shaded 
jar, and that different guava versions are mixed on the classpath.
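One quick way to confirm which guava actually ends up on the test classpath (a generic JVM diagnostic, not part of the patch):

{code}
// Diagnostic: print the jar that a Guava class was loaded from, to confirm
// whether the shaded jar or a stray Guava is winning on the classpath.
import com.google.common.base.Preconditions;

public final class GuavaOrigin {
  public static void main(String[] args) {
    System.out.println(
        Preconditions.class.getProtectionDomain().getCodeSource().getLocation());
  }
}
{code}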

> Update Guava version
> 
>
> Key: HIVE-15393
> URL: https://issues.apache.org/jira/browse/HIVE-15393
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: Ashutosh Chauhan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-15393.2.patch, HIVE-15393.3.patch, 
> HIVE-15393.5.patch, HIVE-15393.6.patch, HIVE-15393.7.patch, 
> HIVE-15393.8.patch, HIVE-15393.9.patch, HIVE-15393.patch
>
>
> The Druid code base uses a newer version of guava (16.0.1) that is not 
> compatible with the current version used by Hive.
> FYI, the Hadoop project is moving to Guava 18; not sure if it is better to 
> move to guava 18 or even 19.
> https://issues.apache.org/jira/browse/HADOOP-10101



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2018-01-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308547#comment-16308547
 ] 

Sahil Takiar commented on HIVE-17486:
-

[~kellyzly] {quote} what i want to ask is there any possiblity to change 
current structure in the SparkTask in HoS {quote}

What exactly is the advantage of introducing an M -> M edge? I know Tez does it, 
but I think they mainly use it to broadcast data from one set of map tasks to 
another (which is useful for map-joins).

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, HIVE-17486.2.patch, 
> HIVE-17486.3.patch, HIVE-17486.4.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. Optimization will be carried out at the 
> physical level.  Hive on Spark caches the result of a spark work if it is 
> used by more than 1 child spark work. After sharedWorkOptimizer is enabled in 
> the physical plan in HoS, identical table scans are merged into 1 table scan, 
> whose result is used by more than 1 child spark work. Thus the cache 
> mechanism lets us avoid repeating the same computation.
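A minimal Spark sketch of the caching behavior the description relies on (illustrative only; the path is a placeholder): a single cached scan feeding two child computations.

{code}
// Illustration of the cache mechanism: the scan is read once, cached, and
// then consumed by two downstream computations ("child works").
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public final class SharedScanExample {
  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext("local", "shared-scan");
    JavaRDD<String> scan = sc.textFile("/tmp/table").cache(); // placeholder path
    long total = scan.count();                                // first child
    long nonEmpty = scan.filter(s -> !s.isEmpty()).count();   // second child, served from cache
    System.out.println(total + " total, " + nonEmpty + " non-empty");
    sc.stop();
  }
}
{code}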



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2018-01-02 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308537#comment-16308537
 ] 

BELUGA BEHR commented on HIVE-16826:


[~aihuaxu] I certainly tried to keep it the same...

https://github.com/apache/hive/blob/dd2697c00dffe17699f19f8accfbf5c14bd07219/itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18344) Remove LinkedList from SharedWorkOptimizer.java

2018-01-02 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308539#comment-16308539
 ] 

BELUGA BEHR commented on HIVE-18344:


[~pvary] :)

> Remove LinkedList from SharedWorkOptimizer.java
> ---
>
> Key: HIVE-18344
> URL: https://issues.apache.org/jira/browse/HIVE-18344
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Trivial
> Attachments: HIVE-18344.1.patch
>
>
> Prefer {{ArrayList}} over {{LinkedList}} especially in this class because the 
> initial size of the collection is known.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18348) Hive creates 4 different connection pools to metastore of different size

2018-01-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308532#comment-16308532
 ] 

Eugene Koifman commented on HIVE-18348:
---

It does require 2 pools.
connPoolMutex is used to manage locking (as in multiple metastores trying to 
perform the same op) via the RDBMS.
This has to be done in a separate pool; otherwise, if you have many concurrent 
ops, you may end up with the 1st pool maxed out, and all operations get 
blocked because they need another connection to acquire the mutex.  
HIVE-16321 has more details.
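A hedged sketch of the two-pool arrangement described above (pool names and sizes hypothetical):

{code}
// Hypothetical two-pool setup: ordinary txn ops draw from one pool and mutex
// acquisition from a second, so exhausting the first pool can never block
// the extra connection needed to take the mutex.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public final class TxnPools {
  static HikariDataSource newPool(String jdbcUrl, int maxSize, String name) {
    HikariConfig cfg = new HikariConfig();
    cfg.setJdbcUrl(jdbcUrl);
    cfg.setMaximumPoolSize(maxSize);
    cfg.setPoolName(name);
    return new HikariDataSource(cfg);
  }

  public static void main(String[] args) {
    HikariDataSource connPool = newPool(args[0], 10, "txn-ops");        // normal ops
    HikariDataSource connPoolMutex = newPool(args[0], 10, "txn-mutex"); // locking only
    // ... use connPool for operations, connPoolMutex for RDBMS-level locking ...
    connPool.close();
    connPoolMutex.close();
  }
}
{code}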

> Hive creates 4 different connection pools to metastore of different size
> 
>
> Key: HIVE-18348
> URL: https://issues.apache.org/jira/browse/HIVE-18348
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> Enabling debug logging with HikariCP, I can see that Hive creates 4 
> connection pools. {code:title=first connection pool creation stack trace} 
> "main@1" prio=5 tid=0x1 nid=NA runnable java.lang.Thread.State: RUNNABLE at 
> com.zaxxer.hikari.HikariDataSource.(HikariDataSource.java:73) at 
> org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
>  - locked <0x102b> (a org.datanucleus.store.rdbms.ConnectionFactoryImpl) at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.(ConnectionFactoryImpl.java:82)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:330)
>  at 
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:203)
>  at 
> org.datanucleus.store.AbstractStoreManager.(AbstractStoreManager.java:162)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:285)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>  at 
> org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
>  at 
> org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:422)
>  - locked <0x1035> (a org.datanucleus.PersistenceNucleusContextImpl) at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:817)
>  - locked <0x1036> (a org.datanucleus.api.jdo.JDOPersistenceManagerFactory) 
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:334)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:213)
>  - locked <0xeb8> (a java.lang.Class) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at 
> java.security.AccessController.doPrivileged(AccessController.java:-1) at 
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at 
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>  at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at 
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:619) - 
> locked <0x957> (a 

[jira] [Commented] (HIVE-18348) Hive creates 4 different connection pools to metastore of different size

2018-01-02 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308525#comment-16308525
 ] 

Prasanth Jayachandran commented on HIVE-18348:
--

[~ekoifman] Does it require 2 separate pools, or a guaranteed connection for 
the cleaner and other threads? How is having 2 separate pools different from 
a single big pool of 2x the size?

> Hive creates 4 different connection pools to metastore of different size
> 
>
> Key: HIVE-18348
> URL: https://issues.apache.org/jira/browse/HIVE-18348
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> Enabling debug logging with HikariCP, I can see that Hive creates 4 
> connection pools. {code:title=first connection pool creation stack trace} 
> "main@1" prio=5 tid=0x1 nid=NA runnable java.lang.Thread.State: RUNNABLE at 
> com.zaxxer.hikari.HikariDataSource.(HikariDataSource.java:73) at 
> org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
>  - locked <0x102b> (a org.datanucleus.store.rdbms.ConnectionFactoryImpl) at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.(ConnectionFactoryImpl.java:82)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:330)
>  at 
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:203)
>  at 
> org.datanucleus.store.AbstractStoreManager.(AbstractStoreManager.java:162)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:285)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>  at 
> org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
>  at 
> org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:422)
>  - locked <0x1035> (a org.datanucleus.PersistenceNucleusContextImpl) at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:817)
>  - locked <0x1036> (a org.datanucleus.api.jdo.JDOPersistenceManagerFactory) 
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:334)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:213)
>  - locked <0xeb8> (a java.lang.Class) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at 
> java.security.AccessController.doPrivileged(AccessController.java:-1) at 
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at 
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>  at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at 
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:619) - 
> locked <0x957> (a java.lang.Class) at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:662)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:452)
>  at 

[jira] [Commented] (HIVE-18356) Fixing license headers in checkstyle

2018-01-02 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308526#comment-16308526
 ] 

Prasanth Jayachandran commented on HIVE-18356:
--

+1

> Fixing license headers in checkstyle
> 
>
> Key: HIVE-18356
> URL: https://issues.apache.org/jira/browse/HIVE-18356
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-18356.patch
>
>
> The checkstyle header contains the following ASF header:
> {code}
> /**
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
> [..]
> {code}
> Even if we are undecided what to do with the already existing headers 
> (HIVE-17952), the new ones should use the proper one with 1 '*' in the first 
> line:
> {code}
> /*
>   * Licensed to the Apache Software Foundation (ASF) under one
>   * or more contributor license agreements.  See the NOTICE file
> [..]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308519#comment-16308519
 ] 

Hive QA commented on HIVE-17396:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
18s{color} | {color:red} common: The patch generated 1 new + 931 unchanged - 0 
fixed = 932 total (was 931) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
32s{color} | {color:red} ql: The patch generated 7 new + 21 unchanged - 2 fixed 
= 28 total (was 23) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5b0d993 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8400/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8400/yetus/diff-checkstyle-ql.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8400/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support DPP with map joins where the source and target belong in the same 
> stage
> ---
>
> Key: HIVE-17396
> URL: https://issues.apache.org/jira/browse/HIVE-17396
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Attachments: HIVE-17396.1.patch, HIVE-17396.1.patch, 
> HIVE-17396.1.patch
>
>
> When the target of a partition pruning sink operator is not the same as 
> the target of the hash table sink operator, both source and target get scheduled 
> within the same spark job, and that can result in File Not Found Exception.  
> HIVE-17225 has a fix to disable DPP in that scenario.  This JIRA is to 
> support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE 

[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308515#comment-16308515
 ] 

Marcelo Vanzin commented on HIVE-16484:
---

That's correct. Using it only with deploy mode == cluster sounds good too, 
although in that case the user still needs to deploy Spark separately somehow 
if they want to use client mode.

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch HoS --> lower startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
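For reference, a minimal use of the launcher API described above (resource, class, and master are placeholders):

{code}
// Minimal SparkLauncher usage: start an application programmatically and
// poll the returned SparkAppHandle until it reaches a final state.
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public final class LauncherExample {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/app.jar")        // placeholder
        .setMainClass("com.example.DriverMain")    // placeholder
        .setMaster("yarn")
        .setDeployMode("cluster")
        .startApplication();                       // returns immediately

    while (!handle.getState().isFinal()) {         // poll the job state
      Thread.sleep(1000);
    }
    System.out.println("Final state: " + handle.getState());
  }
}
{code}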



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18348) Hive creates 4 different connection pools to metastore of different size

2018-01-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308513#comment-16308513
 ] 

Eugene Koifman commented on HIVE-18348:
---

TxnHandler needs to use 2 pools otherwise it can deadlock.

> Hive creates 4 different connection pools to metastore of different size
> 
>
> Key: HIVE-18348
> URL: https://issues.apache.org/jira/browse/HIVE-18348
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> Enabling debug logging with HikariCP, I can see that Hive creates 4 
> connection pools. {code:title=first connection pool creation stack trace} 
> "main@1" prio=5 tid=0x1 nid=NA runnable java.lang.Thread.State: RUNNABLE at 
> com.zaxxer.hikari.HikariDataSource.(HikariDataSource.java:73) at 
> org.datanucleus.store.rdbms.connectionpool.HikariCPConnectionPoolFactory.createConnectionPool(HikariCPConnectionPoolFactory.java:176)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.generateDataSources(ConnectionFactoryImpl.java:213)
>  at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.initialiseDataSources(ConnectionFactoryImpl.java:117)
>  - locked <0x102b> (a org.datanucleus.store.rdbms.ConnectionFactoryImpl) at 
> org.datanucleus.store.rdbms.ConnectionFactoryImpl.(ConnectionFactoryImpl.java:82)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:330)
>  at 
> org.datanucleus.store.AbstractStoreManager.registerConnectionFactory(AbstractStoreManager.java:203)
>  at 
> org.datanucleus.store.AbstractStoreManager.(AbstractStoreManager.java:162)
>  at 
> org.datanucleus.store.rdbms.RDBMSStoreManager.(RDBMSStoreManager.java:285)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(NativeConstructorAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at 
> org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
>  at 
> org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
>  at 
> org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
>  at 
> org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:422)
>  - locked <0x1035> (a org.datanucleus.PersistenceNucleusContextImpl) at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:817)
>  - locked <0x1036> (a org.datanucleus.api.jdo.JDOPersistenceManagerFactory) 
> at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:334)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:213)
>  - locked <0xeb8> (a java.lang.Class) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> javax.jdo.JDOHelper$16.run(JDOHelper.java:1965) at 
> java.security.AccessController.doPrivileged(AccessController.java:-1) at 
> javax.jdo.JDOHelper.invoke(JDOHelper.java:1960) at 
> javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
>  at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808) at 
> javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701) at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:619) - 
> locked <0x957> (a java.lang.Class) at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:662)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:452)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:389) 
> at 

[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308510#comment-16308510
 ] 

Sahil Takiar commented on HIVE-16484:
-

Thanks [~vanzin]. I think I was confused about Spark's deploy-mode option. To 
clarify, you don't recommend using {{InProcessLauncher}} in client mode (but 
unit tests should be ok), but it should be fine in cluster mode?

I think that should work for HoS; most HoS deployments should be using cluster 
mode already. We could also just change the code so that {{InProcessLauncher}} 
is only used when deploy mode = cluster.

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS application --> lower 
> startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners
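
For reference, a minimal sketch of what launching through that API looks like; the app resource path, main class, and master/deploy-mode values below are placeholders, not HoS's actual wiring:

{code:java}
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LauncherSketch {
  public static void main(String[] args) throws Exception {
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/path/to/driver.jar")     // placeholder path
        .setMainClass("org.example.RemoteDriver")  // placeholder class
        .setMaster("yarn")
        .setDeployMode("cluster")
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            System.out.println("state changed: " + h.getState());
          }

          @Override
          public void infoChanged(SparkAppHandle h) {
            System.out.println("app id: " + h.getAppId());
          }
        });
    // The handle can also be polled directly, e.g. handle.getState(),
    // or used to stop the application via handle.stop().
  }
}
{code}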



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18353) CompactorMR should call jobclient.close() to trigger cleanup

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18353:
--
Component/s: Transactions

> CompactorMR should call jobclient.close() to trigger cleanup
> 
>
> Key: HIVE-18353
> URL: https://issues.apache.org/jira/browse/HIVE-18353
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.2.1
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>
> The HiveMetastore process leaks TrustStore reloader threads when running 
> compaction because JobClient.close() is not called from CompactorMR - see 
> MAPREDUCE-6618 and MAPREDUCE-6621.
> {code}
> "Truststore reloader thread" #2814 daemon prio=1 os_prio=0 
> tid=0x00cdc800 nid=0x2f05a waiting on condition [0x7fdaef403000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:194)
> at java.lang.Thread.run(Thread.java:745)
> {code}
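
A minimal sketch of the proposed fix, assuming the compaction job is submitted through a {{JobClient}} (the surrounding method is illustrative, not the actual CompactorMR code):

{code:java}
import java.io.IOException;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class CompactorCloseSketch {
  public static void runCompactionJob(JobConf jobConf) throws IOException {
    JobClient jobClient = new JobClient(jobConf);
    try {
      // submit the compaction job and wait for completion here
    } finally {
      // Without this, each compaction leaks a "Truststore reloader thread",
      // since close() is what tears down the underlying cluster resources.
      jobClient.close();
    }
  }
}
{code}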



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308504#comment-16308504
 ] 

Eugene Koifman commented on HIVE-18354:
---

It's not clear why stats would cause any changes, but the new output is OK.

+1

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-13000) Hive returns useless parsing error

2018-01-02 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308476#comment-16308476
 ] 

Eugene Koifman commented on HIVE-13000:
---

There needs to be a patch attached that passes the tests. Patch 5 didn't 
apply, so the build system was not able to run the tests.

> Hive returns useless parsing error 
> ---
>
> Key: HIVE-13000
> URL: https://issues.apache.org/jira/browse/HIVE-13000
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 1.0.0, 1.2.1, 2.2.0
>Reporter: Alina Abramova
>Assignee: Alina Abramova
>Priority: Minor
> Attachments: HIVE-13000.1.patch, HIVE-13000.2.patch, 
> HIVE-13000.3.patch, HIVE-13000.4.patch, HIVE-13000.5.patch
>
>
> When I run a query like this, I receive an unclear exception:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException Error in parsing 
> It would be clearer if it were like:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException  Expression not in GROUP BY key record



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16826) Improvements for SeparatedValuesOutputFormat

2018-01-02 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308492#comment-16308492
 ] 

Aihua Xu commented on HIVE-16826:
-

[~belugabehr] Your change is much cleaner. I'm wondering if we have tests to 
cover SeparatedValuesOutputFormat since, from reading the code, I'm not sure 
the old and new versions behave the same.

> Improvements for SeparatedValuesOutputFormat
> 
>
> Key: HIVE-16826
> URL: https://issues.apache.org/jira/browse/HIVE-16826
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-16826.1.patch, HIVE-16826.2.patch
>
>
> Proposing changes to class 
> {{org.apache.hive.beeline.SeparatedValuesOutputFormat}}.
> # Simplify the code
> # Code currently creates and destroys {{CsvListWriter}}, which contains a 
> buffer, for every line printed
> # Use Apache Commons libraries for certain actions
> # Prefer non-synchronized {{StringBuilderWriter}} to Java's synchronized 
> {{StringWriter}}
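
A sketch of that direction, assuming SuperCSV's {{CsvListWriter}} and Commons IO's {{StringBuilderWriter}}; the row source is made up for the example:

{code:java}
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.commons.io.output.StringBuilderWriter;
import org.supercsv.io.CsvListWriter;
import org.supercsv.prefs.CsvPreference;

public class SeparatedValuesSketch {
  public static void main(String[] args) throws IOException {
    List<List<String>> rows = Arrays.asList(
        Arrays.asList("1", "alice"),
        Arrays.asList("2", "bob"));
    // One unsynchronized buffer and one CsvListWriter reused for all rows,
    // instead of creating and destroying a writer (and buffer) per line.
    StringBuilderWriter buffer = new StringBuilderWriter();
    try (CsvListWriter csv =
        new CsvListWriter(buffer, CsvPreference.STANDARD_PREFERENCE)) {
      for (List<String> row : rows) {
        csv.write(row);
      }
    }
    System.out.print(buffer.toString());
  }
}
{code}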



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308484#comment-16308484
 ] 

Hive QA commented on HIVE-18354:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904250/HIVE-18354.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 11542 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[materialized_view_authorization_create_no_grant]
 (batchId=93)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[stats_publisher_error_1]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8399/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8399/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8399/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904250 - PreCommit-HIVE-Build

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308460#comment-16308460
 ] 

Marcelo Vanzin commented on HIVE-16484:
---

{{InProcessLauncher}} works for client mode, but I wouldn't recommend it. It 
will only allow one Spark application at a time, and has some other side 
effects (like the system properties being polluted by the application's 
configuration, and affecting applications launched afterwards).

That should be ok for unit tests, though.

For actual HoS deployments, I don't see a lot of advantages in supporting 
client mode, so you could use that API to launch everything in cluster mode.

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS application --> lower 
> startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18294) add switch to make acid table the default

2018-01-02 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18294:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

committed to master
thanks Alan for the review

> add switch to make acid table the default
> -
>
> Key: HIVE-18294
> URL: https://issues.apache.org/jira/browse/HIVE-18294
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-18294.01.patch, HIVE-18294.03.patch, 
> HIVE-18294.04.patch, HIVE-18294.05.patch
>
>
> it would be convenient for testing to have a switch that enables the behavior 
> where all suitable tables (currently ORC + not sorted) are 
> automatically created with transactional=true, i.e. full acid.
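
A minimal usage sketch, assuming the switch is exposed as a {{hive.create.as.acid}} property (the property name here is an assumption; check the committed patch for the actual name):

{code:sql}
-- Assumed property name; verify against the committed patch.
SET hive.create.as.acid=true;

-- With the switch on, an eligible table (ORC, not sorted) should come out
-- as full acid without an explicit TBLPROPERTIES ('transactional'='true').
CREATE TABLE t (a int) STORED AS ORC;
{code}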



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308417#comment-16308417
 ] 

Hive QA commented on HIVE-18354:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 4ecf2a7 |
| Default Java | 1.8.0_111 |
| modules | C: itests/hive-unit U: itests/hive-unit |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8399/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18255) spark-client jar should be prefixed with hive-

2018-01-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-18255:

Attachment: HIVE-18255.3.patch

> spark-client jar should be prefixed with hive-
> --
>
> Key: HIVE-18255
> URL: https://issues.apache.org/jira/browse/HIVE-18255
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-18255.1.patch, HIVE-18255.2.patch, 
> HIVE-18255.3.patch
>
>
> Other Hive jars are prefixed with "hive-", except for the spark-client jar. 
> Fix this to make the jar name consistent across all Hive jars.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16484:

Status: Open  (was: Patch Available)

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-16484.1.patch, HIVE-16484.2.patch, 
> HIVE-16484.3.patch, HIVE-16484.4.patch, HIVE-16484.5.patch, 
> HIVE-16484.6.patch, HIVE-16484.7.patch
>
>
> The {{SparkClientImpl#startDriver}} currently looks for the {{SPARK_HOME}} 
> directory and invokes the {{bin/spark-submit}} script, which spawns a 
> separate process to run the Spark application.
> {{SparkLauncher}} was added in SPARK-4924 and is a programmatic way to launch 
> Spark applications.
> I see a few advantages:
> * No need to spawn a separate process to launch a HoS application --> lower 
> startup time
> * Simplifies the code in {{SparkClientImpl}} --> easier to debug
> * {{SparkLauncher#startApplication}} returns a {{SparkAppHandle}} which 
> contains some useful utilities for querying the state of the Spark job
> ** It also allows the launcher to specify a list of job listeners



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18238) Driver execution may not have configuration changing sideeffects

2018-01-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18238:

Attachment: HIVE-18238.02.patch

#2) set {{hive.query.id}} and {{hive.query.string}} at the session level to 
keep everything working
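
A rough sketch of that idea (names and the surrounding method are illustrative, not the exact patch): the per-query values go into the session-level conf rather than the driver's own {{HiveConf}}, so a driver run leaves no configuration side effects behind.

{code:java}
import org.apache.hadoop.hive.ql.session.SessionState;

public class SessionConfSketch {
  public static void recordQuery(String queryId, String queryString) {
    SessionState ss = SessionState.get();  // assumes a started session
    ss.getConf().set("hive.query.id", queryId);
    ss.getConf().set("hive.query.string", queryString);
  }
}
{code}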

> Driver execution may not have configuration changing sideeffects 
> -
>
> Key: HIVE-18238
> URL: https://issues.apache.org/jira/browse/HIVE-18238
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18238.01wip01.patch, HIVE-18238.02.patch
>
>
> {{Driver}} executes SQL statements which use "hiveconf" settings,
> but the {{Driver}} itself may *not* change the configuration...
> I've found an example which shows how hazardous this is:
> {code}
> set hive.mapred.mode=strict;
> select "${hiveconf:hive.mapred.mode}";
> create table t (a int);
> analyze table t compute statistics;
> select "${hiveconf:hive.mapred.mode}";
> {code}
> Currently, the last select returns {{nonstrict}} because of 
> [this|https://github.com/apache/hive/blob/7ddd915bf82a68c8ab73b0c4ca409f1a6d43d227/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L1696]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18255) spark-client jar should be prefixed with hive-

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308398#comment-16308398
 ] 

Hive QA commented on HIVE-18255:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904246/HIVE-18255.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8397/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8397/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8397/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:24:14.610
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8397/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:24:14.613
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   64229b9..4ecf2a7  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 64229b9 HIVE-15393: addendum: querystring hashcode changes
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 4ecf2a7 HIVE-18149: addendum; update missed TestAcidOnTez
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:24:19.869
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-8397/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/spark-client/pom.xml: does not exist in index
Going to apply patch with: git apply -p1
+ [[ maven == \m\a\v\e\n ]]
+ rm -rf /data/hiveptest/working/maven/org/apache/hive
+ mvn -B clean install -DskipTests -T 4 -q 
-Dmaven.repo.local=/data/hiveptest/working/maven
protoc-jar: protoc version: 250, detected platform: linux/amd64
protoc-jar: executing: [/tmp/protoc1817701067133710957.exe, 
-I/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore,
 
--java_out=/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources,
 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/protobuf/org/apache/hadoop/hive/metastore/metastore.proto]
ANTLR Parser Generator  Version 3.5.2
Output file 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/target/generated-sources/org/apache/hadoop/hive/metastore/parser/FilterParser.java
 does not exist: must build 
/data/hiveptest/working/apache-github-source-source/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g
org/apache/hadoop/hive/metastore/parser/Filter.g
log4j:WARN No appenders could be found for logger (DataNucleus.General).
log4j:WARN Please initialize the log4j system properly.
DataNucleus Enhancer (version 4.1.17) for API "JDO"
DataNucleus Enhancer : Classpath
>>  /usr/share/maven/boot/plexus-classworlds-2.x.jar
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo
ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder
ENHANCED (Persistable) : 

[jira] [Commented] (HIVE-16484) Investigate SparkLauncher for HoS as alternative to bin/spark-submit

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308400#comment-16308400
 ] 

Hive QA commented on HIVE-16484:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12865595/HIVE-16484.7.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8398/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8398/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8398/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:27:43.925
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-8398/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:27:43.929
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4ecf2a7 HIVE-18149: addendum; update missed TestAcidOnTez
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4ecf2a7 HIVE-18149: addendum; update missed TestAcidOnTez
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-01-02 17:27:44.476
+ rm -rf ../yetus
+ mkdir ../yetus
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-8398/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: 
a/spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java: 
does not exist in index
error: 
a/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java:
 does not exist in index
error: 
a/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java: 
does not exist in index
error: 
a/spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java: 
does not exist in index
error: patch failed: 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java:75
Falling back to three-way merge...
Applied patch to 
'spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java'
 with conflicts.
error: patch failed: 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java:24
Falling back to three-way merge...
Applied patch to 
'spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java' 
with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java:75
Falling back to three-way merge...
Applied patch to 
'spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java'
 with conflicts.
error: patch failed: 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java:24
Falling back to three-way merge...
Applied patch to 
'spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java' 
with conflicts.
U 
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java
U spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12865595 - PreCommit-HIVE-Build

> Investigate SparkLauncher for HoS as alternative to bin/spark-submit
> 
>
> Key: HIVE-16484
> URL: https://issues.apache.org/jira/browse/HIVE-16484
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> 

[jira] [Commented] (HIVE-17896) TopNKey: Create a standalone vectorizable TopNKey operator

2018-01-02 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308397#comment-16308397
 ] 

Hive QA commented on HIVE-17896:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904230/HIVE-17896.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11546 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_ppd] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_reduce_groupby_duplicate_cols]
 (batchId=158)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
 (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=120)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=208)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
 (batchId=213)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnTez (batchId=222)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8396/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8396/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8396/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12904230 - PreCommit-HIVE-Build

> TopNKey: Create a standalone vectorizable TopNKey operator
> --
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
>  Issue Type: New Feature
>  Components: Operators
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Teddy Choi
> Attachments: HIVE-17896.1.patch, HIVE-17896.3.patch, 
> HIVE-17896.4.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the 
> group-by operator buffers up all the rows before discarding 99% of the 
> rows in the TopN hash within the ReduceSink operator.
> The RS TopN operator is very restrictive, as it only supports doing the 
> filtering on the shuffle keys, but it is better to do this before breaking 
> the vectors into rows and losing the isRepeating properties.
> Adding a TopN Key operator in the physical operator tree allows the following 
> to happen:
> GBY->RS(Top=1)
> can become 
> TNK(1)->GBY->RS(Top=1)
> so that the TopNKey can remove rows before they are buffered into the GBY 
> and consume memory.
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the 
> GBY is on keys "a,b,c" and the TopNKey is on just "a".
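
As an illustration (this query is not from the patch), consider a query shape where only one group needs to survive:

{code:sql}
-- The TopN hash in RS keeps only the single smallest group key, so a
-- TNK(1) placed ahead of the GBY could drop rows before they are buffered
-- and aggregated, instead of after.
SELECT a, count(*)
FROM t
GROUP BY a
ORDER BY a
LIMIT 1;
{code}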



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17396) Support DPP with map joins where the source and target belong in the same stage

2018-01-02 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17396:
---
Attachment: HIVE-17396.1.patch

> Support DPP with map joins where the source and target belong in the same 
> stage
> ---
>
> Key: HIVE-17396
> URL: https://issues.apache.org/jira/browse/HIVE-17396
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Attachments: HIVE-17396.1.patch, HIVE-17396.1.patch, 
> HIVE-17396.1.patch
>
>
> When the target of a partition pruning sink operator is not the same as 
> the target of the hash table sink operator, both source and target get 
> scheduled within the same Spark job, and that can result in a File Not Found 
> Exception.  HIVE-17225 has a fix to disable DPP in that scenario.  This JIRA 
> is to support DPP for such cases.
> Test Case:
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.auto.convert.join=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE part_table1 (col int) PARTITIONED BY (part1_col int);
> CREATE TABLE part_table2 (col int) PARTITIONED BY (part2_col int);
> CREATE TABLE reg_table (col int);
> ALTER TABLE part_table1 ADD PARTITION (part1_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 1);
> ALTER TABLE part_table2 ADD PARTITION (part2_col = 2);
> INSERT INTO TABLE part_table1 PARTITION (part1_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 1) VALUES (1);
> INSERT INTO TABLE part_table2 PARTITION (part2_col = 2) VALUES (2);
> INSERT INTO table reg_table VALUES (1), (2), (3), (4), (5), (6);
> EXPLAIN SELECT *
> FROM   part_table1 pt1,
>part_table2 pt2,
>reg_table rt
> WHERE  rt.col = pt1.part1_col
> ANDpt2.part2_col = pt1.part1_col;
> Plan:
> STAGE DEPENDENCIES:
>   Stage-2 is a root stage
>   Stage-1 depends on stages: Stage-2
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-2
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: pt1
>   Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: col (type: int), part1_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
>   keys:
> 0 _col1 (type: int)
> 1 _col1 (type: int)
> 2 _col0 (type: int)
> Select Operator
>   expressions: _col1 (type: int)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> keys: _col0 (type: int)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
> Spark Partition Pruning Sink Operator
>   Target column: part2_col (int)
>   partition key expr: part2_col
>   Statistics: Num rows: 1 Data size: 1 Basic stats: 
> COMPLETE Column stats: NONE
>   target work: Map 2
> Local Work:
>   Map Reduce Local Work
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: pt2
>   Statistics: Num rows: 2 Data size: 2 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: col (type: int), part2_col (type: int)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 2 Data size: 2 Basic stats: 
> COMPLETE Column stats: NONE
> Spark HashTable Sink Operator
>   keys:
> 0 _col1 (type: int)
> 1 _col1 (type: int)
> 2 _col0 (type: int)
> Local Work:
>   Map Reduce Local Work
>   Stage: Stage-1
> Spark
>  A masked pattern was here 
>   Vertices:
> Map 3 
> Map Operator Tree:
> TableScan
>   alias: rt
>   Statistics: Num rows: 6 Data size: 6 Basic stats: COMPLETE 
> Column stats: NONE
>   Filter 

[jira] [Updated] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18354:

Status: Patch Available  (was: Open)

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308379#comment-16308379
 ] 

Zoltan Haindrich commented on HIVE-18354:
-

[~ekoifman] could you please take a look?  

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18354:

Description: stats autogather is disabled in this test because it caused 
some trouble  (was: it seems like this test has been broken by HIVE-18149)

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> stats autogather is disabled in this test because it caused some trouble



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18354) Fix test TestAcidOnTez

2018-01-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18354:

Attachment: HIVE-18354.01.patch

#1) remove disabling of autogather; update results in test

> Fix test TestAcidOnTez 
> ---
>
> Key: HIVE-18354
> URL: https://issues.apache.org/jira/browse/HIVE-18354
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18354.01.patch
>
>
> it seems like this test has been broken by HIVE-18149



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18311) Enable smb_mapjoin_8.q for cli driver

2018-01-02 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308358#comment-16308358
 ] 

Sahil Takiar commented on HIVE-18311:
-

+1

> Enable smb_mapjoin_8.q for cli driver
> -
>
> Key: HIVE-18311
> URL: https://issues.apache.org/jira/browse/HIVE-18311
> Project: Hive
>  Issue Type: Bug
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-18311.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2018-01-02 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-18149:

Labels:   (was: TODOC3.0)

Thank you [~leftylev], I've updated the docs.

> Stats: rownum estimation from datasize underestimates in most cases
> ---
>
> Key: HIVE-18149
> URL: https://issues.apache.org/jira/browse/HIVE-18149
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-18149.01.patch, HIVE-18149.01wip01.patch, 
> HIVE-18149.02.patch, HIVE-18149.03.patch, HIVE-18149.03wip01.patch, 
> HIVE-18149.03wip02.patch
>
>
> rownum estimation is currently based on the following:
> * the data size is taken from the following sources:
> ** basicstats aggregates the loaded "on-heap" row sizes; other readers are 
> able to give a "raw size" estimation - I've checked ORC, but I'm sure others 
> do the same... API docs are a bit vague about the method's purpose...
> ** if the basicstats-level info is not available, the filesystem-level 
> "file-size-sums" are used as the "raw data size", which is multiplied by the 
> [deserialization 
> ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261],
>  which is currently 1.
> The problem with all of this is that the deser factor is 1, and that the 
> row size counts in the on-heap object headers...
> Example: 20 rows are loaded into a partition in 
> [columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7];
>  after HIVE-18108, [this 
> explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
>  will estimate the row size of the table to be 404 bytes; however, the 20 
> rows of text are only 169 bytes... so it ends up with 0 rows...
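
Roughly, the row-count estimate then reduces to an integer division (numbers taken from the example above; the exact formula should be checked against StatsUtils):

{noformat}
estimated rows = data size / estimated row size = 169 / 404 = 0
{noformat}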



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18149) Stats: rownum estimation from datasize underestimates in most cases

2018-01-02 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308351#comment-16308351
 ] 

Zoltan Haindrich commented on HIVE-18149:
-

I've added some addenda... I missed TestAcidOnTez - fortunately it already 
had noconditionalthreshold set.

> Stats: rownum estimation from datasize underestimates in most cases
> ---
>
> Key: HIVE-18149
> URL: https://issues.apache.org/jira/browse/HIVE-18149
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-18149.01.patch, HIVE-18149.01wip01.patch, 
> HIVE-18149.02.patch, HIVE-18149.03.patch, HIVE-18149.03wip01.patch, 
> HIVE-18149.03wip02.patch
>
>
> rownum estimation is currently based on the following:
> * the data size is taken from the following sources:
> ** basicstats aggregates the loaded "on-heap" row sizes; other readers are 
> able to give a "raw size" estimation - I've checked ORC, but I'm sure others 
> do the same... API docs are a bit vague about the method's purpose...
> ** if the basicstats-level info is not available, the filesystem-level 
> "file-size-sums" are used as the "raw data size", which is multiplied by the 
> [deserialization 
> ratio|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L261],
>  which is currently 1.
> The problem with all of this is that the deser factor is 1, and that the 
> row size counts in the on-heap object headers...
> Example: 20 rows are loaded into a partition in 
> [columnstats_partlvl_dp.q|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L7];
>  after HIVE-18108, [this 
> explain|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/test/queries/clientpositive/columnstats_partlvl_dp.q#L25]
>  will estimate the row size of the table to be 404 bytes; however, the 20 
> rows of text are only 169 bytes... so it ends up with 0 rows...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

