date:20170222

[jira] [Updated] (HIVE-15959) LLAP: fix headroom calculation and move it to daemon

2017-02-22 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15959:
--
Labels: TODOC2.2  (was: )

> LLAP: fix headroom calculation and move it to daemon
> 
>
> Key: HIVE-15959
> URL: https://issues.apache.org/jira/browse/HIVE-15959
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.2
> Attachments: HIVE-15959.01.patch, HIVE-15959.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15959) LLAP: fix headroom calculation and move it to daemon

2017-02-22 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877749#comment-15877749
 ] 

Lefty Leverenz commented on HIVE-15959:
---

Doc note:  This adds *hive.llap.daemon.xmx.headroom* to HiveConf.java and 
removes *hive.llap.daemon.headroom.memory.per.instance.mb*, which was created 
by HIVE-15347 (also in release 2.2.0).  So *hive.llap.daemon.xmx.headroom* 
should be documented in the wiki and 
*hive.llap.daemon.headroom.memory.per.instance.mb* should not be documented.

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC2.2 label.

Also:  [~sershe], this issue needs a fix version.

> LLAP: fix headroom calculation and move it to daemon
> 
>
> Key: HIVE-15959
> URL: https://issues.apache.org/jira/browse/HIVE-15959
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.2
> Attachments: HIVE-15959.01.patch, HIVE-15959.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Comment Edited] (HIVE-15347) LLAP: Executor memory and Xmx should have some headroom for other services

2017-02-22 Thread Lefty Leverenz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753816#comment-15753816
 ] 

Lefty Leverenz edited comment on HIVE-15347 at 2/22/17 8:18 AM:


Doc note:  This adds *hive.llap.daemon.headroom.memory.per.instance.mb* to 
HiveConf.java and changes the default of 
*hive.llap.daemon.memory.per.instance.mb* back to 4096 (after HIVE-15159 
changed it to 3276, also in 2.2.0).

Only *hive.llap.daemon.headroom.memory.per.instance.mb* needs to be documented 
in the wiki.

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC2.2 label.

Edit (22/Feb/17):  HIVE-15959 removes 
*hive.llap.daemon.headroom.memory.per.instance.mb* in 2.2.0, so it doesn't need 
to be documented after all.  Removing the TODOC2.2 label.


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.llap.daemon.headroom.memory.per.instance.mb* to 
HiveConf.java and changes the default of 
*hive.llap.daemon.memory.per.instance.mb* back to 4096 (after HIVE-15159 
changed it to 3276, also in 2.2.0).

Only *hive.llap.daemon.headroom.memory.per.instance.mb* needs to be documented 
in the wiki.

* [Configuration Properties -- LLAP | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-LLAP]

Added a TODOC2.2 label.

> LLAP: Executor memory and Xmx should have some headroom for other services
> --
>
> Key: HIVE-15347
> URL: https://issues.apache.org/jira/browse/HIVE-15347
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15347.1.patch
>
>
> If executor memory + cache memory is configured close or equal to Xmx, the 
> task attempts that is causing OOM can take down the LLAP daemon. Provide some 
> leeway for other services during memory crunch. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15347) LLAP: Executor memory and Xmx should have some headroom for other services

2017-02-22 Thread Lefty Leverenz (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15347:
--
Labels:   (was: TODOC2.2)

> LLAP: Executor memory and Xmx should have some headroom for other services
> --
>
> Key: HIVE-15347
> URL: https://issues.apache.org/jira/browse/HIVE-15347
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15347.1.patch
>
>
> If executor memory + cache memory is configured close or equal to Xmx, the 
> task attempts that is causing OOM can take down the LLAP daemon. Provide some 
> leeway for other services during memory crunch. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15958) LLAP: IPC connections are not being reused for umbilical protocol

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877758#comment-15877758
 ] 

Hive QA commented on HIVE-15958:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853868/HIVE-15958.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10252 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3687/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3687/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3687/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853868 - PreCommit-HIVE-Build

> LLAP: IPC connections are not being reused for umbilical protocol
> -
>
> Key: HIVE-15958
> URL: https://issues.apache.org/jira/browse/HIVE-15958
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-15958.1.patch, HIVE-15958.2.patch
>
>
> During concurrency testing, observed 1000s of ipc thread creations. Ideally, 
> the connections to same hosts should be reused.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Work started] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15996 started by Jesus Camacho Rodriguez.
--
> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877783#comment-15877783
 ] 

Jesus Camacho Rodriguez commented on HIVE-15996:


[~julianhyde], I think in Oracle this is similar to GROUPING_ID function.

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15996.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15996:
---
Attachment: HIVE-15996.patch

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15996.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877784#comment-15877784
 ] 

Jesus Camacho Rodriguez commented on HIVE-15996:


Patch attached; it needs HIVE-15994 to go in first.

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15996.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877827#comment-15877827
 ] 

Hive QA commented on HIVE-15955:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853871/HIVE-15955.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10253 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union] 
(batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3688/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3688/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3688/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853871 - PreCommit-HIVE-Build

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15947) Enhance Templeton service job operations reliability

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877905#comment-15877905
 ] 

Hive QA commented on HIVE-15947:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853874/HIVE-15947.2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10263 tests 
executed
*Failed tests:*
{noformat}
TestConcurrentJobRequestsBase - did not produce a TEST-*.xml file (likely timed 
out) (batchId=171)
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery 
(batchId=217)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3689/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3689/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3689/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853874 - PreCommit-HIVE-Build

> Enhance Templeton service job operations reliability
> 
>
> Key: HIVE-15947
> URL: https://issues.apache.org/jira/browse/HIVE-15947
> Project: Hive
>  Issue Type: Bug
>Reporter: Subramanyam Pattipaka
>Assignee: Subramanyam Pattipaka
> Attachments: HIVE-15947.2.patch, HIVE-15947.patch
>
>
> Currently Templeton service doesn't restrict number of job operation 
> requests. It simply accepts and tries to run all operations. If more number 
> of concurrent job submit requests comes then the time to submit job 
> operations can increase significantly. Templetonused hdfs to store staging 
> file for job. If HDFS storage can't respond to large number of requests and 
> throttles then the job submission can take very large times in order of 
> minutes.
> This behavior may not be suitable for all applications and client 
> applications  may be looking for predictable and low response for successful 
> request or send throttle response to client to wait for some time before 
> re-requesting job operation.
> In this JIRA, I am trying to address following job operations 
> 1) Submit new Job
> 2) Get Job Status
> 3) List jobs
> These three operations has different complexity due to variance in use of 
> cluster resources like YARN/HDFS.
> The idea is to introduce a new config templeton.job.submit.exec.max-procs 
> which controls maximum number of concurrent active job submissions within 
> Templeton and use this config to control better response times. If a new job 
> submission request sees that there are already 
> templeton.job.submit.exec.max-procs jobs getting submitted concurrently then 
> the request will fail with Http error 503 with reason 
>“Too many concurrent job submission requests received. Please wait for 
> some time before retrying.”
>  
> The client is expected to catch this response and retry after waiting for 
> some time. The default value for the config 
> templeton.job.submit.exec.max-procs is set to ‘0’. This means by default job 
> submission requests are always accepted. The behavior needs to be enabled 
> based on requirements.
> We can have similar behavior for Status and List operations with configs 
> templeton.job.status.exec.max-procs and templeton.list.job.exec.max-procs 
> respectively.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15928) Parallelization of Select queries in Druid handler

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15928:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

> Parallelization of Select queries in Druid handler
> --
>
> Key: HIVE-15928
> URL: https://issues.apache.org/jira/browse/HIVE-15928
> Project: Hive
>  Issue Type: Sub-task
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Fix For: 2.2.0
>
> Attachments: HIVE-15928.01.patch, HIVE-15928.02.patch, 
> HIVE-15928.patch
>
>
> Even if we split a Select query along its time dimension, parallelization is 
> limited as all queries will hit the broker node. Instead, we can interrogate 
> the broker to get the Druid nodes that contain the data, and query those 
> nodes directly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15990) Always initialize connection properties in DruidSerDe

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15990:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]!

> Always initialize connection properties in DruidSerDe
> -
>
> Key: HIVE-15990
> URL: https://issues.apache.org/jira/browse/HIVE-15990
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15990.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword

2017-02-22 Thread Sushanth Sowmyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15668:

Status: Open  (was: Patch Available)

> change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
> -
>
> Key: HIVE-15668
> URL: https://issues.apache.org/jira/browse/HIVE-15668
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15668.patch
>
>
> Currently, REPL DUMP syntax goes:
> {noformat}
> REPL DUMP [[.]] [FROM  [BATCH ]]
> {noformat}
> The BATCH directive says that when doing an event dump, to not dump out more 
> than _batchSize_ number of events. However, there is a clearer keyword for 
> the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows 
> makes it clearer:
> {noformat}
> REPL DUMP [[.]] [FROM  [LIMIT ]]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword

2017-02-22 Thread Sushanth Sowmyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15668:

Attachment: HIVE-15668.2.patch

Updated patch attached, with test updated.

> change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
> -
>
> Key: HIVE-15668
> URL: https://issues.apache.org/jira/browse/HIVE-15668
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15668.2.patch, HIVE-15668.patch
>
>
> Currently, REPL DUMP syntax goes:
> {noformat}
> REPL DUMP [[.]] [FROM  [BATCH ]]
> {noformat}
> The BATCH directive says that when doing an event dump, to not dump out more 
> than _batchSize_ number of events. However, there is a clearer keyword for 
> the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows 
> makes it clearer:
> {noformat}
> REPL DUMP [[.]] [FROM  [LIMIT ]]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword

2017-02-22 Thread Sushanth Sowmyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-15668:

Status: Patch Available  (was: Open)

> change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
> -
>
> Key: HIVE-15668
> URL: https://issues.apache.org/jira/browse/HIVE-15668
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15668.2.patch, HIVE-15668.patch
>
>
> Currently, REPL DUMP syntax goes:
> {noformat}
> REPL DUMP [[.]] [FROM  [BATCH ]]
> {noformat}
> The BATCH directive says that when doing an event dump, to not dump out more 
> than _batchSize_ number of events. However, there is a clearer keyword for 
> the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows 
> makes it clearer:
> {noformat}
> REPL DUMP [[.]] [FROM  [LIMIT ]]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878001#comment-15878001
 ] 

Hive QA commented on HIVE-16004:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853897/HIVE-16004.002.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10252 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3690/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3690/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3690/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853897 - PreCommit-HIVE-Build

> OutOfMemory in SparkReduceRecordHandler with vectorization mode
> ---
>
> Key: HIVE-16004
> URL: https://issues.apache.org/jira/browse/HIVE-16004
> Project: Hive
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch
>
>
> For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. 
> Get the following exception:
> java.lang.OutOfMemoryError
>   at 
> java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:85)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> I think DataOutputBuffer isn't cleared on time cause this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15859) Hive client side shows Spark Driver disconnected while Spark Driver side could not get RPC header

2017-02-22 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878013#comment-15878013
 ] 

Rui Li commented on HIVE-15859:
---

Another way to fix (and also mentioned in the Livy PR) is to combine the 
message header with the payload. I think can have some class like
{code}
class RpcMessage {
  MessageHeader header;
  Object payload;
}
{code}
so that we can send/receive the header and payload as a whole. It may be a more 
thorough way to avoid potential race conditions.
[~xuefuz], [~vanzin] what do you think about it?

> Hive client side shows Spark Driver disconnected while Spark Driver side 
> could not get RPC header 
> --
>
> Key: HIVE-15859
> URL: https://issues.apache.org/jira/browse/HIVE-15859
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Spark
>Affects Versions: 2.2.0
> Environment: hadoop2.7.1
> spark1.6.2
> hive2.2
>Reporter: KaiXu
>Assignee: Rui Li
> Attachments: HIVE-15859.1.patch, HIVE-15859.2.patch
>
>
> Hive on Spark, failed with error:
> {noformat}
> 2017-02-08 09:50:59,331 Stage-2_0: 1039(+2)/1041 Stage-3_0: 796(+456)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:00,335 Stage-2_0: 1040(+1)/1041 Stage-3_0: 914(+398)/1520 
> Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> 2017-02-08 09:51:01,338 Stage-2_0: 1041/1041 Finished Stage-3_0: 
> 961(+383)/1520 Stage-4_0: 0/2021 Stage-5_0: 0/1009 Stage-6_0: 0/1
> Failed to monitor Job[ 2] with exception 'java.lang.IllegalStateException(RPC 
> channel is closed.)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> {noformat}
> application log shows the driver commanded a shutdown with some unknown 
> reason, but hive's log shows Driver could not get RPC header( Expected RPC 
> header, got org.apache.hive.spark.client.rpc.Rpc$NullMessage instead).
> {noformat}
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1169.0 in 
> stage 3.0 (TID 2519)
> 17/02/08 09:51:04 INFO executor.CoarseGrainedExecutorBackend: Driver 
> commanded a shutdown
> 17/02/08 09:51:04 INFO storage.MemoryStore: MemoryStore cleared
> 17/02/08 09:51:04 INFO storage.BlockManager: BlockManager stopped
> 17/02/08 09:51:04 INFO exec.Utilities: PLAN PATH = 
> hdfs://hsx-node1:8020/tmp/hive/root/b723c85d-2a7b-469e-bab1-9c165b25e656/hive_2017-02-08_09-49-37_890_6267025825539539056-1/-mr-10006/71a9dacb-a463-40ef-9e86-78d3b8e3738d/map.xml
> 17/02/08 09:51:04 WARN executor.CoarseGrainedExecutorBackend: An unknown 
> (hsx-node1:42777) driver disconnected.
> 17/02/08 09:51:04 ERROR executor.CoarseGrainedExecutorBackend: Driver 
> 192.168.1.1:42777 disassociated! Shutting down.
> 17/02/08 09:51:04 INFO executor.Executor: Executor killed task 1105.0 in 
> stage 3.0 (TID 2511)
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Shutdown hook called
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Shutting down remote daemon.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk6/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-71da1dfc-99bd-4687-bc2f-33452db8de3d
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk2/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-7f134d81-e77e-4b92-bd99-0a51d0962c14
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk5/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-77a90d63-fb05-4bc6-8d5e-1562cc502e6c
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remote daemon shut down; proceeding with flushing remote transports.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk4/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-91f8b91a-114d-4340-8560-d3cd085c1cd4
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk1/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-a3c24f9e-8609-48f0-9d37-0de7ae06682a
> 17/02/08 09:51:04 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
> Remoting shut down.
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk7/yarn/nm/usercache/root/appcache/application_1486453422616_0150/spark-f6120a43-2158-4780-927c-c5786b78f53e
> 17/02/08 09:51:04 INFO util.ShutdownHookManager: Deleting directory 
> /mnt/disk3/yarn/nm/usercache/root/appcache/application_1486453422616_0150/

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Attachment: HIVE-13780.1.patch

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878071#comment-15878071
 ] 

Hive QA commented on HIVE-16004:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853897/HIVE-16004.002.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10252 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct]
 (batchId=106)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3691/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3691/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3691/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853897 - PreCommit-HIVE-Build

> OutOfMemory in SparkReduceRecordHandler with vectorization mode
> ---
>
> Key: HIVE-16004
> URL: https://issues.apache.org/jira/browse/HIVE-16004
> Project: Hive
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch
>
>
> For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. 
> Get the following exception:
> java.lang.OutOfMemoryError
>   at 
> java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:85)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> I think DataOutputBuffer isn't cleared on time cause this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
  Labels: patch  (was: )
Assignee: muxin  (was: Vihang Karajgaonkar)
  Status: Patch Available  (was: Open)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 1.2.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Status: Open  (was: Patch Available)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.1.1, 1.2.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Work started] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15820 started by muxin.

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Attachment: beeline-log4j.properties

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: beeline-log4j.properties
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Work stopped] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-15820 stopped by muxin.

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
> Attachments: beeline-log4j.properties
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Attachment: (was: beeline-log4j.properties)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: muxin
>  Labels: patch
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Assigned] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin reassigned HIVE-15820:


Assignee: Vihang Karajgaonkar  (was: muxin)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Attachment: beelinebug_1.patch

our solution to this issue

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: beelinebug_1.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Issue Comment Deleted] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Comment: was deleted

(was: our solution to this issue)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: beelinebug_1.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Attachment: HIVE-15820.patch

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878105#comment-15878105
 ] 

muxin commented on HIVE-15820:
--

sorry Vihang Karajgaonkar, I'm still not familiar with this.  I just attached a 
patch, please review it and inform me if there is anything inappropriate

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15820:
-
Attachment: (was: beelinebug_1.patch)

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15820) comment at the head of beeline -e

2017-02-22 Thread muxin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878108#comment-15878108
 ] 

muxin commented on HIVE-15820:
--

https://issues.apache.org/jira/browse/HIVE-11100
 is a issue we met before this one, patch it first would be better

> comment at the head of beeline -e
> -
>
> Key: HIVE-15820
> URL: https://issues.apache.org/jira/browse/HIVE-15820
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
>  Labels: patch
> Attachments: HIVE-15820.patch
>
>
> $ beeline -u jdbc:hive2://localhost:1 -n test -e "
> > --asdfasdfasdfasdf
> > select * from test_table;
> > "
> expected result of the above command should be all rows of test_table(same as 
> run in beeline interactive mode),but it does not output anything.
> the cause is that -e option will read commands as one string, and in method 
> dispatch(String line) it calls function isComment(String line) in the first, 
> which using
>  'lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--")' 
> to regard commands as a comment.
> two ways can be considered to fix this problem:
> 1. in method initArgs(String[] args), split command by '\n' into command list 
> before dispatch when cl.getOptionValues('e') != null
> 2. in method dispatch(String line), remove comments using this:
> static String removeComments(String line) {
> if (line == null || line.isEmpty()) {
> return line;
> }
> StringBuilder builder = new StringBuilder();
> int escape = -1;
> for (int index = 0; index < line.length(); index++) {
> if (index < line.length() - 1 && line.charAt(index) == 
> line.charAt(index + 1)) {
> if (escape == -1 && line.charAt(index) == '-') {
> //find \n as the end of comment
> index = line.indexOf('\n',index+1);
> //there is no sql after this comment,so just break out
> if (-1==index){
> break;
> }
> }
> }
> char letter = line.charAt(index);
> if (letter == escape) {
> escape = -1; // Turn escape off.
> } else if (escape == -1 && (letter == '\'' || letter == '"')) {
> escape = letter; // Turn escape on.
> }
> builder.append(letter);
> }
> return builder.toString();
>   }
> the second way can be a general solution to remove all comments start with 
> '--'  in a sql



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15822) beeline ignore all sql under comment after semicolon

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15822:
-
Attachment: HIVE-15822.patch

> beeline ignore all sql under comment after semicolon
> 
>
> Key: HIVE-15822
> URL: https://issues.apache.org/jira/browse/HIVE-15822
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15822.patch
>
>
> way to reproduce this error:
> beeline -u jdbc:hive2://localhost:1 -n test -e "
> show databases;--some comment here
> show tables;"
> it will only execute 'show databases', and consider 
> '--some comment here
> show tables;'
> as comment( all sql under the first comment appeared after semicolon).
> when the comment is also end with semicolon, the result will be right.
> the root cause is that beeline will only consider a entire command is inputed 
> when a line is end with semicolon, otherwise if this line is not started with 
> '--' or '#' beeline will combine it with next line until meet semicolon in 
> the end. so actually the comment above is not removed(which causes the 
> error). then beeline split the entire line by ';', so 'show databases' is 
> recognized and executed,
> '--some comment here\n show tables' is considered a comment and discarded.
> my solution is to just remove comment before split by ';', the code can refer 
> to solution 2 for similar issue 
> :https://issues.apache.org/jira/browse/HIVE-15820



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15822) beeline ignore all sql under comment after semicolon

2017-02-22 Thread muxin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878112#comment-15878112
 ] 

muxin commented on HIVE-15822:
--

this patch depends on patch for 
https://issues.apache.org/jira/browse/HIVE-15820 , please review the code there 
first

> beeline ignore all sql under comment after semicolon
> 
>
> Key: HIVE-15822
> URL: https://issues.apache.org/jira/browse/HIVE-15822
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-15822.patch
>
>
> way to reproduce this error:
> beeline -u jdbc:hive2://localhost:1 -n test -e "
> show databases;--some comment here
> show tables;"
> it will only execute 'show databases', and consider 
> '--some comment here
> show tables;'
> as comment( all sql under the first comment appeared after semicolon).
> when the comment is also end with semicolon, the result will be right.
> the root cause is that beeline will only consider a entire command is inputed 
> when a line is end with semicolon, otherwise if this line is not started with 
> '--' or '#' beeline will combine it with next line until meet semicolon in 
> the end. so actually the comment above is not removed(which causes the 
> error). then beeline split the entire line by ';', so 'show databases' is 
> recognized and executed,
> '--some comment here\n show tables' is considered a comment and discarded.
> my solution is to just remove comment before split by ';', the code can refer 
> to solution 2 for similar issue 
> :https://issues.apache.org/jira/browse/HIVE-15820



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15821) hiveserver2 cannot print diagnostic message in JobDebugger

2017-02-22 Thread muxin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

muxin updated HIVE-15821:
-
Attachment: HIVE-15821.patch

> hiveserver2 cannot print diagnostic message in JobDebugger
> --
>
> Key: HIVE-15821
> URL: https://issues.apache.org/jira/browse/HIVE-15821
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1, 2.1.1
>Reporter: muxin
> Attachments: HIVE-15821.patch
>
>
> we meet this issue when upgrading to hiveserver2. when error happened running 
> mapreduce jobs, diagnostic message are printed in hive log file but cannot be 
> seen in beeline.
>  here is steps to reproduce this issue:
> 1. write a simple udf, assuming com.example.udf.DateFormat1(String date, 
> String srcFormat, String DestFormat), notice do not catch exceptions in it.
> 2. add the udf jar, create temporary function dateformat1 as 
> 'com.example.udf.DateFormat'.
> 3. run ==> create temporary table test_table  as select 
> dateformat1('20170101', 'dd/MMM/', '-MM-dd'). of course it will end 
> up failed, in hive.log we can see the full diagnostic message and error 
> stacktrace, but in beeline we only see one line :
> return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredTask
> after analysis we found the cause:
> jobDebuger is started in a subthread, and hiveserver2 didnot pass current 
> operationlog into it, so beeline cannot fetch log printed in jobDebugger.
> so our solution is :
> 1. add a private parameter operationlog to JobDebugger,
> before start jobdebugger thread in HadoopJobExecHelper.progress(), set 
> jobDebugger.operationlog to OperationLog.getCurrentOperationLog()
> 2. in JobDebugger.run(), replace 
> showJobFailDebugInfo();
> with:
> OperationLog.setCurrentOperationLog(operationLog);
> showJobFailDebugInfo();
> OperationLog.removeCurrentOperationLog();



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16005) miscellaneous small fixes to help with llap debuggability

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878124#comment-15878124
 ] 

Hive QA commented on HIVE-16005:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853898/HIVE-16005.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=104)
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestFirstInFirstOutComparator.testWaitQueueComparatorParallelism
 (batchId=278)
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparator
 (batchId=278)
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorAging
 (batchId=278)
org.apache.hadoop.hive.llap.daemon.impl.comparator.TestShortestJobFirstComparator.testWaitQueueComparatorParallelism
 (batchId=278)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3692/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3692/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3692/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853898 - PreCommit-HIVE-Build

> miscellaneous small fixes to help with llap debuggability
> -
>
> Key: HIVE-16005
> URL: https://issues.apache.org/jira/browse/HIVE-16005
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-16005.01.patch
>
>
> - Include proc_ in cli, beeline, metastore, hs2 process args
> - LLAP history logger - log QueryId instead of dagName (dag name is free 
> flowing text)
> - LLAP JXM ExecutorStatus - Log QueryId instead of dagName. Sort by running / 
> queued
> - Include thread name in TaskRunnerCallable so that it shows up in stack 
> traces (will cause extra output in logs)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Assigned] (HIVE-16007) When the query does not complie the LogRunnable never stops

2017-02-22 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-16007:
-


> When the query does not complie the LogRunnable never stops
> ---
>
> Key: HIVE-16007
> URL: https://issues.apache.org/jira/browse/HIVE-16007
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> When issuing a sql command which does not compile then the LogRunnable thread 
> is never closed



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-14798) MSCK REPAIR TABLE throws null pointer exception

2017-02-22 Thread Tarik Yilmaz (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878253#comment-15878253
 ] 

Tarik Yilmaz commented on HIVE-14798:
-

[~alunarbeach] We've faced with this issue too on Dataproc. Did this error 
appear on Dataproc?

> MSCK REPAIR TABLE throws null pointer exception
> ---
>
> Key: HIVE-14798
> URL: https://issues.apache.org/jira/browse/HIVE-14798
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.0
>Reporter: Anbu Cheeralan
>
> MSCK REPAIR TABLE statement throws null pointer exception in Hive 2.1
> I have tested the same against external/internal tables created both in HDFS 
> and in Google Cloud.
> The error shown in beeline/sql client 
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
> Hive Logs:
> 2016-09-20T17:28:00,717 ERROR [HiveServer2-Background-Pool: Thread-92]: 
> metadata.HiveMetaStoreChecker (:()) - java.lang.NullPointerException
> 2016-09-20T17:28:00,717 WARN  [HiveServer2-Background-Pool: Thread-92]: 
> exec.DDLTask (:()) - Failed to run metacheck: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:444)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.getAllLeafDirs(HiveMetaStoreChecker.java:388)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:309)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:285)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:230)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1814)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:403)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1077)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:90)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:299)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:312)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at 
> java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
> at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:432)
> at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker$1.call(HiveMetaStoreChecker.java:418)
> ... 4 more
> Here are the steps to recreate this issue:
> use default;
> DROP TABLE IF EXISTS repairtable;
> CREATE TABLE repairtable(col STRING) PARTITIONED BY (p1 STRING, p2 STRING);
> MSCK REPAIR TABLE default.repairtable;



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16007) When the query does not complie the LogRunnable never stops

2017-02-22 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878306#comment-15878306
 ] 

Peter Vary commented on HIVE-16007:
---

When the execute of the statement in HiveStatement.runAsyncOnServer() is not 
successful, then the stmtHandle is never set. See:
{code:title=HiveStatement.runAsyncOnServer}
  private void runAsyncOnServer(String sql) throws SQLException {
[..]
try {
  TExecuteStatementResp execResp = client.ExecuteStatement(execReq);
  Utils.verifySuccessWithInfo(execResp.getStatus());
  stmtHandle = execResp.getOperationHandle();
  isExecuteStatementFailed = false;
} catch (SQLException eS) {
  isExecuteStatementFailed = true;
  throw eS;
} catch (Exception ex) {
  isExecuteStatementFailed = true;
  throw new SQLException(ex.toString(), "08S01", ex);
}
  }
{code}

The problem is caused by HiveStatement.hasMoreLogs() returning true, since the 
isLogBeingGenerated is true by default, and not set in any of the catch 
statements above.
The LogRunnable.updateQueryLog() will call HiveStatement.getQueryLog() will 
throw an exception since the stmtHandle is null, and the 
isExecuteStatementFailed is true {{Method getQueryLog() failed. Because the 
stmtHandle in HiveStatement is null and the statement execution might fail.}}, 
and later when BeeLine closed the statement the HiveStatement.getQueryLog() 
will throw {{Can't getConnection after statement has been closed}} exception. 
But still the hasMoreLogs() will return true, so we have an infinite loop here:
{code:title=LogRunnable.run()}
@Override public void run() {
  while (hiveStatement.hasMoreLogs()) {
try {
  updateQueryLog();
  Thread.sleep(queryProgressInterval);
} catch (SQLException e) {
  commands.error(new SQLWarning(e));
} catch (InterruptedException e) {
  commands.debug("Getting log thread is interrupted, since query is 
done!");
  commands.showRemainingLogsIfAny(hiveStatement);
}
  }
}
{code}

> When the query does not complie the LogRunnable never stops
> ---
>
> Key: HIVE-16007
> URL: https://issues.apache.org/jira/browse/HIVE-16007
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> When issuing a sql command which does not compile then the LogRunnable thread 
> is never closed



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration

2017-02-22 Thread Alina Abramova (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878381#comment-15878381
 ] 

Alina Abramova commented on HIVE-15433:
---

Hello,
If it's possible please rebuild thrift files before building the project.

> setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in 
> hive configuration
> 
>
> Key: HIVE-15433
> URL: https://issues.apache.org/jira/browse/HIVE-15433
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 1.0.0, 1.2.0, 2.0.0
>Reporter: Alina Abramova
>Assignee: Alina Abramova
> Fix For: 1.2.0
>
> Attachments: HIVE-15433.1.patch, HIVE-15433-branch-1.2.patch
>
>
> Setting hive.warehouse.subdir.inherit.perms in HIVE won't make any effect. It 
> will always take the default value from HiveConf until you define it in 
> hive-site.xml.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Assigned] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-22 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-13864:
---

Assignee: Yongzhi Chen  (was: Reuben Kuhnert)

> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch
>
>
> Beeline ignores the next line/command that follows a command with semicolon 
> and comments.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed.. second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, first command and third command is executed. second command 
> "select * from table2" is not executed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16007) When the query does not complie the LogRunnable never stops

2017-02-22 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-16007:
--
Description: 
When issuing a sql command which does not compile then the LogRunnable thread 
is never closed.

The issue can be easily detected when running beeline with showWarnings=true.

{code}
$ ./beeline -u "jdbc:hive2://localhost:1 pvary pvary" --showWarnings=true
[..]
Connecting to jdbc:hive2://localhost:1
Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 2.2.0-SNAPSHOT by Apache Hive
0: jdbc:hive2://localhost:1> selekt;
Warning: java.sql.SQLException: Method getQueryLog() failed. Because the 
stmtHandle in HiveStatement is null and the statement execution might fail. 
(state=,code=0)
[..]
Warning: java.sql.SQLException: Can't getQueryLog after statement has been 
closed (state=,code=0)
[..]
{code}

  was:When issuing a sql command which does not compile then the LogRunnable 
thread is never closed


> When the query does not complie the LogRunnable never stops
> ---
>
> Key: HIVE-16007
> URL: https://issues.apache.org/jira/browse/HIVE-16007
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> When issuing a sql command which does not compile then the LogRunnable thread 
> is never closed.
> The issue can be easily detected when running beeline with showWarnings=true.
> {code}
> $ ./beeline -u "jdbc:hive2://localhost:1 pvary pvary" --showWarnings=true
> [..]
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
> Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 2.2.0-SNAPSHOT by Apache Hive
> 0: jdbc:hive2://localhost:1> selekt;
> Warning: java.sql.SQLException: Method getQueryLog() failed. Because the 
> stmtHandle in HiveStatement is null and the statement execution might fail. 
> (state=,code=0)
> [..]
> Warning: java.sql.SQLException: Can't getQueryLog after statement has been 
> closed (state=,code=0)
> [..]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16007) When the query does not complie the LogRunnable never stops

2017-02-22 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-16007:
--
Attachment: HIVE-16007.patch

The patch proposes two changes:
- Set the {{isLogBeingGenerated}} to false if there is an exception executing 
the statement
- Handle the interruption more agressively
-- When interrupt arrives then try to fetch the logs and then finish running 
the LogRunnable thread
-- We check the interruption status before fetching the logs, and if the thread 
is interrupted then fetch the remaining logs and exit. This prevents the 
possible active loop when we get an exception so we does not sleep, and will 
not get the InterruptedException even though the parent thread finished the 
statement execution

> When the query does not complie the LogRunnable never stops
> ---
>
> Key: HIVE-16007
> URL: https://issues.apache.org/jira/browse/HIVE-16007
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-16007.patch
>
>
> When issuing a sql command which does not compile then the LogRunnable thread 
> is never closed.
> The issue can be easily detected when running beeline with showWarnings=true.
> {code}
> $ ./beeline -u "jdbc:hive2://localhost:1 pvary pvary" --showWarnings=true
> [..]
> Connecting to jdbc:hive2://localhost:1
> Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
> Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 2.2.0-SNAPSHOT by Apache Hive
> 0: jdbc:hive2://localhost:1> selekt;
> Warning: java.sql.SQLException: Method getQueryLog() failed. Because the 
> stmtHandle in HiveStatement is null and the statement execution might fail. 
> (state=,code=0)
> [..]
> Warning: java.sql.SQLException: Can't getQueryLog after statement has been 
> closed (state=,code=0)
> [..]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12492:
---
Attachment: HIVE-12492.01.patch

> MapJoin: 4 million unique integers seems to be a probe plateau
> --
>
> Key: HIVE-12492
> URL: https://issues.apache.org/jira/browse/HIVE-12492
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12492.01.patch, HIVE-12492.patch
>
>
> After 4 million keys, the map-join implementation seems to suffer from a 
> performance degradation. 
> The hashtable build & probe time makes this very inefficient, even if the 
> data is very compact (i.e 2 ints).
> Falling back onto the shuffle join or bucket map-join is useful after 2^22 
> items.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-12492) MapJoin: 4 million unique integers seems to be a probe plateau

2017-02-22 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878513#comment-15878513
 ] 

Jesus Camacho Rodriguez commented on HIVE-12492:


[~ashutoshc], I have updated the patch with new test cases in 
tez_dynpart_hashjoin_1.q so the effect of the logic in the patch wrt DHJ can be 
observed.

RB entry: https://reviews.apache.org/r/56929/

> MapJoin: 4 million unique integers seems to be a probe plateau
> --
>
> Key: HIVE-12492
> URL: https://issues.apache.org/jira/browse/HIVE-12492
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12492.01.patch, HIVE-12492.patch
>
>
> After 4 million keys, the map-join implementation seems to suffer from a 
> performance degradation. 
> The hashtable build & probe time makes this very inefficient, even if the 
> data is very compact (i.e 2 ints).
> Falling back onto the shuffle join or bucket map-join is useful after 2^22 
> items.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: HIVE-15951.2.patch

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or shared 
> between reducer in the same physical VM.
> That will lead to the failure of the job till that the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Status: Patch Available  (was: In Progress)

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Status: In Progress  (was: Patch Available)

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: (was: HIVE-15951.2.patch)

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or shared 
> between reducer in the same physical VM.
> That will lead to the failure of the job till that the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-1626) stop using java.util.Stack

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878614#comment-15878614
 ] 

Hive QA commented on HIVE-1626:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853906/HIVE-1626.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3693/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3693/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3693/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult 
[localFile=/data/hiveptest/logs/PreCommit-HIVE-Build-3693/failed/115-TestSparkCliDriver-escape_distributeby1.q-join9.q-groupby2.q-and-12-more,
 remoteFile=/home/hiveptest/35.184.41.238-hiveptest-0/logs/, getExitCode()=11, 
getException()=null, getUser()=hiveptest, getHost()=35.184.41.238, 
getInstance()=0]: 'Warning: Permanently added '35.184.41.238' (ECDSA) to the 
list of known hosts.
receiving incremental file list
./
maven-test.txt

  0   0%0.00kB/s0:00:00  
 42,159 100%   40.21MB/s0:00:00 (xfr#1, to-chk=16/18)
logs/
logs/derby.log

  0   0%0.00kB/s0:00:00  
  1,011 100%  987.30kB/s0:00:00 (xfr#2, to-chk=13/18)
logs/hive.log

  0   0%0.00kB/s0:00:00  
 40,370,176   0%   38.04MB/s0:13:26  
 90,112,000   0%   42.71MB/s0:11:57  
145,358,848   0%   46.04MB/s0:11:04  
198,934,528   0%   47.30MB/s0:10:45  
254,345,216   0%   51.02MB/s0:09:57  
310,149,120   0%   52.46MB/s0:09:39  
362,283,008   1%   51.72MB/s0:09:47  
417,431,552   1%   52.07MB/s0:09:42  
473,530,368   1%   52.22MB/s0:09:39  
529,301,504   1%   52.22MB/s0:09:38  
584,482,816   1%   52.95MB/s0:09:29  
630,456,320   2%   50.76MB/s0:09:53  
683,409,408   2%   50.04MB/s0:10:00  
738,983,936   2%   49.90MB/s0:10:01  
794,951,680   2%   50.07MB/s0:09:58  
850,427,904   2%   52.34MB/s0:09:31  
895,549,440   2%   50.45MB/s0:09:51  
950,763,520   3%   50.43MB/s0:09:50  
  1,006,731,264   3%   50.42MB/s0:09:49  
  1,050,148,864   3%   47.46MB/s0:10:25  
  1,075,838,976   3%   42.62MB/s0:11:36  
  1,102,741,504   3%   35.95MB/s0:13:44  
  1,131,937,792   3%   29.53MB/s0:16:42  
  1,159,200,768   3%   25.67MB/s0:19:12  
  1,187,250,176   3%   26.31MB/s0:18:43  
  1,211,629,568   3%   25.62MB/s0:19:12  
  1,235,484,672   3%   24.35MB/s0:20:12  
  1,259,077,632   4%   23.44MB/s0:20:58  
  1,281,622,016   4%   22.22MB/s0:22:06  
  1,306,525,696   4%   22.36MB/s0:21:56  
  1,330,642,944   4%   22.46MB/s0:21:49  
  1,353,973,760   4%   22.50MB/s0:21:46  
  1,377,042,432   4%   22.57MB/s0:21:41  
  1,400,111,104   4%   22.14MB/s0:22:06  
  1,423,704,064   4%   21.95MB/s0:22:16  
  1,445,986,304   4%   21.70MB/s0:22:30  
  1,465,909,248   4%   20.92MB/s0:23:20  
  1,489,240,064   4%   21.02MB/s0:23:12  
  1,510,211,584   4%   20.47MB/s0:23:49  
  1,530,396,672   4%   19.97MB/s0:24:23  
  1,553,465,344   4%   20.71MB/s0:23:29  
  1,577,058,304   5%   20.73MB/s0:23:27  
  1,600,389,120   5%   21.29MB/s0:22:49  
  1,623,719,936   5%   22.04MB/s0:22:02  
  1,647,312,896   5%   22.20MB/s0:21:51  
  1,670,643,712   5%   22.14MB/s0:21:53  
  1,693,974,528   5%   22.14MB/s0:21:52  
  1,717,305,344   5%   22.14MB/s0:21:52  
  1,740,898,304   5%   22.14MB/s0:21:51  
  1,764,229,120   5%   22.14MB/s0:21:50  
  1,788,608,512   5%   22.34MB/s0:21:36  
  1,813,610,496   5%   22.78MB/s0:21:10  
  1,839,464,448   5%   23.27MB/s0:20:42  
  1,864,368,128   5%   23.67MB/s0:20:21  
  1,890,058,240   6%   23.97MB/s0:20:04  
  1,915,748,352   6%   24.06MB/s0:19:58  
  1,941,962,752   6%   24.15MB/s0:19:53  
  1,966,342,144   6%   24.03MB/s0:19:58  
  1,991,507,968   6%   23.97MB/s0:20:00  
  2,017,198,080   6%   23.97MB/s0:19:59  
  2,042,626,048   6%   23.82MB/s0:20:06  
  2,067,824,640   6%   24.02MB/s0:19:55  
  2,093,481,984   6%   24.05MB/s0:19:52  
  2,119,172,096   6%   24.00MB/s0:19:53  
  2,144,600,064   6%   23.98MB/s0:19:53  
  2,170,290,176   6%   23.99MB/s0:19:52  
  2,196,504,576   6%   24.13MB/s0:19:44  
  2,220,883,968   7%   23.92MB/s0:19:53  
  2,246,574,080   7%   23.95MB/s0:19:51  
  2,272,264,192   7%   23.95MB/s0:19:50  
  2,297,954,304   7%   23.

[jira] [Commented] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878627#comment-15878627
 ] 

slim bouguerra commented on HIVE-15951:
---

[~ashutoshc] please checkout the new fix. The delete is done on the close call.

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or shared 
> between reducer in the same physical VM.
> That will lead to the failure of the job till that the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15951) Make sure base persist directory is unique and deleted

2017-02-22 Thread slim bouguerra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-15951:
--
Attachment: HIVE-15951.2.patch

> Make sure base persist directory is unique and deleted
> --
>
> Key: HIVE-15951
> URL: https://issues.apache.org/jira/browse/HIVE-15951
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 2.2.0
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15951.2.patch, HIVE-15951.patch
>
>
> In some cases the base persist directory will contain old data or shared 
> between reducer in the same physical VM.
> That will lead to the failure of the job till that the directory is cleaned.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Assigned] (HIVE-15992) LLAP: NPE in LlapTaskCommunicator.getCompletedLogsUrl for unsuccessful attempt

2017-02-22 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-15992:
--

Assignee: Sergio Peña  (was: Rajesh Balamohan)

> LLAP: NPE in LlapTaskCommunicator.getCompletedLogsUrl for unsuccessful attempt
> --
>
> Key: HIVE-15992
> URL: https://issues.apache.org/jira/browse/HIVE-15992
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Sergio Peña
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15992.1.patch
>
>
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.getCompletedLogsUrl(LlapTaskCommunicator.java:563)
> at 
> org.apache.tez.dag.app.TaskCommunicatorWrapper.getCompletedLogsUrl(TaskCommunicatorWrapper.java:92)
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.getCompletedLogsUrl(TaskCommunicatorManager.java:674)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.getCompletedLogsUrl(TaskAttemptImpl.java:1147)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.logJobHistoryAttemptUnsuccesfulCompletion(TaskAttemptImpl.java:1089)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminateTransition.transition(TaskAttemptImpl.java:1332)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminatedBeforeRunningTransition.transition(TaskAttemptImpl.java:1430)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminatedBeforeRunningTransition.transition(TaskAttemptImpl.java:1416)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:799)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:122)
> at 
> org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2296)
> at 
> org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2281)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Assigned] (HIVE-15992) LLAP: NPE in LlapTaskCommunicator.getCompletedLogsUrl for unsuccessful attempt

2017-02-22 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-15992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña reassigned HIVE-15992:
--

Assignee: Rajesh Balamohan  (was: Sergio Peña)

> LLAP: NPE in LlapTaskCommunicator.getCompletedLogsUrl for unsuccessful attempt
> --
>
> Key: HIVE-15992
> URL: https://issues.apache.org/jira/browse/HIVE-15992
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15992.1.patch
>
>
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.getCompletedLogsUrl(LlapTaskCommunicator.java:563)
> at 
> org.apache.tez.dag.app.TaskCommunicatorWrapper.getCompletedLogsUrl(TaskCommunicatorWrapper.java:92)
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.getCompletedLogsUrl(TaskCommunicatorManager.java:674)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.getCompletedLogsUrl(TaskAttemptImpl.java:1147)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.logJobHistoryAttemptUnsuccesfulCompletion(TaskAttemptImpl.java:1089)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminateTransition.transition(TaskAttemptImpl.java:1332)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminatedBeforeRunningTransition.transition(TaskAttemptImpl.java:1430)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$TerminatedBeforeRunningTransition.transition(TaskAttemptImpl.java:1416)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:799)
> at 
> org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:122)
> at 
> org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2296)
> at 
> org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:2281)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15881) Use new thread count variable name instead of mapred.dfsclient.parallelism.max

2017-02-22 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-15881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15881:
---
Attachment: HIVE-15881.3.patch

> Use new thread count variable name instead of mapred.dfsclient.parallelism.max
> --
>
> Key: HIVE-15881
> URL: https://issues.apache.org/jira/browse/HIVE-15881
> Project: Hive
>  Issue Type: Task
>  Components: Query Planning
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-15881.1.patch, HIVE-15881.2.patch, 
> HIVE-15881.3.patch
>
>
> The Utilities class has two methods, {{getInputSummary}} and 
> {{getInputPaths}}, that use the variable {{mapred.dfsclient.parallelism.max}} 
> to get the summary of a list of input locations in parallel. These methods 
> are Hive related, but the variable name does not look it is specific for Hive.
> Also, the above variable is not on HiveConf nor used anywhere else. I just 
> found a reference on the Hadoop MR1 code.
> I'd like to propose the deprecation of {{mapred.dfsclient.parallelism.max}}, 
> and use a different variable name, such as 
> {{hive.get.input.listing.num.threads}}, that reflects the intention of the 
> variable. The removal of the old variable might happen on Hive 3.x



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878726#comment-15878726
 ] 

Ashutosh Chauhan commented on HIVE-15955:
-

+1

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: In Progress  (was: Patch Available)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Attachment: HIVE-14901.5.patch

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: Patch Available  (was: In Progress)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878801#comment-15878801
 ] 

Hive QA commented on HIVE-15668:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853928/HIVE-15668.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10238 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr]
 (batchId=140)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=101)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3694/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3694/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3694/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853928 - PreCommit-HIVE-Build

> change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
> -
>
> Key: HIVE-15668
> URL: https://issues.apache.org/jira/browse/HIVE-15668
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15668.2.patch, HIVE-15668.patch
>
>
> Currently, REPL DUMP syntax goes:
> {noformat}
> REPL DUMP [[.]] [FROM  [BATCH ]]
> {noformat}
> The BATCH directive says that when doing an event dump, to not dump out more 
> than _batchSize_ number of events. However, there is a clearer keyword for 
> the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows 
> makes it clearer:
> {noformat}
> REPL DUMP [[.]] [FROM  [LIMIT ]]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-12767) Implement table property to address Parquet int96 timestamp bug

2017-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-12767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878798#comment-15878798
 ] 

Sergio Peña commented on HIVE-12767:


Great, thanks [~zsombor.klara] for continuing on this work. I just left a 
couple of comments on the RB.

> Implement table property to address Parquet int96 timestamp bug
> ---
>
> Key: HIVE-12767
> URL: https://issues.apache.org/jira/browse/HIVE-12767
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Sergio Peña
>Assignee: Barna Zsombor Klara
> Attachments: HIVE-12767.10.patch, HIVE-12767.3.patch, 
> HIVE-12767.4.patch, HIVE-12767.5.patch, HIVE-12767.6.patch, 
> HIVE-12767.7.patch, HIVE-12767.8.patch, HIVE-12767.9.patch, 
> TestNanoTimeUtils.java
>
>
> Parque timestamps using INT96 are not compatible with other tools, like 
> Impala, due to issues in Hive because it adjusts timezones values in a 
> different way than Impala.
> To address such issues. a new table property (parquet.mr.int96.write.zone) 
> must be used in Hive that detects what timezone to use when writing and 
> reading timestamps from Parquet.
> The following is the exit criteria for the fix:
> * Hive will read Parquet MR int96 timestamp data and adjust values using a 
> time zone from a table property, if set, or using the local time zone if it 
> is absent. No adjustment will be applied to data written by Impala.
> * Hive will write Parquet int96 timestamps using a time zone adjustment from 
> the same table property, if set, or using the local time zone if it is 
> absent. This keeps the data in the table consistent.
> * New tables created by Hive will set the table property to UTC if the global 
> option to set the property for new tables is enabled.
> ** Tables created using CREATE TABLE and CREATE TABLE LIKE FILE will not set 
> the property unless the global setting to do so is enabled.
> ** Tables created using CREATE TABLE LIKE  will copy the 
> property of the table that is copied.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate

2017-02-22 Thread Chao Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-15796:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~xuefuz] for the review! I just committed this to the master branch.

> HoS: poor reducer parallelism when operator stats are not accurate
> --
>
> Key: HIVE-15796
> URL: https://issues.apache.org/jira/browse/HIVE-15796
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.2.0
>
> Attachments: HIVE-15796.1.patch, HIVE-15796.2.patch, 
> HIVE-15796.3.patch, HIVE-15796.4.patch, HIVE-15796.5.patch, 
> HIVE-15796.6.patch, HIVE-15796.wip.1.patch, HIVE-15796.wip.2.patch, 
> HIVE-15796.wip.patch
>
>
> In HoS we use currently use operator stats to determine reducer parallelism. 
> However, it is often the case that operator stats are not accurate, 
> especially if column stats are not available. This sometimes will generate 
> extremely poor reducer parallelism, and cause HoS query to run forever. 
> This JIRA tries to offer an alternative way to compute reducer parallelism, 
> similar to how MR does. Here's the approach we are suggesting:
> 1. when computing the parallelism for a MapWork, use stats associated with 
> the TableScan operator;
> 2. when computing the parallelism for a ReduceWork, use the *maximum* 
> parallelism from all its parents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate

2017-02-22 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878817#comment-15878817
 ] 

Xuefu Zhang commented on HIVE-15796:


[~csun], do you mind creating a followup JIRA for refactoring as suggested 
above? Thanks.

> HoS: poor reducer parallelism when operator stats are not accurate
> --
>
> Key: HIVE-15796
> URL: https://issues.apache.org/jira/browse/HIVE-15796
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.2.0
>
> Attachments: HIVE-15796.1.patch, HIVE-15796.2.patch, 
> HIVE-15796.3.patch, HIVE-15796.4.patch, HIVE-15796.5.patch, 
> HIVE-15796.6.patch, HIVE-15796.wip.1.patch, HIVE-15796.wip.2.patch, 
> HIVE-15796.wip.patch
>
>
> In HoS we use currently use operator stats to determine reducer parallelism. 
> However, it is often the case that operator stats are not accurate, 
> especially if column stats are not available. This sometimes will generate 
> extremely poor reducer parallelism, and cause HoS query to run forever. 
> This JIRA tries to offer an alternative way to compute reducer parallelism, 
> similar to how MR does. Here's the approach we are suggesting:
> 1. when computing the parallelism for a MapWork, use stats associated with 
> the TableScan operator;
> 2. when computing the parallelism for a ReduceWork, use the *maximum* 
> parallelism from all its parents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15796) HoS: poor reducer parallelism when operator stats are not accurate

2017-02-22 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878824#comment-15878824
 ] 

Chao Sun commented on HIVE-15796:
-

Sure. Created HIVE-16009 and linked here.

> HoS: poor reducer parallelism when operator stats are not accurate
> --
>
> Key: HIVE-15796
> URL: https://issues.apache.org/jira/browse/HIVE-15796
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Chao Sun
> Fix For: 2.2.0
>
> Attachments: HIVE-15796.1.patch, HIVE-15796.2.patch, 
> HIVE-15796.3.patch, HIVE-15796.4.patch, HIVE-15796.5.patch, 
> HIVE-15796.6.patch, HIVE-15796.wip.1.patch, HIVE-15796.wip.2.patch, 
> HIVE-15796.wip.patch
>
>
> In HoS we use currently use operator stats to determine reducer parallelism. 
> However, it is often the case that operator stats are not accurate, 
> especially if column stats are not available. This sometimes will generate 
> extremely poor reducer parallelism, and cause HoS query to run forever. 
> This JIRA tries to offer an alternative way to compute reducer parallelism, 
> similar to how MR does. Here's the approach we are suggesting:
> 1. when computing the parallelism for a MapWork, use stats associated with 
> the TableScan operator;
> 2. when computing the parallelism for a ReduceWork, use the *maximum* 
> parallelism from all its parents.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-16004) OutOfMemory in SparkReduceRecordHandler with vectorization mode

2017-02-22 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878852#comment-15878852
 ] 

Xuefu Zhang commented on HIVE-16004:


Patch looks good to me. Just wondering if the failure on 
TestSparkCliDriver.testCliDriver is related. (I hope not.)

> OutOfMemory in SparkReduceRecordHandler with vectorization mode
> ---
>
> Key: HIVE-16004
> URL: https://issues.apache.org/jira/browse/HIVE-16004
> Project: Hive
>  Issue Type: Bug
>Reporter: Colin Ma
>Assignee: Colin Ma
> Attachments: HIVE-16004.001.patch, HIVE-16004.002.patch
>
>
> For the query 28 of TPCs-BB with 1T data, the executor memory is set as 30G. 
> Get the following exception:
> java.lang.OutOfMemoryError
>   at 
> java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
>   at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
>   at 
> java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>   at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
>   at java.io.DataOutputStream.write(DataOutputStream.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.setVector(VectorizedBatchUtil.java:467)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedBatchUtil.addRowToBatchFrom(VectorizedBatchUtil.java:238)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processVectors(SparkReduceRecordHandler.java:367)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:286)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:85)
>   at 
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at 
> org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:1974)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>   at org.apache.spark.scheduler.Task.run(Task.scala:85)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745) 
> I think DataOutputBuffer isn't cleared on time cause this problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-22 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-13864:

Attachment: HIVE-13864.3.patch

Patch 3 to handle comments in multipleline commands and add more tests

> Beeline ignores the command that follows a semicolon and comment
> 
>
> Key: HIVE-13864
> URL: https://issues.apache.org/jira/browse/HIVE-13864
> Project: Hive
>  Issue Type: Bug
>Reporter: Muthu Manickam
>Assignee: Yongzhi Chen
> Attachments: HIVE-13864.01.patch, HIVE-13864.02.patch, 
> HIVE-13864.3.patch
>
>
> Beeline ignores the next line/command that follows a command with semicolon 
> and comments.
> Example 1:
> select *
> from table1; -- comments
> select * from table2;
> In this case, only the first command is executed.. second command "select * 
> from table2" is not executed.
> --
> Example 2:
> select *
> from table1; -- comments
> select * from table2;
> select * from table3;
> In this case, first command and third command is executed. second command 
> "select * from table2" is not executed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: Open  (was: Patch Available)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Attachment: (was: HIVE-14901.5.patch)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Attachment: HIVE-14901.5.patch

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14901) HiveServer2: Use user supplied fetch size to determine #rows serialized in tasks

2017-02-22 Thread Norris Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norris Lee updated HIVE-14901:
--
Status: Patch Available  (was: Open)

> HiveServer2: Use user supplied fetch size to determine #rows serialized in 
> tasks
> 
>
> Key: HIVE-14901
> URL: https://issues.apache.org/jira/browse/HIVE-14901
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC, ODBC
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Norris Lee
> Attachments: HIVE-14901.1.patch, HIVE-14901.2.patch, 
> HIVE-14901.3.patch, HIVE-14901.4.patch, HIVE-14901.5.patch, HIVE-14901.patch
>
>
> Currently, we use {{hive.server2.thrift.resultset.max.fetch.size}} to decide 
> the max number of rows that we write in tasks. However, we should ideally use 
> the user supplied value (which can be extracted from the 
> ThriftCLIService.FetchResults' request parameter) to decide how many rows to 
> serialize in a blob in the tasks. We should however use 
> {{hive.server2.thrift.resultset.max.fetch.size}} to have an upper bound on 
> it, so that we don't go OOM in tasks and HS2. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-14274) When columns are added to structs in a Hive table, HCatLoader breaks.

2017-02-22 Thread Owen O'Malley (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878879#comment-15878879
 ] 

Owen O'Malley commented on HIVE-14274:
--

For InputFormats that implement SelfDescribingInputFormatInterface, you can 
push down the desired type to the input format and it will handle schema 
evolution for you.

> When columns are added to structs in a Hive table, HCatLoader breaks.
> -
>
> Key: HIVE-14274
> URL: https://issues.apache.org/jira/browse/HIVE-14274
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-14274.1.patch
>
>
> Consider this sequence of table/partition creation and schema evolution:
> {code:sql}
> -- Create table.
> CREATE EXTERNAL TABLE `simple_text` (
> foo STRING,
> bar STRUCT
> )
> PARTITIONED BY ( dt STRING )
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> COLLECTION ITEMS TERMINATED BY ':'
> STORED AS TEXTFILE ;
> -- Add partition.
> ALTER TABLE simple_text ADD PARTITION ( dt='0' );
> -- Alter the struct-column to add a new sub-field.
> ALTER TABLE simple_text CHANGE COLUMN bar bar STRUCT zoo:STRING>;
> {code}
> The {{dt='0'}} partition's schema indicates 2 fields in {{bar}}. The data can 
> be read using Hive, but not through HCatLoader. The error looks as follows:
> {noformat}
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception 
> while executing (Name: data_raw: 
> Store(hdfs://dilithiumblue-nn1.blue.ygrid.yahoo.com:8020/tmp/temp-643668868/tmp-1639945319:org.apache.pig.impl.io.TFileStorage)
>  - scope-1 Operator Key: scope-1): 
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:314)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:123)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:376)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:241)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:362)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: 
> org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error 
> converting read value to tuple
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:160)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
>   ... 16 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: 
> Error converting read value to tuple
>   at 
> org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76)
>   at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:63)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:118)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:140)
>   ... 17 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
>

[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1

2017-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878883#comment-15878883
 ] 

Sergio Peña commented on HIVE-14476:


I don't have an answer for that. We haven't been too active on adding more 
fixes on the branch-1, so I believe we would need to improve the tests from it. 
Have you investigating the type of errors happening there?

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1-branch-1.2.patch
>
>
> This issue is from branch-1 code when we decide if a log entry is an 
> operational log or not (the operational logs are visible to the client). The 
> problem is that the code is checking the logging mode at the beginning of the 
> decide() method, while the logging mode is updated after that check. Due to 
> this issue, we ran into an issue that an operational log could be filtered 
> out if it's the very first log being checked from the this method. As a 
> result, that particular log is not showing up for the end user.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-14476) Fix logging issue for branch-1

2017-02-22 Thread Tao Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14476:
--
Resolution: Won't Fix
Status: Resolved  (was: Patch Available)

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1-branch-1.2.patch
>
>
> This issue is from branch-1 code when we decide if a log entry is an 
> operational log or not (the operational logs are visible to the client). The 
> problem is that the code is checking the logging mode at the beginning of the 
> decide() method, while the logging mode is updated after that check. Due to 
> this issue, we ran into an issue that an operational log could be filtered 
> out if it's the very first log being checked from the this method. As a 
> result, that particular log is not showing up for the end user.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-14476) Fix logging issue for branch-1

2017-02-22 Thread Tao Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878901#comment-15878901
 ] 

Tao Li commented on HIVE-14476:
---

The tests from "TestOperationLoggingAPIWithMr" and 
"TestOperationLoggingAPIWithTez" would fail without this fix. However looks 
like we are seeing some issues with the test itself (see below). Given that 
there is no active development for hive1, I will mark this JIRA "won't fix".

TestOperationLoggingAPIWithMr - did not produce a TEST-*.xml file (likely timed 
out) (batchId=384)
TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file (likely 
timed out) (batchId=385)

> Fix logging issue for branch-1
> --
>
> Key: HIVE-14476
> URL: https://issues.apache.org/jira/browse/HIVE-14476
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14476.1-branch-1.2.patch
>
>
> This issue is from branch-1 code when we decide if a log entry is an 
> operational log or not (the operational logs are visible to the client). The 
> problem is that the code is checking the logging mode at the beginning of the 
> decide() method, while the logging mode is updated after that check. Due to 
> this issue, we ran into an issue that an operational log could be filtered 
> out if it's the very first log being checked from the this method. As a 
> result, that particular log is not showing up for the end user.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15668) change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword

2017-02-22 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878903#comment-15878903
 ] 

Thejas M Nair commented on HIVE-15668:
--

+1


> change REPL DUMP syntax to use "LIMIT" instead of "BATCH" keyword
> -
>
> Key: HIVE-15668
> URL: https://issues.apache.org/jira/browse/HIVE-15668
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-15668.2.patch, HIVE-15668.patch
>
>
> Currently, REPL DUMP syntax goes:
> {noformat}
> REPL DUMP [[.]] [FROM  [BATCH ]]
> {noformat}
> The BATCH directive says that when doing an event dump, to not dump out more 
> than _batchSize_ number of events. However, there is a clearer keyword for 
> the same effect, and that is LIMIT. Thus, rephrasing the syntax as follows 
> makes it clearer:
> {noformat}
> REPL DUMP [[.]] [FROM  [LIMIT ]]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878904#comment-15878904
 ] 

Hive QA commented on HIVE-13780:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12853941/HIVE-13780.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3695/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3695/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3695/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Could not create 
/data/hiveptest/logs/PreCommit-HIVE-Build-3695/succeeded/211_UTBatch_itests__hive-unit_9_tests
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12853941 - PreCommit-HIVE-Build

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Work started] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-16006 started by Sankar Hariappan.
---
> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16006:

Attachment: HIVE-16006.01.patch

[~thejas] [~sushanth]
Patch has the code fix. Will add some unit tests and provide another patch soon.
Please review the changes.

> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16006) Incremental REPL LOAD doesn't operate on the target database if name differs from source database.

2017-02-22 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-16006:

Status: Patch Available  (was: In Progress)

> Incremental REPL LOAD doesn't operate on the target database if name differs 
> from source database.
> --
>
> Key: HIVE-16006
> URL: https://issues.apache.org/jira/browse/HIVE-16006
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
> Attachments: HIVE-16006.01.patch
>
>
> During "Incremental Load", it is not considering the database name input in 
> the command line. Hence load doesn't happen. At the same time, database with 
> original name is getting modified.
> Steps:
> 1. REPL DUMP default FROM 52;
> 2. REPL LOAD replDb FROM '/tmp/dump/1487588522621';
> – This step modifies the default Db instead of replDb.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Jason Dere (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878911#comment-15878911
 ] 

Jason Dere commented on HIVE-15955:
---

I think this mostly looks good, though this misses one case - semijoin 
optimizations (min/max/bloomfilter) generate an extra SEL-GBY-RS-GBY-RS branch, 
but the last RS is never directly connected to the destination TableScan/Filter 
which uses the min/max/bloomfilter. Instead there is a separate mapping kept in 
the ParseContext to track these. This may require additional handling to call 
setOutputOperators() for this SEL-GBY-RS-GBY-RS branch (which is also generated 
during TezCompiler, so it might occur after the Optimizer rules have fired).
Should handling this be done in a separate Jira?

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Status: In Progress  (was: Patch Available)

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Attachment: HIVE-13780.2.patch

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch, 
> HIVE-13780.2.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-13780:
--
Status: Patch Available  (was: In Progress)

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch, 
> HIVE-13780.2.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-13780) Allow user to update AVRO table schema via command even if table's definition was defined through schema file

2017-02-22 Thread Adam Szita (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878942#comment-15878942
 ] 

Adam Szita commented on HIVE-13780:
---

Looks like ptest failed for an unrelated reason, I'm attaching 
[^HIVE-13780.2.patch] (same as .1.patch) to retrigger precommit tests...

> Allow user to update AVRO table schema via command even if table's definition 
> was defined through schema file
> -
>
> Key: HIVE-13780
> URL: https://issues.apache.org/jira/browse/HIVE-13780
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 2.0.0
>Reporter: Eric Lin
>Assignee: Adam Szita
>Priority: Minor
> Attachments: HIVE-13780.0.patch, HIVE-13780.1.patch, 
> HIVE-13780.2.patch
>
>
> If a table is defined as below:
> {code}
> CREATE TABLE test
> STORED AS AVRO 
> TBLPROPERTIES ('avro.schema.url'='/tmp/schema.json');
> {code}
> if user tries to run command:
> {code}
> ALTER TABLE test CHANGE COLUMN col1 col1 STRING COMMENT 'test comment';
> {code}
> The query will return without any warning, but has no affect to the table.
> It would be good if we can allow user to ALTER table (add/change column, 
> update comment etc) even though the schema is defined through schema file.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15955:
---
Status: Open  (was: Patch Available)

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878950#comment-15878950
 ] 

Pengcheng Xiong commented on HIVE-15955:


[~jdere], yes i think it should be done in a separate jira as this patch only 
deals with all the RS after logical optimization.

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15955:
---
Attachment: HIVE-15955.04.patch

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-15955:
---
Status: Patch Available  (was: Open)

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15955) make explain formatted to include opId and etc

2017-02-22 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878958#comment-15878958
 ] 

Pengcheng Xiong commented on HIVE-15955:


[~ashutoshc], i remove "transient" key words for the outputoperators as 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union] 
(batchId=145)
is failing. It seems that TestMiniLlapLocalCliDriver is 
serializing/deserializing the operator trees.

> make explain formatted to include opId and etc
> --
>
> Key: HIVE-15955
> URL: https://issues.apache.org/jira/browse/HIVE-15955
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-15955.01.patch, HIVE-15955.02.patch, 
> HIVE-15955.03.patch, HIVE-15955.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-13864) Beeline ignores the command that follows a semicolon and comment

2017-02-22 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15878993#comment-15878993
 ] 

Hive QA commented on HIVE-13864:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854006/HIVE-13864.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10254 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys]
 (batchId=159)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=223)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3696/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3696/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3696/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: NonZeroExitCodeException
Command 'cd /data/hiveptest/logs/PreCommit-HIVE-Build-3696/ && tar -zvcf 
test-results.tar.gz test-results/' failed with exit status 2 and output 
'test-results/
test-results/TEST-261_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.exec.vector.expressions.gen.TestColumnScalarOperationVectorExpressionEvaluation.xml
test-results/TEST-276_UTBatch_serde_20_tests-TEST-org.apache.hadoop.hive.serde2.avro.TestThatEvolvedSchemasActAsWeWant.xml
test-results/TEST-271_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.lockmgr.zookeeper.TestZookeeperLockManager.xml
test-results/TEST-195_UTBatch_itests__hive-unit_9_tests-TEST-org.apache.hadoop.hive.metastore.TestMetaStoreConnectionUrlHook.xml
test-results/TEST-182_UTBatch_hcatalog__streaming_16_tests-TEST-org.apache.hive.hcatalog.streaming.mutate.worker.TestRecordInspectorImpl.xml
test-results/TEST-252_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.parse.TestSplitSample.xml
test-results/TEST-262_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.exec.vector.TestVectorSerDeRow.xml
test-results/TEST-241_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.udf.generic.TestGenericUDFBridge.xml
test-results/TEST-169_UTBatch_storage-api_13_tests-TEST-org.apache.hadoop.hive.ql.exec.vector.TestBytesColumnVector.xml
test-results/TEST-182_UTBatch_hcatalog__streaming_16_tests-TEST-org.apache.hive.hcatalog.streaming.mutate.worker.TestMutatorCoordinator.xml
test-results/TEST-213_UTBatch_itests__hive-unit_9_tests-TEST-org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.xml
test-results/TEST-210_UTBatch_itests__hive-unit_9_tests-TEST-org.apache.hadoop.hive.ql.security.TestStorageBasedMetastoreAuthorizationReads.xml
test-results/TEST-185_UTBatch_service_8_tests-TEST-org.apache.hive.service.auth.TestLdapAtnProviderWithMiniDS.xml
test-results/TEST-275_UTBatch_serde_20_tests-TEST-org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors.xml
test-results/TEST-199_UTBatch_itests__hive-unit_9_tests-TEST-org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool2.xml
test-results/TEST-65-TestCliDriver-cbo_rp_join1.q-materialized_view_create.q-orc_ppd_varchar.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml
test-results/TEST-277_UTBatch_serde_6_tests-TEST-org.apache.hadoop.hive.serde2.columnar.TestLazyBinaryColumnarSerDe.xml
test-results/TEST-122-TestSparkCliDriver-auto_sortmerge_join_13.q-join4.q-join35.q-and-12-more-TEST-org.apache.hadoop.hive.cli.TestSparkCliDriver.xml
test-results/TEST-106-TestSparkCliDriver-bucketsortoptimize_insert_4.q-multi_insert_mixed.q-vectorization_10.q-and-12-more-TEST-org.apache.hadoop.hive.cli.TestSparkCliDriver.xml
test-results/TEST-252_UTBatch_ql_10_tests-TEST-org.apache.hadoop.hive.ql.parse.positive.TestTransactionStatement.xml
test-results/TEST-49-TestCliDriver-intersect_all.q-join39.q-bucketsortoptimize_insert_7.q-and-27-more-TEST-org.apache.hadoop.hive.cli.TestCliDriver.xml
test-results/TEST-275_UTBatch_serde_20_tests-TEST-org.apache.hadoop.hive.serde2.lazy.TestLazyArrayMapStruct.xml
test-results/TEST-167_UTBatch_cli_4_tests-TEST-org.apache.hadoop.hive.cli.TestCliDriverMethods.xml
test-results/TEST-104-TestSparkCliDriver-skewjoinopt19.q-order.q-join_merge_multi_expressions.q-and-12-more-TEST-org.apache.hadoop.hive.cli.TestSparkCliDriver.xml
test-results/TEST-208_UTBatch_itests__hive-unit_9_tests-TEST-org.apache.hadoop.hive.ql.metadata.TestSemanticAnalyzerHookLoading.xml
test-results/TEST-276_UTBatch_serde_20_tests-TEST-org.apache.hadoop.hive.serde2.lazybinary.TestLaz

[jira] [Commented] (HIVE-15954) LLAP: some Tez INFO logs are too noisy

2017-02-22 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879011#comment-15879011
 ] 

Sergey Shelukhin commented on HIVE-15954:
-

That disabled the INFO (and below) logging from these classes. 
By convention for log4j loggers are created by class name, and these are the 
classes/loggers we've seen. It disabled the logging created via the loggers 
that are initialized with this class as an argument; which is where the "bad" 
logging seems to go. I don't think there's extra granularity beyond that.

> LLAP: some Tez INFO logs are too noisy
> --
>
> Key: HIVE-15954
> URL: https://issues.apache.org/jira/browse/HIVE-15954
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15954.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Patch Available  (was: Open)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc against tables in other systems etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Attachment: HIVE-1555.7.patch

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc against tables in other systems etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-1555) JDBC Storage Handler

2017-02-22 Thread Gunther Hagleitner (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-1555:
-
Status: Open  (was: Patch Available)

> JDBC Storage Handler
> 
>
> Key: HIVE-1555
> URL: https://issues.apache.org/jira/browse/HIVE-1555
> Project: Hive
>  Issue Type: New Feature
>  Components: JDBC
>Reporter: Bob Robertson
>Assignee: Gunther Hagleitner
> Attachments: HIVE-1555.3.patch, HIVE-1555.4.patch, HIVE-1555.5.patch, 
> HIVE-1555.6.patch, HIVE-1555.7.patch, JDBCStorageHandler Design Doc.pdf
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> With the Cassandra and HBase Storage Handlers I thought it would make sense 
> to include a generic JDBC RDBMS Storage Handler so that you could import a 
> standard DB table into Hive. Many people must want to perform HiveQL joins, 
> etc against tables in other systems etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Status: Open  (was: Patch Available)

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Attachment: HIVE-16002.2.patch

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Updated] (HIVE-16002) Correlated IN subquery with aggregate asserts in sq_count_check UDF

2017-02-22 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16002:
---
Status: Patch Available  (was: Open)

> Correlated IN subquery with aggregate asserts in sq_count_check UDF
> ---
>
> Key: HIVE-16002
> URL: https://issues.apache.org/jira/browse/HIVE-16002
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-16002.1.patch, HIVE-16002.2.patch
>
>
> Reproducer
> {code:SQL}
> create table t(i int, j int);
> insert into t values(0,1), (0,2);
> create table tt(i int, j int);
> insert into tt values(0,3);
> select * from t where i IN (select count(i) from tt where tt.j = t.j);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879040#comment-15879040
 ] 

Julian Hyde commented on HIVE-15996:


[~jcamachorodriguez] [~cartershanklin], Yes, I meant to say {{GROUPING_ID}}. 
The authors of the SQL standard were smart to extend GROUPING rather than 
create another function name, as Oracle did. The namespace (GROUPING_ID, 
GROUPING, GROUP_ID not to mention GROUP BY and GROUPING SETS) is over-full and 
confusing.

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15996.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Resolved] (HIVE-15960) LLAP: Known app masters should be cleaned on queryComplete

2017-02-22 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-15960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-15960.
--
Resolution: Duplicate

Will be fixed as part of HIVE-15958

> LLAP: Known app masters should be cleaned on queryComplete
> --
>
> Key: HIVE-15960
> URL: https://issues.apache.org/jira/browse/HIVE-15960
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> AM gets removed from knownAppMasters map when the task count becomes 0. Task 
> count becomes 0 during fragment execution as unregisterTask happens when the 
> task is actually scheduled to run.  This will make umbilical connection to AM 
> to be closed. Any new rejections or task kills will have to recreate the 
> connection. This can be improved by remove the AM info from knownAppMaster 
> after query completion. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

1 2 3 >

1 - 100 of 232 matches

Mail list logo