[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Status: Open  (was: Patch Available)

> hive.tez.exec.print.summary=true returns wrong results on HS2
> -------------------------------------------------------------
>
>          Key: HIVE-14070
>          URL: https://issues.apache.org/jira/browse/HIVE-14070
>      Project: Hive
>   Issue Type: Sub-task
>     Reporter: Pengcheng Xiong
>     Assignee: Pengcheng Xiong
>  Attachments: HIVE-14070.01.patch, HIVE-14070.02.patch
>
>
> On master, we have
> {code}
> Query Execution Summary
> ------------------------------------------------
> OPERATION                          DURATION
> ------------------------------------------------
> Compile Query               -1466208820.74s
> Prepare Plan                          0.00s
> Submit Plan                  1466208825.50s
> Start DAG                             0.26s
> Run DAG                               4.39s
> ------------------------------------------------
>
> Task Execution Summary
> ------------------------------------------------------------------------------------
>   VERTICES  DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> ------------------------------------------------------------------------------------
>      Map 1       1014.00         1,534           11          1,500               1
>  Reducer 2         96.00            54           10              1               0
> ------------------------------------------------------------------------------------
> {code}
> sounds like a real issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Attachment: HIVE-14070.02.patch

> hive.tez.exec.print.summary=true returns wrong results on HS2
> -------------------------------------------------------------
>
>          Key: HIVE-14070
>          URL: https://issues.apache.org/jira/browse/HIVE-14070
>      Project: Hive
>   Issue Type: Sub-task
>     Reporter: Pengcheng Xiong
>     Assignee: Pengcheng Xiong
>  Attachments: HIVE-14070.01.patch, HIVE-14070.02.patch
>
>
> On master, we have
> {code}
> Query Execution Summary
> ------------------------------------------------
> OPERATION                          DURATION
> ------------------------------------------------
> Compile Query               -1466208820.74s
> Prepare Plan                          0.00s
> Submit Plan                  1466208825.50s
> Start DAG                             0.26s
> Run DAG                               4.39s
> ------------------------------------------------
>
> Task Execution Summary
> ------------------------------------------------------------------------------------
>   VERTICES  DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> ------------------------------------------------------------------------------------
>      Map 1       1014.00         1,534           11          1,500               1
>  Reducer 2         96.00            54           10              1               0
> ------------------------------------------------------------------------------------
> {code}
> sounds like a real issue.
[jira] [Updated] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-14070:
-----------------------------------
    Status: Patch Available  (was: Open)

> hive.tez.exec.print.summary=true returns wrong results on HS2
> -------------------------------------------------------------
>
>          Key: HIVE-14070
>          URL: https://issues.apache.org/jira/browse/HIVE-14070
>      Project: Hive
>   Issue Type: Sub-task
>     Reporter: Pengcheng Xiong
>     Assignee: Pengcheng Xiong
>  Attachments: HIVE-14070.01.patch, HIVE-14070.02.patch
>
>
> On master, we have
> {code}
> Query Execution Summary
> ------------------------------------------------
> OPERATION                          DURATION
> ------------------------------------------------
> Compile Query               -1466208820.74s
> Prepare Plan                          0.00s
> Submit Plan                  1466208825.50s
> Start DAG                             0.26s
> Run DAG                               4.39s
> ------------------------------------------------
>
> Task Execution Summary
> ------------------------------------------------------------------------------------
>   VERTICES  DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
> ------------------------------------------------------------------------------------
>      Map 1       1014.00         1,534           11          1,500               1
>  Reducer 2         96.00            54           10              1               0
> ------------------------------------------------------------------------------------
> {code}
> sounds like a real issue.
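All three notifications above quote the same summary, whose Compile Query and Submit Plan rows show epoch-sized (and negated) values rather than durations. That pattern is typical of an elapsed time computed from mismatched start/end timestamps. A minimal Python illustration of the failure mode (hypothetical code, not Hive's actual PerfLogger):

```python
def execution_summary(start_times, end_times):
    # Each phase's duration is end - start. If a phase's start timestamp
    # was recorded somewhere else (e.g. in a different HS2 code path) and
    # never reaches this map, a default of 0 turns the "duration" into the
    # raw epoch timestamp of the end event; a swapped start/end pair turns
    # it into a huge negative number, like the -1466208820.74s above.
    return {op: end_times[op] - start_times.get(op, 0.0) for op in end_times}

# Missing start: "duration" is just the epoch timestamp of the end event.
print(execution_summary({}, {"Submit Plan": 1466208825.50}))
# Swapped start/end: "duration" goes hugely negative.
print(execution_summary({"Compile Query": 1466208825.50}, {"Compile Query": 4.76}))
```

Both outputs are absurd as durations but plausible as raw timestamps, which is what makes this bug class easy to spot in a summary table.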
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14074:
------------------------------------
    Attachment: HIVE-14074.01.patch

> RELOAD FUNCTION should update dropped functions
> -----------------------------------------------
>
>               Key: HIVE-14074
>               URL: https://issues.apache.org/jira/browse/HIVE-14074
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.0.1
>          Reporter: Abdullah Yousufi
>          Assignee: Abdullah Yousufi
>           Fix For: 2.2.0
>
>       Attachments: HIVE-14074.01.patch
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD
> FUNCTION in the current session is a way to force a reload of the functions,
> so that changes that occurred in other running sessions will be reflected in
> the current session, without having to restart the current session. However,
> while functions that are created in other sessions will now appear in the
> current session, functions that have been dropped are not removed from the
> current session's registry. It seems inconsistent that created functions are
> updated while dropped functions are not.
[jira] [Updated] (HIVE-14074) RELOAD FUNCTION should update dropped functions
[ https://issues.apache.org/jira/browse/HIVE-14074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdullah Yousufi updated HIVE-14074:
------------------------------------
    Status: Patch Available  (was: Open)

> RELOAD FUNCTION should update dropped functions
> -----------------------------------------------
>
>               Key: HIVE-14074
>               URL: https://issues.apache.org/jira/browse/HIVE-14074
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.0.1
>          Reporter: Abdullah Yousufi
>          Assignee: Abdullah Yousufi
>           Fix For: 2.2.0
>
>
> Due to HIVE-2573, functions are stored in a per-session registry and only
> loaded in from the metastore when hs2 or hive cli is started. Running RELOAD
> FUNCTION in the current session is a way to force a reload of the functions,
> so that changes that occurred in other running sessions will be reflected in
> the current session, without having to restart the current session. However,
> while functions that are created in other sessions will now appear in the
> current session, functions that have been dropped are not removed from the
> current session's registry. It seems inconsistent that created functions are
> updated while dropped functions are not.
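The description above amounts to a set-reconciliation problem: a reload must both add functions created elsewhere and drop functions removed elsewhere. A hypothetical sketch of that reconciliation (illustrative names, not Hive's actual function Registry API):

```python
def reload_functions(session_registry, metastore_functions):
    # Functions present in the session but gone from the metastore were
    # dropped by another session; remove them (this is the missing half
    # of the behavior described in HIVE-14074).
    for name in set(session_registry) - set(metastore_functions):
        del session_registry[name]
    # Functions created or updated in other sessions are merged in
    # (this half already works today).
    session_registry.update(metastore_functions)

registry = {"f_old": "class.A", "f_shared": "class.B"}
reload_functions(registry, {"f_shared": "class.B", "f_new": "class.C"})
print(registry)  # -> {'f_shared': 'class.B', 'f_new': 'class.C'}
```

After the call, the session registry mirrors the metastore exactly, which is the consistency the reporter is asking for.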
[jira] [Commented] (HIVE-13744) LLAP IO - add complex types support
[ https://issues.apache.org/jira/browse/HIVE-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343752#comment-15343752 ]

Hive QA commented on HIVE-13744:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812068/HIVE-13744.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10253 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/215/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/215/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-215/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12812068 - PreCommit-HIVE-MASTER-Build

> LLAP IO - add complex types support
> -----------------------------------
>
>               Key: HIVE-13744
>               URL: https://issues.apache.org/jira/browse/HIVE-13744
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.2.0
>          Reporter: Sergey Shelukhin
>          Assignee: Prasanth Jayachandran
>            Labels: llap, orc
>       Attachments: HIVE-13744.1.patch, HIVE-13744.2.patch
>
>
> Recently, complex type column vectors were added to Hive. We should use them
> in the IO elevator.
> Vectorization itself doesn't support complex types (yet), but this would be
> useful when it does; it will also enable the LLAP IO elevator to be used in a
> non-vectorized context with complex types after HIVE-13617.
[jira] [Commented] (HIVE-13380) Decimal should have lower precedence than double in type hierachy
[ https://issues.apache.org/jira/browse/HIVE-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343674#comment-15343674 ]

Ashutosh Chauhan commented on HIVE-13380:
-----------------------------------------

I object.

> Decimal should have lower precedence than double in type hierachy
> -----------------------------------------------------------------
>
>         Key: HIVE-13380
>         URL: https://issues.apache.org/jira/browse/HIVE-13380
>     Project: Hive
>  Issue Type: Bug
>  Components: Types
>    Reporter: Ashutosh Chauhan
>    Assignee: Ashutosh Chauhan
>      Labels: TODOC2.2
>     Fix For: 2.2.0
>
> Attachments: HIVE-13380.2.patch, HIVE-13380.4.patch, HIVE-13380.5.patch, HIVE-13380.patch, decimal_filter.q
>
>
> Currently it's the other way round. Also, decimal should be lower than float.
[jira] [Commented] (HIVE-14057) Add an option in llapstatus to generate output to a file
[ https://issues.apache.org/jira/browse/HIVE-14057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343653#comment-15343653 ]

Hive QA commented on HIVE-14057:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812257/HIVE-14057.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10236 tests executed

*Failed tests:*
{noformat}
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-vectorized_distinct_gby.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/214/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/214/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-214/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12812257 - PreCommit-HIVE-MASTER-Build

> Add an option in llapstatus to generate output to a file
> ---------------------------------------------------------
>
>         Key: HIVE-14057
>         URL: https://issues.apache.org/jira/browse/HIVE-14057
>     Project: Hive
>  Issue Type: Improvement
>    Reporter: Siddharth Seth
>    Assignee: Siddharth Seth
> Attachments: HIVE-14057.01.patch, HIVE-14057.02.patch, HIVE-14057.03.patch
[jira] [Commented] (HIVE-13878) Vectorization: Column pruning for Text vectorization
[ https://issues.apache.org/jira/browse/HIVE-13878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343637#comment-15343637 ]

Gopal V commented on HIVE-13878:
--------------------------------

[~mmccline]: I realize that this conflicts with HIVE-13872's projectedColumns -1 filler change, but I will get a test-run before rebasing it over that.

> Vectorization: Column pruning for Text vectorization
> ----------------------------------------------------
>
>               Key: HIVE-13878
>               URL: https://issues.apache.org/jira/browse/HIVE-13878
>           Project: Hive
>        Issue Type: Bug
>        Components: Vectorization
>  Affects Versions: 2.1.0
>          Reporter: Gopal V
>          Assignee: Gopal V
>       Attachments: HIVE-13878.04.patch, HIVE-13878.1.patch, HIVE-13878.2.patch, HIVE-13878.3.patch
>
>
> Column pruning in TextFile vectorization does not work with Vector SerDe
> settings due to LazySimple deser codepath issues.
[jira] [Updated] (HIVE-13878) Vectorization: Column pruning for Text vectorization
[ https://issues.apache.org/jira/browse/HIVE-13878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HIVE-13878:
---------------------------
    Attachment: HIVE-13878.04.patch

> Vectorization: Column pruning for Text vectorization
> ----------------------------------------------------
>
>               Key: HIVE-13878
>               URL: https://issues.apache.org/jira/browse/HIVE-13878
>           Project: Hive
>        Issue Type: Bug
>        Components: Vectorization
>  Affects Versions: 2.1.0
>          Reporter: Gopal V
>          Assignee: Gopal V
>       Attachments: HIVE-13878.04.patch, HIVE-13878.1.patch, HIVE-13878.2.patch, HIVE-13878.3.patch
>
>
> Column pruning in TextFile vectorization does not work with Vector SerDe
> settings due to LazySimple deser codepath issues.
[jira] [Updated] (HIVE-14060) Hive: Remove bogus "localhost" from Hive splits
[ https://issues.apache.org/jira/browse/HIVE-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V updated HIVE-14060:
---------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
     Release Note: Hive: Remove bogus "localhost" from Hive splits
           Status: Resolved  (was: Patch Available)

> Hive: Remove bogus "localhost" from Hive splits
> -----------------------------------------------
>
>               Key: HIVE-14060
>               URL: https://issues.apache.org/jira/browse/HIVE-14060
>           Project: Hive
>        Issue Type: Bug
>        Components: Tez
>  Affects Versions: 2.1.0, 2.2.0
>          Reporter: Gopal V
>          Assignee: Gopal V
>           Fix For: 2.2.0
>
>       Attachments: HIVE-14060.1.patch
>
>
> On remote filesystems like Azure, GCP and S3, the splits contain a filler
> location of "localhost".
> This is worse than having no location information at all - on large clusters
> YARN waits up to 200[1] seconds for a heartbeat from "localhost" before
> allocating a container.
> To speed up this process, the split affinity provider should scrub the bogus
> "localhost" from the locations and allow for the allocation of "*" containers
> instead on each heartbeat.
>
> [1] - yarn.scheduler.capacity.node-locality-delay=40 x heartbeat of 5s
[jira] [Commented] (HIVE-14060) Hive: Remove bogus "localhost" from Hive splits
[ https://issues.apache.org/jira/browse/HIVE-14060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343242#comment-15343242 ]

Gopal V commented on HIVE-14060:
--------------------------------

Pushed to master, thanks [~sershe].

> Hive: Remove bogus "localhost" from Hive splits
> -----------------------------------------------
>
>               Key: HIVE-14060
>               URL: https://issues.apache.org/jira/browse/HIVE-14060
>           Project: Hive
>        Issue Type: Bug
>        Components: Tez
>  Affects Versions: 2.1.0, 2.2.0
>          Reporter: Gopal V
>          Assignee: Gopal V
>           Fix For: 2.2.0
>
>       Attachments: HIVE-14060.1.patch
>
>
> On remote filesystems like Azure, GCP and S3, the splits contain a filler
> location of "localhost".
> This is worse than having no location information at all - on large clusters
> YARN waits up to 200[1] seconds for a heartbeat from "localhost" before
> allocating a container.
> To speed up this process, the split affinity provider should scrub the bogus
> "localhost" from the locations and allow for the allocation of "*" containers
> instead on each heartbeat.
>
> [1] - yarn.scheduler.capacity.node-locality-delay=40 x heartbeat of 5s
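The fix described above, scrubbing the filler hostname so the scheduler can immediately allocate "*" (any-node) containers, can be sketched as follows (a hypothetical helper, not the actual split affinity provider code):

```python
def scrub_locations(locations):
    # Drop the bogus "localhost" filler that remote filesystems (Azure,
    # GCP, S3) put into split locations. If nothing real remains, return
    # ["*"] so the scheduler allocates on any node instead of waiting
    # out the node-locality delay for a "localhost" heartbeat.
    real = [host for host in locations if host != "localhost"]
    return real if real else ["*"]

print(scrub_locations(["localhost"]))                  # -> ['*']
print(scrub_locations(["node1.example.com", "localhost"]))  # -> ['node1.example.com']
```

The hostnames are illustrative; the point is that an empty location list degrades to the wildcard rather than to a node that will never answer.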
[jira] [Commented] (HIVE-13617) LLAP: support non-vectorized execution in IO
[ https://issues.apache.org/jira/browse/HIVE-13617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343227#comment-15343227 ]

Prasanth Jayachandran commented on HIVE-13617:
----------------------------------------------

Left some minor comments, mostly questions. Otherwise LGTM. +1

> LLAP: support non-vectorized execution in IO
> --------------------------------------------
>
>         Key: HIVE-13617
>         URL: https://issues.apache.org/jira/browse/HIVE-13617
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-13617-wo-11417.patch, HIVE-13617-wo-11417.patch, HIVE-13617.01.patch, HIVE-13617.03.patch, HIVE-13617.04.patch, HIVE-13617.05.patch, HIVE-13617.06.patch, HIVE-13617.patch, HIVE-13617.patch, HIVE-15396-with-oi.patch
>
>
> Two approaches - a separate decoding path, into rows instead of VRBs; or
> decoding VRBs into rows on a higher level (the original LlapInputFormat). I
> think the latter might be better - it's not a hugely important path, and perf
> in the non-vectorized case is not the best anyway, so it's better to make do
> with much less new code and architectural disruption.
> Some ORC patches in progress introduce an easy-to-reuse (or so I hope,
> anyway) VRB-to-row conversion, so we should just use that.
[jira] [Commented] (HIVE-13982) Extensions to RS dedup: execute with different column order and sorting direction if possible
[ https://issues.apache.org/jira/browse/HIVE-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343180#comment-15343180 ]

Hive QA commented on HIVE-13982:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12812247/HIVE-13982.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10253 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query17
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query89
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/213/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/213/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-213/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12812247 - PreCommit-HIVE-MASTER-Build

> Extensions to RS dedup: execute with different column order and sorting direction if possible
> ---------------------------------------------------------------------------------------------
>
>               Key: HIVE-13982
>               URL: https://issues.apache.org/jira/browse/HIVE-13982
>           Project: Hive
>        Issue Type: Improvement
>        Components: Physical Optimizer
>  Affects Versions: 2.2.0
>          Reporter: Jesus Camacho Rodriguez
>          Assignee: Jesus Camacho Rodriguez
>       Attachments: HIVE-13982.2.patch, HIVE-13982.3.patch, HIVE-13982.4.patch, HIVE-13982.5.patch, HIVE-13982.patch
>
>
> Pointed out by [~gopalv].
> RS dedup should kick in for these cases, avoiding an additional shuffle stage.
> {code}
> select state, city, sum(sales) from table
> group by state, city
> order by state, city
> limit 10;
> {code}
> {code}
> select state, city, sum(sales) from table
> group by city, state
> order by state, city
> limit 10;
> {code}
> {code}
> select state, city, sum(sales) from table
> group by city, state
> order by state desc, city
> limit 10;
> {code}
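All three queries above share one property: the group-by and order-by use the same set of keys, only in a different order or with a different sort direction. A toy eligibility check capturing that idea (illustrative, not Hive's actual ReduceSinkDeDuplication code):

```python
def dedup_possible(group_keys, order_keys):
    # RS dedup may reorder the group-by keys and flip the sort
    # direction, so eligibility only requires the two key SETS to
    # coincide; key order and direction ("asc"/"desc") are ignored.
    return set(group_keys) == {col for col, _direction in order_keys}

# group by city, state / order by state desc, city  -> eligible
print(dedup_possible(["city", "state"], [("state", "desc"), ("city", "asc")]))  # -> True
# order-by references a key that is not grouped on  -> not eligible
print(dedup_possible(["city"], [("state", "asc"), ("city", "asc")]))            # -> False
```

The real optimizer must also rewrite the surviving ReduceSink to the order-by's key order and directions; this check only decides whether that rewrite is possible.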
[jira] [Commented] (HIVE-13380) Decimal should have lower precedence than double in type hierachy
[ https://issues.apache.org/jira/browse/HIVE-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343173#comment-15343173 ]

Sergey Shelukhin commented on HIVE-13380:
-----------------------------------------

I am going to revert from all branches tomorrow if no objections.

> Decimal should have lower precedence than double in type hierachy
> -----------------------------------------------------------------
>
>         Key: HIVE-13380
>         URL: https://issues.apache.org/jira/browse/HIVE-13380
>     Project: Hive
>  Issue Type: Bug
>  Components: Types
>    Reporter: Ashutosh Chauhan
>    Assignee: Ashutosh Chauhan
>      Labels: TODOC2.2
>     Fix For: 2.2.0
>
> Attachments: HIVE-13380.2.patch, HIVE-13380.4.patch, HIVE-13380.5.patch, HIVE-13380.patch, decimal_filter.q
>
>
> Currently it's the other way round. Also, decimal should be lower than float.
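The proposal in this issue, ranking DECIMAL below FLOAT and DOUBLE, can be pictured as a simple precedence list used to resolve a common type for mixed-type expressions. This is a hypothetical sketch of the proposed ordering, not Hive's actual FunctionRegistry logic:

```python
# Proposed numeric precedence per HIVE-13380: DECIMAL sits below
# FLOAT and DOUBLE, so decimal-vs-double expressions resolve to double.
PRECEDENCE = ["tinyint", "smallint", "int", "bigint", "decimal", "float", "double"]

def common_type(a, b):
    # The common type of two numeric types is the one with the higher
    # precedence (the one that can be reached by implicit widening).
    return a if PRECEDENCE.index(a) >= PRECEDENCE.index(b) else b

print(common_type("decimal", "double"))  # -> double
print(common_type("int", "decimal"))     # -> decimal
```

Under the pre-patch behavior the list would have DECIMAL above DOUBLE, which is the "other way round" the description complains about.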
[jira] [Assigned] (HIVE-13945) Decimal value is displayed as rounded when selecting where clause with that decimal value.
[ https://issues.apache.org/jira/browse/HIVE-13945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin reassigned HIVE-13945:
---------------------------------------
    Assignee: Sergey Shelukhin

> Decimal value is displayed as rounded when selecting where clause with that decimal value.
> ------------------------------------------------------------------------------------------
>
>               Key: HIVE-13945
>               URL: https://issues.apache.org/jira/browse/HIVE-13945
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.1.0
>          Reporter: Takahiko Saito
>          Assignee: Sergey Shelukhin
>          Priority: Critical
>
>
> Create a table with a column of decimal type (38,18) and insert
> '4327269606205.029297'. A select on that value then displays its rounded
> value, which is 4327269606205.029300:
> {noformat}
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> drop table if exists test;
> No rows affected (0.229 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> create table test (dc decimal(38,18));
> No rows affected (0.125 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> insert into table test values (4327269606205.029297);
> No rows affected (2.372 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test;
> +-----------------------+--+
> |        test.dc        |
> +-----------------------+--+
> | 4327269606205.029297  |
> +-----------------------+--+
> 1 row selected (0.123 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test where dc = 4327269606205.029297;
> +-----------------------+--+
> |        test.dc        |
> +-----------------------+--+
> | 4327269606205.029300  |
> +-----------------------+--+
> 1 row selected (0.109 seconds)
> {noformat}
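One plausible explanation, consistent with the rounded output above, is that the unquoted literal gets parsed as a 64-bit double, which carries only about 15-16 significant decimal digits and cannot hold the 19-digit value. A short Python demonstration of that representation gap (illustrative only; it does not prove which Hive code path is at fault):

```python
from decimal import Decimal

exact = Decimal("4327269606205.029297")  # the value the DECIMAL(38,18) column stores
as_double = 4327269606205.029297         # the same literal parsed as a 64-bit double

# Decimal(float) expands the double's exact binary value, exposing the
# rounding that happened at parse time.
print(exact == Decimal(as_double))  # -> False
print(Decimal(as_double))           # nearby value, not ...029297
```

Since the double constant differs from every stored decimal, an equality filter built on it either misses rows or, as in the report, echoes the rounded constant instead of the stored value.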
[jira] [Comment Edited] (HIVE-13928) Hive2: float value need to be single quoted inside where clause to return rows when it doesn't have to be
[ https://issues.apache.org/jira/browse/HIVE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343155#comment-15343155 ]

Sergey Shelukhin edited comment on HIVE-13928 at 6/22/16 1:18 AM:
------------------------------------------------------------------

By design. Quotes may be forcing a different type; overall floating type equality is not expected to work.

was (Author: sershe):
By design. Quotes may be forcing a different type; overall floating type equalily is not expected to work.

> Hive2: float value need to be single quoted inside where clause to return rows when it doesn't have to be
> ----------------------------------------------------------------------------------------------------------
>
>               Key: HIVE-13928
>               URL: https://issues.apache.org/jira/browse/HIVE-13928
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.1.0
>          Reporter: Takahiko Saito
>          Priority: Critical
>
>
> The below select where with float value does not return any row:
> {noformat}
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> drop table test;
> No rows affected (0.212 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> create table test (f float);
> No rows affected (1.131 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> insert into table test values (-35664.76),(29497.34);
> No rows affected (2.482 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test;
> +------------+--+
> |   test.f   |
> +------------+--+
> | -35664.76  |
> | 29497.34   |
> +------------+--+
> 2 rows selected (0.142 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test where f = -35664.76;
> +------------+--+
> |   test.f   |
> +------------+--+
> +------------+--+
> {noformat}
> The workaround is to single quote float value:
> {noformat}
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test where f = '-35664.76';
> +------------+--+
> |   test.f   |
> +------------+--+
> | -35664.76  |
> +------------+--+
> 1 row selected (0.163 seconds)
> {noformat}
[jira] [Resolved] (HIVE-13928) Hive2: float value need to be single quoted inside where clause to return rows when it doesn't have to be
[ https://issues.apache.org/jira/browse/HIVE-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin resolved HIVE-13928.
-------------------------------------
    Resolution: Not A Problem

By design. Quotes may be forcing a different type; overall floating type equality is not expected to work.

> Hive2: float value need to be single quoted inside where clause to return rows when it doesn't have to be
> ----------------------------------------------------------------------------------------------------------
>
>               Key: HIVE-13928
>               URL: https://issues.apache.org/jira/browse/HIVE-13928
>           Project: Hive
>        Issue Type: Bug
>  Affects Versions: 2.1.0
>          Reporter: Takahiko Saito
>          Priority: Critical
>
>
> The below select where with float value does not return any row:
> {noformat}
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> drop table test;
> No rows affected (0.212 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> create table test (f float);
> No rows affected (1.131 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> insert into table test values (-35664.76),(29497.34);
> No rows affected (2.482 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test;
> +------------+--+
> |   test.f   |
> +------------+--+
> | -35664.76  |
> | 29497.34   |
> +------------+--+
> 2 rows selected (0.142 seconds)
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test where f = -35664.76;
> +------------+--+
> |   test.f   |
> +------------+--+
> +------------+--+
> {noformat}
> The workaround is to single quote float value:
> {noformat}
> 0: jdbc:hive2://os-r7-mvjkcu-hiveserver2-11-4> select * from test where f = '-35664.76';
> +------------+--+
> |   test.f   |
> +------------+--+
> | -35664.76  |
> +------------+--+
> 1 row selected (0.163 seconds)
> {noformat}
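The "different type" point is easy to reproduce outside Hive: a value stored as a 32-bit float does not compare equal to the same decimal literal parsed as a 64-bit double. A small Python demonstration, using struct to mimic a FLOAT column (the helper name is ours, not a Hive API):

```python
import struct

def as_float32(x):
    # Round a Python double to the nearest IEEE 754 binary32 value,
    # mimicking what storing into a Hive FLOAT column does.
    return struct.unpack("<f", struct.pack("<f", x))[0]

stored = as_float32(-35664.76)  # what the FLOAT column actually holds
literal = -35664.76             # the unquoted literal, parsed as a double

# The float32 value, widened back to double for comparison, is a nearby
# but different number, so the equality predicate matches nothing.
print(stored == literal)  # -> False
```

Quoting the literal sidesteps this because the string is coerced to the column's type before comparing, which is why the single-quoted workaround returns the row.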
[jira] [Updated] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14055:
------------------------------------
    Attachment: HIVE-14055.02.patch

Updated. I think we should fix the bug first and then refactor the existing code conventions (all other methods in directSql that have internal restrictions have the same semantics with nulls). So, if this affects branch-1, we should still commit patch .01 there.

> directSql - getting the number of partitions is broken
> ------------------------------------------------------
>
>         Key: HIVE-14055
>         URL: https://issues.apache.org/jira/browse/HIVE-14055
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-14055.01.patch, HIVE-14055.02.patch, HIVE-14055.patch
>
>
> Noticed while looking at something else. If the filter cannot be pushed down,
> it just returns 0.
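The bug as described: a pushdown failure is reported as a count of 0, which callers cannot distinguish from "no partitions matched". A minimal sketch of the intended fallback semantics (hypothetical function names, not the actual MetaStoreDirectSql API):

```python
def get_num_partitions(filter_expr, direct_sql, jdo_fallback):
    # direct_sql signals "filter cannot be pushed down" by returning
    # None (the reported bug was returning 0 instead, silently losing
    # partitions). On None, fall back to the slower but complete path.
    count = direct_sql(filter_expr)
    return count if count is not None else jdo_fallback(filter_expr)

# Pushable filter: the fast path's answer is used.
print(get_num_partitions("ds='2016-06-21'", lambda f: 7, lambda f: 99))  # -> 7
# Unpushable filter: the fallback's answer is used, not a bogus 0.
print(get_num_partitions("udf(ds)=1", lambda f: None, lambda f: 99))     # -> 99
```

The key design point is that "could not answer" must be a distinct value from "answered zero"; conflating them is exactly what broke the count.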
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343105#comment-15343105 ]

Thejas M Nair commented on HIVE-14068:
--------------------------------------

1. The errors should go to stderr. Can you replace System.out with System.err?
2. The "found" message can be logged at INFO level. It can be a very useful message, and it would get logged just once per JVM, so that should be fine.

> make more effort to find hive-site.xml
> --------------------------------------
>
>         Key: HIVE-14068
>         URL: https://issues.apache.org/jira/browse/HIVE-14068
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-14068.01.patch, HIVE-14068.patch
>
>
> It pretty much doesn't make sense to run Hive w/o the config, so we should
> make more effort to find one if it's missing on the classpath, or the
> classloader does not return it for some reason (e.g. the classloader ignores
> some permission issues; explicitly looking for the file may expose them
> better).
[jira] [Commented] (HIVE-14073) update config whiltelist for sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343102#comment-15343102 ]

Thejas M Nair commented on HIVE-14073:
--------------------------------------

[~sershe] When SQL std auth or Ranger is enabled, only configs in this whitelist can be set by HS2 users. Yes, if you can create a list of configs that should be settable per query by the user, that would be great.

> update config whiltelist for sql std authorization
> --------------------------------------------------
>
>               Key: HIVE-14073
>               URL: https://issues.apache.org/jira/browse/HIVE-14073
>           Project: Hive
>        Issue Type: Bug
>        Components: Security, SQLStandardAuthorization
>  Affects Versions: 2.1.0
>          Reporter: Thejas M Nair
>          Assignee: Thejas M Nair
>
>
> New configs that should go in the security whitelist have been added. The
> whitelist needs updating.
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343100#comment-15343100 ]

Sergey Shelukhin commented on HIVE-14068:
-----------------------------------------

[~thejas] added; can you take a look?

> make more effort to find hive-site.xml
> --------------------------------------
>
>         Key: HIVE-14068
>         URL: https://issues.apache.org/jira/browse/HIVE-14068
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-14068.01.patch, HIVE-14068.patch
>
>
> It pretty much doesn't make sense to run Hive w/o the config, so we should
> make more effort to find one if it's missing on the classpath, or the
> classloader does not return it for some reason (e.g. the classloader ignores
> some permission issues; explicitly looking for the file may expose them
> better).
[jira] [Updated] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HIVE-14068:
------------------------------------
    Attachment: HIVE-14068.01.patch

> make more effort to find hive-site.xml
> --------------------------------------
>
>         Key: HIVE-14068
>         URL: https://issues.apache.org/jira/browse/HIVE-14068
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-14068.01.patch, HIVE-14068.patch
>
>
> It pretty much doesn't make sense to run Hive w/o the config, so we should
> make more effort to find one if it's missing on the classpath, or the
> classloader does not return it for some reason (e.g. the classloader ignores
> some permission issues; explicitly looking for the file may expose them
> better).
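The "more effort" idea in the description, try the classpath first and then probe explicit locations so permission problems surface as file errors rather than a silently missing config, can be sketched like this (hypothetical search order and names, not the patch's actual code):

```python
import os

def find_hive_site(classpath_hit, candidate_dirs, isfile=os.path.isfile):
    # Prefer whatever the classloader found; only when it returns
    # nothing do we probe explicit directories for hive-site.xml.
    # Probing by path (rather than via the classloader) makes a
    # permission-denied directory fail loudly instead of invisibly.
    if classpath_hit is not None:
        return classpath_hit
    for d in candidate_dirs:
        path = os.path.join(d, "hive-site.xml")
        if isfile(path):
            return path
    return None  # genuinely absent; caller decides whether to abort

# Classpath result wins when present:
print(find_hive_site("/cp/hive-site.xml", ["/etc/hive/conf"]))
```

The `isfile` parameter is only there to make the sketch testable without touching the filesystem; a real implementation would stat the paths directly.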
[jira] [Commented] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343056#comment-15343056 ]

Pengcheng Xiong commented on HIVE-14071:
----------------------------------------

I had a discussion with [~sershe]. The main drawback of the current solution in this patch is that we call "createBucketFiles" and then call "closeWriters" to close it immediately. The reason behind this is to keep the client from waiting endlessly. We explored and discussed several other solutions; as far as we know, we could not find anything better. I would suggest that we put some comments about this drawback and some "TODO" comments for future improvement in the code. +1. ccing [~ashutoshc]

> HIVE-14014 breaks non-file outputs
> ----------------------------------
>
>         Key: HIVE-14071
>         URL: https://issues.apache.org/jira/browse/HIVE-14071
>     Project: Hive
>  Issue Type: Bug
>    Reporter: Sergey Shelukhin
>    Assignee: Sergey Shelukhin
> Attachments: HIVE-14071.patch, HIVE-14071.patch
>
>
> Cannot avoid creating outputs when outputs are e.g. streaming
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343047#comment-15343047 ] Hive QA commented on HIVE-14070: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12812244/HIVE-14070.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithPerformanceMode org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithMr.testFetchResultsOfLogWithVerboseMode org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithPerformanceMode org.apache.hive.service.cli.operation.TestOperationLoggingAPIWithTez.testFetchResultsOfLogWithVerboseMode {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/212/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/212/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-212/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests 
exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12812244 - PreCommit-HIVE-MASTER-Build > hive.tez.exec.print.summary=true returns wrong results on HS2 > - > > Key: HIVE-14070 > URL: https://issues.apache.org/jira/browse/HIVE-14070 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14070.01.patch > > > On master, we have > {code} > Query Execution Summary > -- > OPERATIONDURATION > -- > Compile Query -1466208820.74s > Prepare Plan0.00s > Submit Plan 1466208825.50s > Start DAG 0.26s > Run DAG 4.39s > -- > Task Execution Summary > -- > VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS > OUTPUT_RECORDS > -- > Map 11014.00 1,534 11 1,500 > 1 > Reducer 2 96.00 5410 1 > 0 > -- > {code} > sounds like a real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14041) llap scripts add hadoop and other libraries from the machine local install to the daemon classpath
[ https://issues.apache.org/jira/browse/HIVE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-14041. --- Resolution: Fixed Fix Version/s: 2.1.1 > llap scripts add hadoop and other libraries from the machine local install to > the daemon classpath > -- > > Key: HIVE-14041 > URL: https://issues.apache.org/jira/browse/HIVE-14041 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Fix For: 2.1.1 > > Attachments: HIVE-14041.01.patch > > > `hadoop classpath` ends up getting added to the classpath of llap daemons. > This essentially means picking up the classpath from the local deploy. > This isn't required since the slider package includes relevant libraries > (shipped from the client) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14041) llap scripts add hadoop and other libraries from the machine local install to the daemon classpath
[ https://issues.apache.org/jira/browse/HIVE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343023#comment-15343023 ] Siddharth Seth commented on HIVE-14041: --- Thanks for the review. Committing. > llap scripts add hadoop and other libraries from the machine local install to > the daemon classpath > -- > > Key: HIVE-14041 > URL: https://issues.apache.org/jira/browse/HIVE-14041 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14041.01.patch > > > `hadoop classpath` ends up getting added to the classpath of llap daemons. > This essentially means picking up the classpath from the local deploy. > This isn't required since the slider package includes relevant libraries > (shipped from the client) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343018#comment-15343018 ] Matt McCline commented on HIVE-13872: - I looked at https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/207/ No Hive QA report was produced. > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343020#comment-15343020 ] Matt McCline commented on HIVE-13872: - Thanks for checking the patch out. > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343014#comment-15343014 ] Gopal V commented on HIVE-13872: [~mmccline]: I don't see a test run for 03.patch?? My build went through fine & I confirmed that the LLAP instance does not cache any unprojected column in this case. > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872: Status: In Progress (was: Patch Available) > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872: Status: Patch Available (was: In Progress) > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13872: Attachment: HIVE-13872.04.patch > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.04.patch, HIVE-13872.WIP.patch, > customer_demographics.txt, vector_include_no_sel.q, > vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13872) Vectorization: Fix cross-product reduce sink serialization
[ https://issues.apache.org/jira/browse/HIVE-13872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343009#comment-15343009 ] Matt McCline commented on HIVE-13872: - Nonsense errors in console output for build of patch #3. > Vectorization: Fix cross-product reduce sink serialization > -- > > Key: HIVE-13872 > URL: https://issues.apache.org/jira/browse/HIVE-13872 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Matt McCline > Attachments: HIVE-13872.01.patch, HIVE-13872.02.patch, > HIVE-13872.03.patch, HIVE-13872.WIP.patch, customer_demographics.txt, > vector_include_no_sel.q, vector_include_no_sel.q.out > > > TPC-DS Q13 produces a cross-product without CBO simplifying the query > {code} > Caused by: java.lang.RuntimeException: null STRING entry: batchIndex 0 > projection column num 1 > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.nullBytesReadError(VectorExtractRow.java:349) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRowColumn(VectorExtractRow.java:267) > at > org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:343) > at > org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:103) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:130) > at > org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:762) > ... 
18 more > {code} > Simplified query > {code} > set hive.cbo.enable=false; > -- explain > select count(1) > from store_sales > ,customer_demographics > where ( > ( > customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk > and customer_demographics.cd_marital_status = 'M' > )or > ( >customer_demographics.cd_demo_sk = ss_cdemo_sk > and customer_demographics.cd_marital_status = 'U' > )) > ; > {code} > {code} > Map 3 > Map Operator Tree: > TableScan > alias: customer_demographics > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1920800 Data size: 717255532 Basic > stats: COMPLETE Column stats: NONE > value expressions: cd_demo_sk (type: int), > cd_marital_status (type: string) > Execution mode: vectorized, llap > LLAP IO: all inputs > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14052) Cleanup of structures required when LLAP access from external clients completes
[ https://issues.apache.org/jira/browse/HIVE-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342997#comment-15342997 ] Sergey Shelukhin commented on HIVE-14052: - Why should cancel and esp. submit fail? I was referring to that as more of a thread safety issue than logic issue. I will take a look some time this week. > Cleanup of structures required when LLAP access from external clients > completes > --- > > Key: HIVE-14052 > URL: https://issues.apache.org/jira/browse/HIVE-14052 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-14052.1.patch > > > Per [~sseth]: There's no cleanup at the moment, and structures used in LLAP > to track a query will keep building up slowly over time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14073) update config whitelist for sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342993#comment-15342993 ] Sergey Shelukhin commented on HIVE-14073: - Hmm... what does the whitelist do? Does it mean users cannot modify these configs? Some of them can be set by the user; I can make a list. > update config whitelist for sql std authorization > --- > > Key: HIVE-14073 > URL: https://issues.apache.org/jira/browse/HIVE-14073 > Project: Hive > Issue Type: Bug > Components: Security, SQLStandardAuthorization >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > New configs that should go in the security whitelist have been added. The whitelist > needs updating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14028) stats is not updated
[ https://issues.apache.org/jira/browse/HIVE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14028: --- Status: Patch Available (was: Open) > stats is not updated > > > Key: HIVE-14028 > URL: https://issues.apache.org/jira/browse/HIVE-14028 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14028.01.patch > > > {code} > DROP TABLE users; > CREATE TABLE users(key string, state string, country string, country_id int) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > "hbase.columns.mapping" = "info:state,info:country,info:country_id" > ); > INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src; > desc formatted users; > {code} > the result is > {code} > A masked pattern was here > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"} > numFiles0 > numRows 0 > rawDataSize 0 > storage_handler > org.apache.hadoop.hive.hbase.HBaseStorageHandler > totalSize 0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14028) stats is not updated
[ https://issues.apache.org/jira/browse/HIVE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14028: --- Attachment: HIVE-14028.01.patch > stats is not updated > > > Key: HIVE-14028 > URL: https://issues.apache.org/jira/browse/HIVE-14028 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14028.01.patch > > > {code} > DROP TABLE users; > CREATE TABLE users(key string, state string, country string, country_id int) > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' > WITH SERDEPROPERTIES ( > "hbase.columns.mapping" = "info:state,info:country,info:country_id" > ); > INSERT OVERWRITE TABLE users SELECT 'user1', 'IA', 'USA', 0 FROM src; > desc formatted users; > {code} > the result is > {code} > A masked pattern was here > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"} > numFiles0 > numRows 0 > rawDataSize 0 > storage_handler > org.apache.hadoop.hive.hbase.HBaseStorageHandler > totalSize 0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions
[ https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saket Saurabh updated HIVE-14035: - Attachment: HIVE-14035.03.patch Added a few unit tests to test the merge algorithm for split-update ACID events. > Enable predicate pushdown to delta files created by ACID Transactions > - > > Key: HIVE-14035 > URL: https://issues.apache.org/jira/browse/HIVE-14035 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Saket Saurabh >Assignee: Saket Saurabh > Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, > HIVE-14035.patch > > > In current Hive version, delta files created by ACID transactions do not > allow predicate pushdown if they contain any update/delete events. This is > done to preserve correctness when following a multi-version approach during > event collapsing, where an update event overwrites an existing insert event. > This JIRA proposes to split an update event into a combination of a delete > event followed by a new insert event, that can enable predicate push down to > all delta files without breaking correctness. To support backward > compatibility for this feature, this JIRA also proposes to add some sort of > versioning to ACID that can allow different versions of ACID transactions to > co-exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
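The split-update idea in the HIVE-14035 description can be illustrated with a minimal sketch. The class, enum, and method names below are hypothetical, not the actual ACID event types in the patch:

```java
// Illustrative sketch: an UPDATE is rewritten as a DELETE of the old row
// followed by an INSERT of the new one, so delta files only ever contain
// insert/delete events and predicate pushdown stays safe.
import java.util.ArrayList;
import java.util.List;

public class SplitUpdate {
    enum EventType { INSERT, DELETE }

    static class AcidEvent {
        final EventType type;
        final long rowId;
        final String row;   // simplified row payload for illustration
        AcidEvent(EventType type, long rowId, String row) {
            this.type = type;
            this.rowId = rowId;
            this.row = row;
        }
    }

    /** Rewrite one update as a delete of the old version plus a fresh insert. */
    static List<AcidEvent> splitUpdate(long oldRowId, long newRowId, String newRow) {
        List<AcidEvent> events = new ArrayList<>();
        events.add(new AcidEvent(EventType.DELETE, oldRowId, null));
        events.add(new AcidEvent(EventType.INSERT, newRowId, newRow));
        return events;
    }
}
```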
[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions
[ https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saket Saurabh updated HIVE-14035: - Status: Patch Available (was: In Progress) > Enable predicate pushdown to delta files created by ACID Transactions > - > > Key: HIVE-14035 > URL: https://issues.apache.org/jira/browse/HIVE-14035 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Saket Saurabh >Assignee: Saket Saurabh > Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, > HIVE-14035.patch > > > In current Hive version, delta files created by ACID transactions do not > allow predicate pushdown if they contain any update/delete events. This is > done to preserve correctness when following a multi-version approach during > event collapsing, where an update event overwrites an existing insert event. > This JIRA proposes to split an update event into a combination of a delete > event followed by a new insert event, that can enable predicate push down to > all delta files without breaking correctness. To support backward > compatibility for this feature, this JIRA also proposes to add some sort of > versioning to ACID that can allow different versions of ACID transactions to > co-exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions
[ https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saket Saurabh updated HIVE-14035: - Status: In Progress (was: Patch Available) > Enable predicate pushdown to delta files created by ACID Transactions > - > > Key: HIVE-14035 > URL: https://issues.apache.org/jira/browse/HIVE-14035 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Saket Saurabh >Assignee: Saket Saurabh > Attachments: HIVE-14035.02.patch, HIVE-14035.patch > > > In current Hive version, delta files created by ACID transactions do not > allow predicate pushdown if they contain any update/delete events. This is > done to preserve correctness when following a multi-version approach during > event collapsing, where an update event overwrites an existing insert event. > This JIRA proposes to split an update event into a combination of a delete > event followed by a new insert event, that can enable predicate push down to > all delta files without breaking correctness. To support backward > compatibility for this feature, this JIRA also proposes to add some sort of > versioning to ACID that can allow different versions of ACID transactions to > co-exist together. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13380) Decimal should have lower precedence than double in type hierarchy
[ https://issues.apache.org/jira/browse/HIVE-13380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342986#comment-15342986 ] Sergey Shelukhin commented on HIVE-13380: - [~ashutoshc] I think I'm missing some background on this patch. Why exactly does it cause the issue with arithmetic on literals? Is it just the order of the types? Should we revert it for now from master too, and make a patch that combines this patch with correct decimal propagation for literals and ops? > Decimal should have lower precedence than double in type hierarchy > - > > Key: HIVE-13380 > URL: https://issues.apache.org/jira/browse/HIVE-13380 > Project: Hive > Issue Type: Bug > Components: Types >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13380.2.patch, HIVE-13380.4.patch, > HIVE-13380.5.patch, HIVE-13380.patch, decimal_filter.q > > > Currently it's the other way round. Also, decimal should be lower than float. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
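The ordering HIVE-13380 proposes can be illustrated with a toy precedence chain. This is a sketch of the idea under the stated assumption that common-type resolution picks the higher-ranked operand type; it is not Hive's actual type-resolution code:

```java
// Toy numeric precedence chain in which decimal ranks below float and
// double, so decimal + double promotes to double rather than to decimal.
import java.util.Arrays;
import java.util.List;

public class NumericPrecedence {
    // Low to high; position in this list is the precedence.
    static final List<String> CHAIN =
        Arrays.asList("tinyint", "smallint", "int", "bigint",
                      "decimal", "float", "double");

    /** Common type for an arithmetic op: the higher-ranked of the two operands. */
    static String commonType(String a, String b) {
        return CHAIN.indexOf(a) >= CHAIN.indexOf(b) ? a : b;
    }
}
```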
[jira] [Comment Edited] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342976#comment-15342976 ] Sergey Shelukhin edited comment on HIVE-14071 at 6/21/16 10:57 PM: --- This is the writer; this is the server pushing data to the client. If the server never pushes anything or disconnects, the client just sits around forever waiting for data. At least the close() call should be performed, so that the notification is sent to the client and/or the socket is closed. Otherwise the client doesn't know whether the server is still doing work or is done. So there's no way to avoid the pipeline. was (Author: sershe): This is the writer; this is the server pushing data to the client. If the server never pushes anything or disconnects, the client just sits around forever waiting for data. At least the close() call should be performed, so that the notification is sent to the client and/or the socket is closed. > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch, HIVE-14071.patch > > > Cannot avoid creating outputs when outputs are e.g. streaming -- This message was sent by Atlassian JIRA (v6.3.4#6332)
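The invariant described in this comment — the client blocks until the server either sends data or closes the stream — is easiest to guarantee with a try/finally around the writer. A minimal sketch (Python, purely illustrative; the actual code paths are in Hive's Java output writers, and the names here are hypothetical):

```python
def serve_results(writer, rows):
    """Push rows to the client, guaranteeing close() even when there is
    nothing to write or an error occurs, so the client sees end-of-stream
    instead of waiting forever."""
    try:
        for row in rows:
            writer.write(row)
    finally:
        # Without this, an empty result or a server-side failure leaves the
        # client unable to tell whether the server is still working or done.
        writer.close()
```

The finally block is what makes "no way to avoid the pipeline" true: even for an empty output, something must run to deliver the close notification.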
[jira] [Commented] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342976#comment-15342976 ] Sergey Shelukhin commented on HIVE-14071: - This is the writer; this is the server pushing data to the client. If the server never pushes anything or disconnects, the client just sits around forever waiting for data. At least the close() call should be performed, so that the notification is sent to the client and/or the socket is closed. > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch, HIVE-14071.patch > > > Cannot avoid creating outputs when outputs are e.g. streaming -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13930) upgrade Hive to latest Hadoop version
[ https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342971#comment-15342971 ] Sergey Shelukhin commented on HIVE-13930: - [~spena] spark-assembly-1.6.0-hadoop2.4.0.jar contains Hadoop classes (for 2.4, I presume) internally. It's hard to debug this, so I'm not sure if that is the culprit, but it seems like all other tests are passing. I'll take a look at encryption_move_tbl, which is one non-spark test that is failing and could expose some other issue. I'll take a look this week or next. > upgrade Hive to latest Hadoop version > - > > Key: HIVE-13930 > URL: https://issues.apache.org/jira/browse/HIVE-13930 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, > HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14069) update curator version to 2.10.0
[ https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-14069: - Status: Open (was: Patch Available) > update curator version to 2.10.0 > - > > Key: HIVE-14069 > URL: https://issues.apache.org/jira/browse/HIVE-14069 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Metastore >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14069.1.patch > > > curator-2.10.0 has several bug fixes over current version (2.6.0), updating > would help improve stability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14073) update config whiltelist for sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342967#comment-15342967 ] Thejas M Nair commented on HIVE-14073: -- [~sershe] [~sseth] Are the hive.llap configs expected to be modified by the individual end user, or are they considered 'hive admin' parameters that should be set by hive admins in hive config files? I am guessing it's the latter, and even if a user tries to set many of these params, they won't take effect as the LLAP daemons would already be running with config values from the files. Please confirm. > update config whiltelist for sql std authorization > --- > > Key: HIVE-14073 > URL: https://issues.apache.org/jira/browse/HIVE-14073 > Project: Hive > Issue Type: Bug > Components: Security, SQLStandardAuthorization >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > New configs that should go in security whitelist have been added. Whitelist > needs updating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14073) update config whiltelist for sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342964#comment-15342964 ] Thejas M Nair commented on HIVE-14073: -- New configs that have been added since HIVE-10678 that are candidates for the whitelist:
hive.strict.checks.large.query
hive.strict.checks.type.safety
hive.strict.checks.cartesian.product
hive.transpose.aggr.join
hive.mapjoin.optimized.hashtable.probe.percent
hive.groupby.limit.extrastep
hive.query.result.fileformat
hive.exec.schema.evolution
hive.exec.orc.base.delta.ratio
hive.orc.splits.ms.footer.cache.enabled
hive.orc.splits.ms.footer.cache.ppd.enabled
hive.orc.splits.directory.batch.ms
hive.orc.splits.include.fileid
hive.orc.splits.allow.synthetic.fileid
hive.orc.cache.use.soft.references
hive.optimize.dynamic.partition.hashjoin
hive.optimize.*
hive.stats.column.autogather
hive.stats.dbclass
hive.cli.tez.session.async
hive.vectorized.*
hive.support.special.characters.tablename
hive.tez.bucket.pruning
hive.tez.bucket.pruning.compat
hive.llap.* ?
> update config whiltelist for sql std authorization > --- > > Key: HIVE-14073 > URL: https://issues.apache.org/jira/browse/HIVE-14073 > Project: Hive > Issue Type: Bug > Components: Security, SQLStandardAuthorization >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > New configs that should go in security whitelist have been added. Whitelist > needs updating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
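The candidate list above mixes exact parameter names with wildcard entries such as hive.optimize.* and hive.llap.*. A hedged sketch of how such a whitelist check could behave (Python, illustrative only — Hive's actual whitelist is built from HiveConf in Java, and the entries below are just a subset taken from the list above):

```python
import fnmatch

# Subset of the candidate whitelist from the comment; wildcard entries use
# glob-style '*' to cover whole config families.
WHITELIST = [
    "hive.strict.checks.large.query",
    "hive.query.result.fileformat",
    "hive.optimize.*",
    "hive.vectorized.*",
    "hive.llap.*",
]

def is_modifiable(param: str) -> bool:
    """True if an end user may set this parameter at the session level."""
    return any(fnmatch.fnmatchcase(param, pattern) for pattern in WHITELIST)

assert is_modifiable("hive.optimize.dynamic.partition.hashjoin")
assert not is_modifiable("hive.security.authorization.manager")
```

Note that the `.` characters are literal in glob patterns, so `hive.llap.*` matches exactly the `hive.llap.` config family and nothing else.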
[jira] [Updated] (HIVE-14073) update config whiltelist for sql std authorization
[ https://issues.apache.org/jira/browse/HIVE-14073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-14073: - Description: New configs that should go in security whitelist have been added. Whitelist needs updating. > update config whiltelist for sql std authorization > --- > > Key: HIVE-14073 > URL: https://issues.apache.org/jira/browse/HIVE-14073 > Project: Hive > Issue Type: Bug > Components: Security, SQLStandardAuthorization >Affects Versions: 2.1.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > > New configs that should go in security whitelist have been added. Whitelist > needs updating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14045) (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method
[ https://issues.apache.org/jira/browse/HIVE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342887#comment-15342887 ] Jason Dere commented on HIVE-14045: --- +1, if the tests look ok. > (Vectorization) Add missing case for BINARY in > VectorizationContext.getNormalizedName method > > > Key: HIVE-14045 > URL: https://issues.apache.org/jira/browse/HIVE-14045 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: HIVE-14045.01.patch, HIVE-14045.02.patch, > HIVE-14045.03.patch > > > Missing case for BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342884#comment-15342884 ] Thejas M Nair commented on HIVE-14068: -- Can you also log the location from which the config XML was picked up? It would be very useful for debugging. > make more effort to find hive-site.xml > -- > > Key: HIVE-14068 > URL: https://issues.apache.org/jira/browse/HIVE-14068 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14068.patch > > > It pretty much doesn't make sense to run Hive w/o the config, so we should > make more effort to find one if it's missing on the classpath, or the > classloader does not return it for some reason (e.g. classloader ignores some > permission issues; explicitly looking for the file may expose them better) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14045) (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method
[ https://issues.apache.org/jira/browse/HIVE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-14045: Status: In Progress (was: Patch Available) > (Vectorization) Add missing case for BINARY in > VectorizationContext.getNormalizedName method > > > Key: HIVE-14045 > URL: https://issues.apache.org/jira/browse/HIVE-14045 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: HIVE-14045.01.patch, HIVE-14045.02.patch, > HIVE-14045.03.patch > > > Missing case for BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14045) (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method
[ https://issues.apache.org/jira/browse/HIVE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-14045: Attachment: HIVE-14045.03.patch > (Vectorization) Add missing case for BINARY in > VectorizationContext.getNormalizedName method > > > Key: HIVE-14045 > URL: https://issues.apache.org/jira/browse/HIVE-14045 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: HIVE-14045.01.patch, HIVE-14045.02.patch, > HIVE-14045.03.patch > > > Missing case for BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14045) (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method
[ https://issues.apache.org/jira/browse/HIVE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-14045: Status: Patch Available (was: In Progress) > (Vectorization) Add missing case for BINARY in > VectorizationContext.getNormalizedName method > > > Key: HIVE-14045 > URL: https://issues.apache.org/jira/browse/HIVE-14045 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: HIVE-14045.01.patch, HIVE-14045.02.patch, > HIVE-14045.03.patch > > > Missing case for BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14072) QueryIds reused across different queries
[ https://issues.apache.org/jira/browse/HIVE-14072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14072: -- Summary: QueryIds reused across different queries (was: SessionIds reused across different queries) > QueryIds reused across different queries > > > Key: HIVE-14072 > URL: https://issues.apache.org/jira/browse/HIVE-14072 > Project: Hive > Issue Type: Bug >Reporter: Siddharth Seth >Priority: Critical > > While testing HIVE-14023 and running TestMiniLlapCluster, query ids were > reused for the entire init script: 30+ different queries with the same queryId, > each a new Tez DAG submission for a different query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342782#comment-15342782 ] Pengcheng Xiong commented on HIVE-14071: [~sershe], thanks for your fast patch and response. So basically, when the file (to be streamed) is empty, your patch will still create the file (the streaming pipeline)? Is there any other way we can improve that so that we do not create the pipeline at all? If I understand correctly, the root cause of the problem is that the streaming client is writing data to the streaming server, but the server may wait infinitely when the client does not have anything to write (the file is empty). Is it possible to use a "push" model, where the client pushes data to the server, rather than the current "pull" model, where the server pulls data from the client? Thanks. > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch, HIVE-14071.patch > > > Cannot avoid creating outputs when outputs are e.g. streaming -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold
[ https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14021: --- Status: Open (was: Patch Available) > When converting to CNF, fail if the expression exceeds a threshold > -- > > Key: HIVE-14021 > URL: https://issues.apache.org/jira/browse/HIVE-14021 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: HIVE-14021.1.patch, HIVE-14021.patch > > > When converting to conjunctive normal form (CNF), fail if the expression > exceeds a threshold. CNF can explode exponentially in the size of the input > expression, but rarely does so in practice. Add a maxNodeCount parameter to > RexUtil.toCnf and throw or return null if it is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
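The safeguard described in this issue — abort CNF conversion once the result grows past a node budget, since CNF can blow up exponentially — can be sketched as follows. This is an illustrative Python version under assumed tuple-based expressions; the real change adds a maxNodeCount parameter to Calcite's RexUtil.toCnf in Java.

```python
class CnfTooLarge(Exception):
    """Raised when the converted expression exceeds the node budget."""

def count_nodes(e):
    """Count nodes in a tuple tree like ('and'|'or', lhs, rhs); leaves count as 1."""
    if isinstance(e, tuple):
        return 1 + sum(count_nodes(c) for c in e[1:])
    return 1

def to_cnf(e, max_nodes):
    """Convert ('and'|'or', lhs, rhs) trees to conjunctive normal form,
    failing fast if any intermediate result exceeds max_nodes."""
    def cnf(e):
        if not isinstance(e, tuple):
            return e
        op, a, b = e[0], cnf(e[1]), cnf(e[2])
        if op == 'and':
            out = ('and', a, b)
        elif isinstance(a, tuple) and a[0] == 'and':
            # Distribute OR over AND: (p AND q) OR r -> (p OR r) AND (q OR r)
            out = cnf(('and', ('or', a[1], b), ('or', a[2], b)))
        elif isinstance(b, tuple) and b[0] == 'and':
            out = cnf(('and', ('or', a, b[1]), ('or', a, b[2])))
        else:
            out = ('or', a, b)
        if count_nodes(out) > max_nodes:
            raise CnfTooLarge()
        return out
    return cnf(e)
```

With a generous budget the conversion succeeds; with a tight one it raises instead of exploding, which is exactly the throw-or-return-null behavior the description proposes.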
[jira] [Updated] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold
[ https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14021: --- Status: Patch Available (was: In Progress) > When converting to CNF, fail if the expression exceeds a threshold > -- > > Key: HIVE-14021 > URL: https://issues.apache.org/jira/browse/HIVE-14021 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: HIVE-14021.1.patch, HIVE-14021.patch > > > When converting to conjunctive normal form (CNF), fail if the expression > exceeds a threshold. CNF can explode exponentially in the size of the input > expression, but rarely does so in practice. Add a maxNodeCount parameter to > RexUtil.toCnf and throw or return null if it is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold
[ https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14021: --- Attachment: HIVE-14021.1.patch > When converting to CNF, fail if the expression exceeds a threshold > -- > > Key: HIVE-14021 > URL: https://issues.apache.org/jira/browse/HIVE-14021 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: HIVE-14021.1.patch, HIVE-14021.patch > > > When converting to conjunctive normal form (CNF), fail if the expression > exceeds a threshold. CNF can explode exponentially in the size of the input > expression, but rarely does so in practice. Add a maxNodeCount parameter to > RexUtil.toCnf and throw or return null if it is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-14021) When converting to CNF, fail if the expression exceeds a threshold
[ https://issues.apache.org/jira/browse/HIVE-14021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-14021 started by Jesus Camacho Rodriguez. -- > When converting to CNF, fail if the expression exceeds a threshold > -- > > Key: HIVE-14021 > URL: https://issues.apache.org/jira/browse/HIVE-14021 > Project: Hive > Issue Type: Improvement > Components: CBO >Affects Versions: 2.1.0, 2.2.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Attachments: HIVE-14021.1.patch, HIVE-14021.patch > > > When converting to conjunctive normal form (CNF), fail if the expression > exceeds a threshold. CNF can explode exponentially in the size of the input > expression, but rarely does so in practice. Add a maxNodeCount parameter to > RexUtil.toCnf and throw or return null if it is exceeded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7443) Fix HiveConnection to communicate with Kerberized Hive JDBC server and alternative JDKs
[ https://issues.apache.org/jira/browse/HIVE-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342752#comment-15342752 ] Hive QA commented on HIVE-7443: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12812169/HIVE-7443.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 31 failed/errored test(s), 10251 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd org.apache.hive.minikdc.TestHs2HooksWithMiniKdc.testHookContexts org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testConnection org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testIsValid org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testIsValidNeg org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNegativeProxyAuth org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testNegativeTokenAuth org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testProxyAuth org.apache.hive.minikdc.TestJdbcNonKrbSASLWithMiniKdc.testTokenAuth org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testConnection org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValid org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testIsValidNeg org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeProxyAuth org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testNegativeTokenAuth org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testProxyAuth 
org.apache.hive.minikdc.TestJdbcWithDBTokenStore.testTokenAuth org.apache.hive.minikdc.TestJdbcWithMiniKdc.testConnection org.apache.hive.minikdc.TestJdbcWithMiniKdc.testIsValid org.apache.hive.minikdc.TestJdbcWithMiniKdc.testIsValidNeg org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeProxyAuth org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth org.apache.hive.minikdc.TestJdbcWithMiniKdc.testProxyAuth org.apache.hive.minikdc.TestJdbcWithMiniKdc.testTokenAuth org.apache.hive.minikdc.TestJdbcWithMiniKdcCookie.testCookie org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary.testAuthorization1 org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthHttp.testAuthorization1 {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/210/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/210/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-210/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 31 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12812169 - PreCommit-HIVE-MASTER-Build > Fix HiveConnection to communicate with Kerberized Hive JDBC server and > alternative JDKs > --- > > Key: HIVE-7443 > URL: https://issues.apache.org/jira/browse/HIVE-7443 > Project: Hive > Issue Type: Bug > Components: JDBC, Security >Affects Versions: 0.12.0, 0.13.1 > Environment: Kerberos > Run Hive server2 and client with IBM JDK7.1 >Reporter: Yu Gao >Assignee: Aihua Xu > Attachments: HIVE-7443.2.patch, HIVE-7443.patch > > > Hive Kerberos authentication has been enabled in my cluster. 
I ran kinit to > initialize the current login user's ticket cache successfully, and then tried > to use beeline to connect to Hive Server2, but failed. After I manually added > some logging to catch the failure exception, this is what I got that caused > the failure: > beeline> !connect > jdbc:hive2://:1/default;principal=hive/@REALM.COM > org.apache.hive.jdbc.HiveDriver > scan complete in 2ms > Connecting to > jdbc:hive2://:1/default;principal=hive/@REALM.COM > Enter password for > jdbc:hive2://:1/default;principal=hive/@REALM.COM: > 14/07/17 15:12:45 ERROR jdbc.HiveConnection: Failed to open client transport > javax.security.sasl.SaslException: Failed to open client transport [Caused by > java.io.IOException: Could not instantiate SASL transport] > at > org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:78) >
[jira] [Commented] (HIVE-14052) Cleanup of structures required when LLAP access from external clients completes
[ https://issues.apache.org/jira/browse/HIVE-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342737#comment-15342737 ] Jason Dere commented on HIVE-14052: --- [~sershe] [~sseth] review? For your question about what happens if the cleanup completes just before trying to cancel: yeah, I guess the cancel would fail (as well as the submit). A couple of options here: - Just allow the cancel (and the fragment submission) to fail. Hopefully this is not a common case. - Add a DONE state here and have cancel() succeed if the cleanup is already done > Cleanup of structures required when LLAP access from external clients > completes > --- > > Key: HIVE-14052 > URL: https://issues.apache.org/jira/browse/HIVE-14052 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-14052.1.patch > > > Per [~sseth]: There's no cleanup at the moment, and structures used in LLAP > to track a query will keep building up slowly over time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
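The second option in the comment above — adding a DONE state so that cancel() on an already-cleaned-up fragment succeeds while a late submit fails — can be sketched as a small state machine. Python and hypothetical names only; the real tracking structures live in LLAP's Java code.

```python
from enum import Enum, auto
import threading

class FragmentState(Enum):
    SUBMITTED = auto()
    RUNNING = auto()
    DONE = auto()  # cleanup already completed

class FragmentTracker:
    """Hypothetical per-fragment tracker illustrating the DONE-state option."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = FragmentState.SUBMITTED

    def submit(self) -> bool:
        """Start running; fails if cleanup already removed the structures."""
        with self._lock:
            if self.state is FragmentState.DONE:
                return False  # too late: structures were already cleaned up
            self.state = FragmentState.RUNNING
            return True

    def complete(self):
        """Cleanup path: mark the fragment DONE so later cancels are no-ops."""
        with self._lock:
            self.state = FragmentState.DONE

    def cancel(self) -> bool:
        """Succeeds even if cleanup raced ahead: DONE already means terminated."""
        with self._lock:
            self.state = FragmentState.DONE
            return True
```

This makes the cleanup-vs-cancel race benign: whichever side loses the race still observes a terminated fragment rather than an error.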
[jira] [Updated] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14071: Attachment: HIVE-14071.patch > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch, HIVE-14071.patch > > > Cannot avoid creating outputs when outputs are e.g. streaming -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14071: Description: Cannot avoid creating outputs when outputs are e.g. streaming > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch > > > Cannot avoid creating outputs when outputs are e.g. streaming -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14071: Attachment: HIVE-14071.patch [~jdere] [~pxiong] can you take a look? thanks > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14071) HIVE-14014 breaks non-file outputs
[ https://issues.apache.org/jira/browse/HIVE-14071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14071: Status: Patch Available (was: Open) > HIVE-14014 breaks non-file outputs > -- > > Key: HIVE-14071 > URL: https://issues.apache.org/jira/browse/HIVE-14071 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14071.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-3939) INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not all lowercase
[ https://issues.apache.org/jira/browse/HIVE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342662#comment-15342662 ] Xiao Yu edited comment on HIVE-3939 at 6/21/16 9:06 PM: [~navis] It appears this was a regression. I'm currently on version 1.1.0 and can still reproduce this issue on both partitioned and non-partitioned tables. Issuing a `USE database;` and then using non-database-prefixed table names works. was (Author: xyu): [~navis] It appears this was fixed by HIVE-3465 for partitioned tables; however, I can still reproduce the `database.table` overwriting problem with a non-partitioned table. (Version 1.1.0) > INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not > all lowercase > - > > Key: HIVE-3939 > URL: https://issues.apache.org/jira/browse/HIVE-3939 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 0.9.0 > Environment: Windows 2012, HDInsight >Reporter: mohan dharmarajan > > If table referred does not use all lowercase in INSERT INTO command, the data > is not appended but overwritten. > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.exec.dynamic.partition=true; > CREATE TABLE test (key int, value string) PARTITIONED BY (ds string); > SELECT * FROM test; > INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following statement works as expected. The data from src is appended to > test > SELECT * FROM test; > INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following is copied from the processing log > Loading data to table default.test partition (ds=null) > Loading partition {ds=1} > Loading partition {ds=2} > The following statement does not work. Note the table name referred as Test > (not test). 
INSERT INTO behaves like INSERT OVERWRITE > SELECT * FROM test; > INSERT INTO TABLE Test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following is copied from the processing log > Loading data to table default.test partition (ds=null) > Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=1 > Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=2 > Loading partition {ds=1} > Loading partition {ds=2} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14041) llap scripts add hadoop and other libraries from the machine local install to the daemon classpath
[ https://issues.apache.org/jira/browse/HIVE-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342695#comment-15342695 ] Gopal V commented on HIVE-14041: LGTM - +1. > llap scripts add hadoop and other libraries from the machine local install to > the daemon classpath > -- > > Key: HIVE-14041 > URL: https://issues.apache.org/jira/browse/HIVE-14041 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14041.01.patch > > > `hadoop classpath` ends up getting added to the classpath of llap daemons. > This essentially means picking up the classpath from the local deploy. > This isn't required since the slider package includes relevant libraries > (shipped from the client) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342688#comment-15342688 ] Ashutosh Chauhan commented on HIVE-14068: - +1 > make more effort to find hive-site.xml > -- > > Key: HIVE-14068 > URL: https://issues.apache.org/jira/browse/HIVE-14068 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14068.patch > > > It pretty much doesn't make sense to run Hive w/o the config, so we should > make more effort to find one if it's missing on the classpath, or the > classloader does not return it for some reason (e.g. classloader ignores some > permission issues; explicitly looking for the file may expose them better) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342673#comment-15342673 ] Sergey Shelukhin commented on HIVE-14068: - It's a known system variable > make more effort to find hive-site.xml > -- > > Key: HIVE-14068 > URL: https://issues.apache.org/jira/browse/HIVE-14068 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14068.patch > > > It pretty much doesn't make sense to run Hive w/o the config, so we should > make more effort to find one if it's missing on the classpath, or the > classloader does not return it for some reason (e.g. classloader ignores some > permission issues; explicitly looking for the file may expose them better) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3939) INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not all lowercase
[ https://issues.apache.org/jira/browse/HIVE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342662#comment-15342662 ] Xiao Yu commented on HIVE-3939: --- [~navis] It appears this was fixed by HIVE-3465 for partitioned tables however I can still reproduce the `database.table` overwriting problem with a non-partitioned table. (Version 1.1.0) > INSERT INTO behaves like INSERT OVERWRITE if the table name referred is not > all lowercase > - > > Key: HIVE-3939 > URL: https://issues.apache.org/jira/browse/HIVE-3939 > Project: Hive > Issue Type: Bug > Components: Database/Schema >Affects Versions: 0.9.0 > Environment: Windows 2012, HDInsight >Reporter: mohan dharmarajan > > If table referred does not use all lowercase in INSERT INTO command, the data > is not appended but overwritten. > set hive.exec.dynamic.partition.mode=nonstrict; > set hive.exec.dynamic.partition=true; > CREATE TABLE test (key int, value string) PARTITIONED BY (ds string); > SELECT * FROM test; > INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following statement works as expected. The data from src is appended to > test > SELECT * FROM test; > INSERT INTO TABLE test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following is copied from the processing log > Loading data to table default.test partition (ds=null) > Loading partition {ds=1} > Loading partition {ds=2} > The following statement does not work. Note the table name referred as Test > (not test). 
INSERT INTO behaves like INSERT OVERWRITE > SELECT * FROM test; > INSERT INTO TABLE Test PARTITION (ds) SELECT key, value, value FROM src; > SELECT * FROM test; > The following is copied from the processing log > Loading data to table default.test partition (ds=null) > Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=1 > Moved to trash: hdfs://localhost:8020/hive/warehouse/test/ds=2 > Loading partition {ds=1} > Loading partition {ds=2} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342661#comment-15342661 ] Ashutosh Chauhan commented on HIVE-14068: - Is HIVE_CONF_DIR a documented system variable or is it you are introducing here for the first time? Looks good to me. [~thejas] you may also want to know about this change. > make more effort to find hive-site.xml > -- > > Key: HIVE-14068 > URL: https://issues.apache.org/jira/browse/HIVE-14068 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14068.patch > > > It pretty much doesn't make sense to run Hive w/o the config, so we should > make more effort to find one if it's missing on the classpath, or the > classloader does not return it for some reason (e.g. classloader ignores some > permission issues; explicitly looking for the file may expose them better) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Attachment: (was: proposal.pdf) > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Attachment: proposal.pdf > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14040) insert overwrite for HBase doesn't overwrite
[ https://issues.apache.org/jira/browse/HIVE-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342582#comment-15342582 ] Pengcheng Xiong commented on HIVE-14040: Sure, we would like to know if you agree with the idea of disallowing users from creating a partitioned table with HBaseHandler. > insert overwrite for HBase doesn't overwrite > > > Key: HIVE-14040 > URL: https://issues.apache.org/jira/browse/HIVE-14040 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Creating a table and doing insert overwrite twice with two different rows > (for example) results in the table with both rows, rather than only one as > per "overwrite" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14040) insert overwrite for HBase doesn't overwrite
[ https://issues.apache.org/jira/browse/HIVE-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342570#comment-15342570 ] Sergey Shelukhin commented on HIVE-14040: - That seems to be a separate issue that would require a separate JIRA ;) > insert overwrite for HBase doesn't overwrite > > > Key: HIVE-14040 > URL: https://issues.apache.org/jira/browse/HIVE-14040 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Creating a table and doing insert overwrite twice with two different rows > (for example) results in the table with both rows, rather than only one as > per "overwrite" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14040) insert overwrite for HBase doesn't overwrite
[ https://issues.apache.org/jira/browse/HIVE-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342541#comment-15342541 ] Pengcheng Xiong commented on HIVE-14040: [~ashutoshc]'s suggestion is to disallow users from creating a partitioned table with HBaseHandler. > insert overwrite for HBase doesn't overwrite > > > Key: HIVE-14040 > URL: https://issues.apache.org/jira/browse/HIVE-14040 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Creating a table and doing insert overwrite twice with two different rows > (for example) results in the table with both rows, rather than only one as > per "overwrite" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14040) insert overwrite for HBase doesn't overwrite
[ https://issues.apache.org/jira/browse/HIVE-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342537#comment-15342537 ] Pengcheng Xiong commented on HIVE-14040: Here is some other finding. It seems that we allow create partition table with HBaseHandler but insert overwrite does not work. {code} set hive.exec.dynamic.partition.mode=nonstrict; set hive.exec.max.dynamic.partitions.pernode=10; DROP TABLE users; CREATE TABLE users(k string, state string, country string) partitioned by(key string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = "info:state,info:country" ); INSERT OVERWRITE TABLE users partition(key) SELECT 'user1', 'IA', 'USA', key FROM src; desc formatted users; select * from users; {code} the output {code} POSTHOOK: query: desc formatted users POSTHOOK: type: DESCTABLE POSTHOOK: Input: default@users # col_name data_type comment k string state string country string # Partition Information # col_name data_type comment key string # Detailed Table Information Database: default A masked pattern was here Retention: 0 A masked pattern was here Table Type: MANAGED_TABLE Table Parameters: storage_handler org.apache.hadoop.hive.hbase.HBaseStorageHandler A masked pattern was here # Storage Information SerDe Library: org.apache.hadoop.hive.hbase.HBaseSerDe InputFormat:null OutputFormat: null Compressed: No Num Buckets:-1 Bucket Columns: [] Sort Columns: [] Storage Desc Params: hbase.columns.mapping info:state,info:country serialization.format1 PREHOOK: query: select * from users PREHOOK: type: QUERY PREHOOK: Input: default@users A masked pattern was here POSTHOOK: query: select * from users POSTHOOK: type: QUERY POSTHOOK: Input: default@users A masked pattern was here {code} > insert overwrite for HBase doesn't overwrite > > > Key: HIVE-14040 > URL: https://issues.apache.org/jira/browse/HIVE-14040 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey 
Shelukhin >Assignee: Sergey Shelukhin > > Creating a table and doing insert overwrite twice with two different rows > (for example) results in the table with both rows, rather than only one as > per "overwrite" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14038) miscellaneous acid improvements
[ https://issues.apache.org/jira/browse/HIVE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-14038: -- Resolution: Fixed Fix Version/s: 2.2.0 1.3.0 Status: Resolved (was: Patch Available) committed to branch-1 and master thanks Wei for the review > miscellaneous acid improvements > --- > > Key: HIVE-14038 > URL: https://issues.apache.org/jira/browse/HIVE-14038 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Fix For: 1.3.0, 2.2.0 > > Attachments: HIVE-14038.2.patch, HIVE-14038.3.patch, > HIVE-14038.8.patch, HIVE-14038.patch > > > 1. fix thread name inHouseKeeperServiceBase (currently they are all > "org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0") > 2. dump metastore configs from HiveConf on start up to help record values of > properties > 3. add some tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14045) (Vectorization) Add missing case for BINARY in VectorizationContext.getNormalizedName method
[ https://issues.apache.org/jira/browse/HIVE-14045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342499#comment-15342499 ] Hive QA commented on HIVE-14045: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12812091/HIVE-14045.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10250 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_casts org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_casts org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/209/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/209/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-209/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12812091 - PreCommit-HIVE-MASTER-Build > (Vectorization) Add missing case for BINARY in > VectorizationContext.getNormalizedName method > > > Key: HIVE-14045 > URL: https://issues.apache.org/jira/browse/HIVE-14045 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline > Fix For: 2.2.0 > > Attachments: HIVE-14045.01.patch, HIVE-14045.02.patch > > > Missing case for BINARY data type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14069) update curator version to 2.10.0
[ https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342494#comment-15342494 ] Sergey Shelukhin commented on HIVE-14069: - https://issues.apache.org/jira/browse/HIVE-13930 is the Hadoop JIRA. As for lack of compat, I am not sure if that's backward compat issue in binaries in Curator, or just a consequence of having multiple jar versions (because of the difference between Hive and YARN); we've seen an error like this: {noformat}Caused by: java.lang.NoSuchMethodError: org.apache.curator.framework.recipes.shared.SharedCount.getVersionedValue()Lorg/apache/curator/framework/recipes/shared/VersionedValue; {noformat} > update curator version to 2.10.0 > - > > Key: HIVE-14069 > URL: https://issues.apache.org/jira/browse/HIVE-14069 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Metastore >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14069.1.patch > > > curator-2.10.0 has several bug fixes over current version (2.6.0), updating > would help improve stability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14069) update curator version to 2.10.0
[ https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342494#comment-15342494 ] Sergey Shelukhin edited comment on HIVE-14069 at 6/21/16 7:23 PM: -- HIVE-13930 is the Hadoop JIRA. As for lack of compat, I am not sure if that's backward compat issue in binaries in Curator, or just a consequence of having multiple jar versions (because of the difference between Hive and YARN); we've seen an error like this: {noformat}Caused by: java.lang.NoSuchMethodError: org.apache.curator.framework.recipes.shared.SharedCount.getVersionedValue()Lorg/apache/curator/framework/recipes/shared/VersionedValue; {noformat} was (Author: sershe): https://issues.apache.org/jira/browse/HIVE-13930 is the Hadoop JIRA. As for lack of compat, I am not sure if that's backward compat issue in binaries in Curator, or just a consequence of having multiple jar versions (because of the difference between Hive and YARN); we've seen an error like this: {noformat}Caused by: java.lang.NoSuchMethodError: org.apache.curator.framework.recipes.shared.SharedCount.getVersionedValue()Lorg/apache/curator/framework/recipes/shared/VersionedValue; {noformat} > update curator version to 2.10.0 > - > > Key: HIVE-14069 > URL: https://issues.apache.org/jira/browse/HIVE-14069 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Metastore >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14069.1.patch > > > curator-2.10.0 has several bug fixes over current version (2.6.0), updating > would help improve stability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342489#comment-15342489 ] Thejas M Nair commented on HIVE-14070: -- Are you saying that the call is not actually needed and you are removing it? > hive.tez.exec.print.summary=true returns wrong results on HS2 > - > > Key: HIVE-14070 > URL: https://issues.apache.org/jira/browse/HIVE-14070 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14070.01.patch > > > On master, we have > {code} > Query Execution Summary > -- > OPERATIONDURATION > -- > Compile Query -1466208820.74s > Prepare Plan0.00s > Submit Plan 1466208825.50s > Start DAG 0.26s > Run DAG 4.39s > -- > Task Execution Summary > -- > VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS > OUTPUT_RECORDS > -- > Map 11014.00 1,534 11 1,500 > 1 > Reducer 2 96.00 5410 1 > 0 > -- > {code} > sounds like a real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14063) beeline to auto connect to the HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342487#comment-15342487 ] Vihang Karajgaonkar commented on HIVE-14063: I think beeline.properties should ideally be the right name for this but since we already have a BeeLine.properties file which is used by beeline ResourceBundle it causes exceptions like below if we have beeline.properties {code} java.util.MissingResourceException: Can't find resource for bundle java.util.PropertyResourceBundle, key help-addlocaldrivername at java.util.ResourceBundle.getObject(ResourceBundle.java:395) at java.util.ResourceBundle.getString(ResourceBundle.java:355) at org.apache.hive.beeline.BeeLine.loc(BeeLine.java:470) at org.apache.hive.beeline.BeeLine.loc(BeeLine.java:447) at org.apache.hive.beeline.ReflectiveCommandHandler.(ReflectiveCommandHandler.java:45) at org.apache.hive.beeline.BeeLine.(BeeLine.java:176) at org.apache.hive.beeline.BeeLine.(BeeLine.java:519) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:509) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:493) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} Hence the other good name for this file could be beeline.conf instead of beeline-default.properties. 
> beeline to auto connect to the HiveServer2 > -- > > Key: HIVE-14063 > URL: https://issues.apache.org/jira/browse/HIVE-14063 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > Currently one has to give an jdbc:hive2 url in order for Beeline to connect a > hiveserver2 instance. It would be great if Beeline can get the info somehow > (from a properties file at a well-known location?) and connect automatically > if user doesn't specify such a url. If the properties file is not present, > then beeline would expect user to provide the url and credentials using > !connect or ./beeline -u .. commands > While Beeline is flexible (being a mere JDBC client), most environments would > have just a single HS2. Having users to manually connect into this via either > "beeline ~/.propsfile" or -u or !connect statements is lowering the > experience part. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14069) update curator version to 2.10.0
[ https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342481#comment-15342481 ] Thejas M Nair commented on HIVE-14069: -- [~sershe] Thanks for the inputs. I wasn't aware of any incompatibilities, my quick review of the [curator release notes|https://cwiki.apache.org/confluence/display/CURATOR/Releases] didn't suggest any incompatible changes. Do you have more information about the 2.6 vs 2.7 incompatibility ? Also, I couldn't find the jira you refer to, do you have that ? Searching hive jira curator returned too many issues with curator in the build warnings. > update curator version to 2.10.0 > - > > Key: HIVE-14069 > URL: https://issues.apache.org/jira/browse/HIVE-14069 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, Metastore >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-14069.1.patch > > > curator-2.10.0 has several bug fixes over current version (2.6.0), updating > would help improve stability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14040) insert overwrite for HBase doesn't overwrite
[ https://issues.apache.org/jira/browse/HIVE-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-14040: --- Assignee: Sergey Shelukhin > insert overwrite for HBase doesn't overwrite > > > Key: HIVE-14040 > URL: https://issues.apache.org/jira/browse/HIVE-14040 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > Creating a table and doing insert overwrite twice with two different rows > (for example) results in the table with both rows, rather than only one as > per "overwrite" -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342471#comment-15342471 ] Sergey Shelukhin commented on HIVE-14055: - Hmm... the thing is that not being able to push down a filter is not an error, it's an expected condition - most filters cannot be pushed down here, only a narrow set of filters is supported. I could separate it into two functions, one returning boolean, and the other non-nullable, but an exception shouldn't be used in this case. > directSql - getting the number of partitions is broken > -- > > Key: HIVE-14055 > URL: https://issues.apache.org/jira/browse/HIVE-14055 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14055.01.patch, HIVE-14055.patch > > > Noticed while looking at something else. If the filter cannot be pushed down > it just returns 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
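The two-function split suggested in the comment above — a boolean probe for "can this filter be pushed down?" plus a getter that is only legal once the probe succeeds — can be sketched as follows. This is an illustrative toy, not the actual MetaStoreDirectSql code; every class and method name here is hypothetical:

```java
// Sketch of the "boolean probe + non-nullable getter" pattern from the
// discussion: an unsupported filter is an expected condition, so neither
// null nor an exception is needed to signal it. All names are hypothetical.
public class DirectSqlSketch {

    // Hypothetical stand-in for a partition filter expression.
    static final class Filter {
        final boolean simpleEquality; // only trivial filters are "pushable" here
        Filter(boolean simpleEquality) { this.simpleEquality = simpleEquality; }
    }

    // Probe: returns a boolean because "not pushable" is not an error.
    static boolean canPushDown(Filter f) {
        return f.simpleEquality;
    }

    // Non-nullable result; callers must check canPushDown() first.
    static int getNumPartitionsViaSql(Filter f) {
        if (!canPushDown(f)) {
            throw new IllegalStateException("call canPushDown() first");
        }
        return 42; // pretend result of a direct-SQL count
    }

    public static void main(String[] args) {
        Filter pushable = new Filter(true);
        Filter complex = new Filter(false);
        if (canPushDown(pushable)) {
            System.out.println(getNumPartitionsViaSql(pushable));
        }
        // Instead of silently returning 0, unsupported filters fall back to ORM.
        if (!canPushDown(complex)) {
            System.out.println("fallback-to-ORM");
        }
    }
}
```

The point of the split is that the caller can distinguish "filter not supported by direct SQL" (take the ORM path) from a genuine failure, without overloading 0 or null.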
[jira] [Commented] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342466#comment-15342466 ] Sergio Peña commented on HIVE-14055: Hey Sergey, sorry I am still against the null value. I was thinking that if a filter push down couldn't be done, then the method should throw an exception with the reason of that. While I was running some tests with HIVE-14063, I did a test you mentioned about falling-back to ORM if direct-sql fails, and ORM also failed but with an exception returned by {{getNumPartitionsViaOrmFilter}} due to problems with the filter. In this scenario, I could know exactly what the problem was. In my opinion, I think that we should throw a checked exception in case an internal error happens, and let the caller decide what to do with it. [~ashutoshc] [~sushanth] what do you think about this? I'd like to know other opinions. > directSql - getting the number of partitions is broken > -- > > Key: HIVE-14055 > URL: https://issues.apache.org/jira/browse/HIVE-14055 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14055.01.patch, HIVE-14055.patch > > > Noticed while looking at something else. If the filter cannot be pushed down > it just returns 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14055) directSql - getting the number of partitions is broken
[ https://issues.apache.org/jira/browse/HIVE-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342454#comment-15342454 ] Sergey Shelukhin commented on HIVE-14055: - Test failures appear unrelated (hashmap ordering) or known. > directSql - getting the number of partitions is broken > -- > > Key: HIVE-14055 > URL: https://issues.apache.org/jira/browse/HIVE-14055 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14055.01.patch, HIVE-14055.patch > > > Noticed while looking at something else. If the filter cannot be pushed down > it just returns 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries fetching more than a configured number of partitions in PartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342416#comment-15342416 ] Sergio Peña commented on HIVE-13884: Thanks [~sershe]. [~mohitsabharwal] [~brocknoland] I run a test with 10K partitions {{select * from table12 where dt < 1}} with the variable enabled and disabled. There's not too much difference. I got a difference of 1 second, and I tested it 5 times each time, even without the patch applied. I think we are good to go for this. I'll wait until HIVE-14055 is fixed as I would need to change this patch as well. > Disallow queries fetching more than a configured number of partitions in > PartitionPruner > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, > HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, HIVE-13884.6.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. 
> One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
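The two-phase option described above — fetch the cheap partition names first, reject the query if the count exceeds the configured limit, and only then fetch the full partition specs — can be sketched as a toy. The method names and the limit constant are hypothetical, not Hive's actual PartitionPruner API:

```java
import java.util.Arrays;
import java.util.List;

// Toy sketch of the proposed two-phase partition check; names are hypothetical.
public class PrunerSketch {
    static final int MAX_PARTITIONS = 2; // stand-in for the proposed config limit

    // Phase 1: cheap metastore call that returns only partition names.
    static List<String> listPartitionNames(String filter) {
        // Fake data: a range filter matches many partitions, equality matches one.
        return filter.contains("<")
            ? Arrays.asList("dt=1", "dt=2", "dt=3")
            : Arrays.asList("dt=1");
    }

    // Phase 2: expensive call, made only after the count is known to be safe.
    static List<String> getPartitionSpecs(List<String> names) {
        return names; // pretend these are full partition specs
    }

    static List<String> prune(String filter) {
        List<String> names = listPartitionNames(filter);
        if (names.size() > MAX_PARTITIONS) {
            throw new IllegalStateException("query would fetch " + names.size()
                + " partitions, limit is " + MAX_PARTITIONS);
        }
        return getPartitionSpecs(names);
    }

    public static void main(String[] args) {
        System.out.println(prune("dt = 1").size()); // within the limit
        try {
            prune("dt < 4"); // over the limit: rejected before specs are fetched
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The key property is that the expensive spec fetch never happens for an over-limit query, which is what protects HMS memory.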
[jira] [Commented] (HIVE-14038) miscellaneous acid improvements
[ https://issues.apache.org/jira/browse/HIVE-14038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342412#comment-15342412 ] Wei Zheng commented on HIVE-14038: -- +1 > miscellaneous acid improvements > --- > > Key: HIVE-14038 > URL: https://issues.apache.org/jira/browse/HIVE-14038 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-14038.2.patch, HIVE-14038.3.patch, > HIVE-14038.8.patch, HIVE-14038.patch > > > 1. fix thread name inHouseKeeperServiceBase (currently they are all > "org.apache.hadoop.hive.ql.txn.compactor.HouseKeeperServiceBase$1-0") > 2. dump metastore configs from HiveConf on start up to help record values of > properties > 3. add some tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14057) Add an option in llapstatus to generate output to a file
[ https://issues.apache.org/jira/browse/HIVE-14057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342398#comment-15342398 ] Sergey Shelukhin commented on HIVE-14057: - +1 > Add an option in llapstatus to generate output to a file > > > Key: HIVE-14057 > URL: https://issues.apache.org/jira/browse/HIVE-14057 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14057.01.patch, HIVE-14057.02.patch, > HIVE-14057.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14068) make more effort to find hive-site.xml
[ https://issues.apache.org/jira/browse/HIVE-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342394#comment-15342394 ] Sergey Shelukhin commented on HIVE-14068: - Most failures are known; TestJdbcWithMiniLlap is broken by some other JIRA > make more effort to find hive-site.xml > -- > > Key: HIVE-14068 > URL: https://issues.apache.org/jira/browse/HIVE-14068 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14068.patch > > > It pretty much doesn't make sense to run Hive w/o the config, so we should > make more effort to find one if it's missing on the classpath, or the > classloader does not return it for some reason (e.g. classloader ignores some > permission issues; explicitly looking for the file may expose them better) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14014) zero length file is being created for empty bucket in tez mode (II)
[ https://issues.apache.org/jira/browse/HIVE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342386#comment-15342386 ] Sergey Shelukhin commented on HIVE-14014: - This patch causes TestJdbcWithMiniLlap to get stuck, and it has failed for a while in HiveQA here. I wonder why this was committed? > zero length file is being created for empty bucket in tez mode (II) > --- > > Key: HIVE-14014 > URL: https://issues.apache.org/jira/browse/HIVE-14014 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.2.0 > > Attachments: HIVE-14014.01.patch, HIVE-14014.02.patch, > HIVE-14014.03.patch, HIVE-14014.04.patch > > > The same problem happens when source table is not empty, e.g,, when "limit 0" > is not there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14057) Add an option in llapstatus to generate output to a file
[ https://issues.apache.org/jira/browse/HIVE-14057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14057: -- Attachment: HIVE-14057.03.patch Updated patch with RB comments fixed. In terms of a remote host - that's a good point. I suppose it's possible to set up transfer of output after the command runs. (e.g. ptest does this). Even on the local system - this can go wrong. Again with a MOTD setup in /etc/motd. When login is invoked, this gets printed. If a service runs this command as a different user than the service user - a login will likely be invoked, and the output gets messed up. > Add an option in llapstatus to generate output to a file > > > Key: HIVE-14057 > URL: https://issues.apache.org/jira/browse/HIVE-14057 > Project: Hive > Issue Type: Improvement >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14057.01.patch, HIVE-14057.02.patch, > HIVE-14057.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)

[jira] [Commented] (HIVE-13966) DbNotificationListener: can loose DDL operation notifications
[ https://issues.apache.org/jira/browse/HIVE-13966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342362#comment-15342362 ] Rahul Sharma commented on HIVE-13966: - Can a listener be registered with multiple properties, for ex: can DBNotification listener be part of hive.metastore.synchronous.event.listeners(proposed above) and hive.metastore.event.listeners . If yes, should there be a check to not add the notification log again? > DbNotificationListener: can loose DDL operation notifications > - > > Key: HIVE-13966 > URL: https://issues.apache.org/jira/browse/HIVE-13966 > Project: Hive > Issue Type: Bug > Components: HCatalog >Reporter: Nachiket Vaidya >Assignee: Rahul Sharma >Priority: Critical > > The code for each API in HiveMetaStore.java is like this: > 1. openTransaction() > 2. -- operation-- > 3. commit() or rollback() based on result of the operation. > 4. add entry to notification log (unconditionally) > If the operation is failed (in step 2), we still add entry to notification > log. Found this issue in testing. > It is still ok as this is the case of false positive. > If the operation is successful and adding to notification log failed, the > user will get an MetaException. It will not rollback the operation, as it is > already committed. We need to handle this case so that we will not have false > negatives. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
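The four-step pattern quoted in the issue description, and the false positive it produces, can be sketched as a toy; this is not the real HiveMetaStore/DbNotificationListener code, just an illustration of the ordering bug:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the ordering described in the issue: the notification log
// entry is added unconditionally, after the commit-or-rollback decision.
public class NotificationOrderingSketch {
    final List<String> notificationLog = new ArrayList<>();
    boolean committed;

    void runDdl(boolean operationSucceeds) {
        // 1. openTransaction()
        // 2. -- the DDL operation itself --
        // 3. commit() or rollback() based on the result of the operation
        committed = operationSucceeds;
        // 4. add entry to the notification log (unconditionally) -- the bug:
        //    a rolled-back DDL still produces a notification.
        notificationLog.add("CREATE_TABLE");
    }

    public static void main(String[] args) {
        NotificationOrderingSketch ms = new NotificationOrderingSketch();
        ms.runDdl(false); // operation fails and is rolled back...
        // ...yet a notification was still emitted: a false positive.
        System.out.println(ms.committed + " " + ms.notificationLog.size());
    }
}
```

The reverse failure mode from the issue — operation committed but the notification write fails — is the false negative that cannot be rolled back, which is why the ordering needs explicit handling.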
[jira] [Comment Edited] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342355#comment-15342355 ] Pengcheng Xiong edited comment on HIVE-14070 at 6/21/16 6:11 PM: - Originally, I used SessionState.getPerfLogger().setPerfLogger(parentPerfLogger); then Eclipse asked me to refactor that to this version. was (Author: pxiong): Originally, I used SessionState.getPerfLogger().setPerfLogger(parentPerfLogger); then Java asked me to refactor that to this version. > hive.tez.exec.print.summary=true returns wrong results on HS2 > - > > Key: HIVE-14070 > URL: https://issues.apache.org/jira/browse/HIVE-14070 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14070.01.patch > > > On master, we have > {code} > Query Execution Summary > -- > OPERATIONDURATION > -- > Compile Query -1466208820.74s > Prepare Plan0.00s > Submit Plan 1466208825.50s > Start DAG 0.26s > Run DAG 4.39s > -- > Task Execution Summary > -- > VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS > OUTPUT_RECORDS > -- > Map 11014.00 1,534 11 1,500 > 1 > Reducer 2 96.00 5410 1 > 0 > -- > {code} > sounds like a real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14063) beeline to auto connect to the HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342345#comment-15342345 ] Vihang Karajgaonkar commented on HIVE-14063: I am thinking of adding a file in HIVE_CONF called beeline-default.properties.template. This file will have comments explaining the various properties that can be added; all properties in the template will be commented out by default. The user can copy this template and create beeline-default.properties in the $HIVE_CONF location (beeline.properties is already present in the code base and we cannot reuse that name since it causes class-load-order issues). The user can then modify the copied file, uncommenting properties and providing their values as needed, such as url, user, and password (principal in a Kerberos-enabled environment). When beeline starts, as part of its initialization it will look for the beeline-default.properties file in the $HIVE_CONF location. If the file is found, beeline will parse it for the default url, user, and password and attempt to connect to the HS2 using the information provided. If the user decides not to provide a password in the file, beeline will ask for it when attempting to connect to the default url. The user can opt out of this auto-connect feature by simply not providing beeline-default.properties: if the file is not found, beeline defaults to the current behavior, where the user is expected to use !connect or other commands to initiate the connection. If the user needs to connect to a different url, the existing mechanisms still work: pass it with beeline -u, or create another property file as before (its location is not bound to $HIVE_CONF) and use beeline --property-file to initiate the connection. 
The design gives the user a place for default connection information (probably common to most use cases) while not limiting Beeline's flexibility to connect to other urls. [~spena], [~xuefuz] [~dlo] [~sircodesalot] [~aihuaxu] please review the design approach above and let me know if you can think of any problems. Anyone else interested, please feel free to chime in. Thanks! > beeline to auto connect to the HiveServer2 > -- > > Key: HIVE-14063 > URL: https://issues.apache.org/jira/browse/HIVE-14063 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > > Currently one has to give a jdbc:hive2 url in order for Beeline to connect to a > hiveserver2 instance. It would be great if Beeline could get the info somehow > (from a properties file at a well-known location?) and connect automatically > if the user doesn't specify such a url. If the properties file is not present, > beeline would expect the user to provide the url and credentials using > !connect or ./beeline -u .. commands. > While Beeline is flexible (being a mere JDBC client), most environments have > just a single HS2. Having users manually connect to it via either > "beeline ~/.propsfile" or -u or !connect statements degrades the > experience. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
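To make the proposal above concrete, here is a hypothetical example of what a beeline-default.properties file might contain. The property keys below are purely illustrative (the actual key names would be settled in the patch); only the file name and $HIVE_CONF location come from the proposal itself:

```properties
# Illustrative sketch only -- real property names to be defined by the patch.
# Default JDBC url for the single HS2 in this environment.
beeline.hs2.jdbc.url.default=jdbc:hive2://hs2-host.example.com:10000/default
# Credentials; leave the password commented out to be prompted at connect time.
beeline.hs2.connection.user=hive
#beeline.hs2.connection.password=
# In a Kerberos-enabled environment, a principal would be used instead:
#beeline.hs2.connection.principal=hive/_HOST@EXAMPLE.COM
```

With a file like this present in $HIVE_CONF, starting beeline would attempt the connection automatically; deleting or renaming the file restores today's !connect / -u behavior.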
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342355#comment-15342355 ] Pengcheng Xiong commented on HIVE-14070: Originally, I used SessionState.getPerfLogger().setPerfLogger(parentPerfLogger); then Java asked me to refactor that to this version. > hive.tez.exec.print.summary=true returns wrong results on HS2 > - > > Key: HIVE-14070 > URL: https://issues.apache.org/jira/browse/HIVE-14070 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14070.01.patch > > > On master, we have > {code} > Query Execution Summary > -- > OPERATIONDURATION > -- > Compile Query -1466208820.74s > Prepare Plan0.00s > Submit Plan 1466208825.50s > Start DAG 0.26s > Run DAG 4.39s > -- > Task Execution Summary > -- > VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS > OUTPUT_RECORDS > -- > Map 11014.00 1,534 11 1,500 > 1 > Reducer 2 96.00 5410 1 > 0 > -- > {code} > sounds like a real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14070) hive.tez.exec.print.summary=true returns wrong results on HS2
[ https://issues.apache.org/jira/browse/HIVE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342344#comment-15342344 ] Thejas M Nair commented on HIVE-14070: -- A comment on just the SQLOperation.java change - {code} + SessionState.getPerfLogger(); + PerfLogger.setPerfLogger(parentPerfLogger); {code} Why is the above SessionState.getPerfLogger() call needed? > hive.tez.exec.print.summary=true returns wrong results on HS2 > - > > Key: HIVE-14070 > URL: https://issues.apache.org/jira/browse/HIVE-14070 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14070.01.patch > > > On master, we have > {code} > Query Execution Summary > -- > OPERATIONDURATION > -- > Compile Query -1466208820.74s > Prepare Plan0.00s > Submit Plan 1466208825.50s > Start DAG 0.26s > Run DAG 4.39s > -- > Task Execution Summary > -- > VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS > OUTPUT_RECORDS > -- > Map 11014.00 1,534 11 1,500 > 1 > Reducer 2 96.00 5410 1 > 0 > -- > {code} > sounds like a real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
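For context on the chained call versus the two-statement form discussed in these comments: when a setter like setPerfLogger is a static method, invoking it through the instance returned by a getter compiles but typically draws an IDE warning ("static member accessed via instance reference"), which is the usual reason an IDE suggests splitting the chain into separate statements. The following is a minimal sketch with a hypothetical class, not the real Hive PerfLogger:

```java
public class PerfLoggerSketch {
    private static PerfLoggerSketch current;

    // Getter: lazily creates the "session" logger, as a side effect.
    public static PerfLoggerSketch get() {
        if (current == null) {
            current = new PerfLoggerSketch();
        }
        return current;
    }

    // Static setter: calling it as get().set(x) compiles, but the instance
    // on the left is ignored, so IDEs flag that form as misleading.
    public static void set(PerfLoggerSketch logger) {
        current = logger;
    }

    public static void main(String[] args) {
        PerfLoggerSketch parent = new PerfLoggerSketch();
        // Chained form an IDE would flag:
        //   PerfLoggerSketch.get().set(parent);
        // Split form, making the static call explicit:
        PerfLoggerSketch.get(); // runs the getter's initialization side effect
        PerfLoggerSketch.set(parent);
        System.out.println(current == parent);
    }
}
```

This also illustrates the question in the comment above: after the split, the bare getter call only matters if its side effects (here, lazy initialization) are needed before the static set; otherwise it can be dropped.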