[jira] [Updated] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13336: --- Attachment: HIVE-13336.2.patch > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-13336.1.patch, HIVE-13336.2.patch > > > Transformation is necessary because isDeterministic is a class annotation > and is not dependent on the argument count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13336: --- Status: Patch Available (was: Open) > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-13336.1.patch, HIVE-13336.2.patch > > > Transformation is necessary because isDeterministic is a class annotation > and is not dependent on the argument count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
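[Editorial note] The rationale in HIVE-13336 is that `unix_timestamp()` with no arguments returns the current time and so must stay non-deterministic, while `unix_timestamp(arg)` depends only on its arguments; since determinism is declared once per UDF class, calls with arguments are rewritten to the deterministic `to_unix_timestamp`. A minimal, self-contained sketch of that idea — the annotation, class names, and transform below are illustrative, not Hive's actual code:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Illustrative stand-in for a class-level determinism marker: it applies to
// the whole class, so it cannot vary with the argument count.
@Retention(RetentionPolicy.RUNTIME)
@interface UdfType { boolean deterministic(); }

// unix_timestamp() with no arguments returns the current time, so the whole
// class must be marked non-deterministic.
@UdfType(deterministic = false)
class UnixTimestampUdf {}

// to_unix_timestamp(ts[, pattern]) depends only on its arguments.
@UdfType(deterministic = true)
class ToUnixTimestampUdf {}

public class Transform {
    // Rewrite any call that passes arguments to the deterministic variant.
    public static String transform(String fn, int argCount) {
        if (fn.equals("unix_timestamp") && argCount > 0) {
            return "to_unix_timestamp";
        }
        return fn;
    }

    public static void main(String[] args) {
        // The class annotation is fixed regardless of how the UDF is called:
        System.out.println(UnixTimestampUdf.class
                .getAnnotation(UdfType.class).deterministic()); // false
        System.out.println(transform("unix_timestamp", 1)); // to_unix_timestamp
        System.out.println(transform("unix_timestamp", 0)); // unix_timestamp
    }
}
```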
[jira] [Resolved] (HIVE-13331) Failures when concatenating ORC files using tez
[ https://issues.apache.org/jira/browse/HIVE-13331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-13331. -- Resolution: Won't Fix Closing this issue as it has been fixed already. > Failures when concatenating ORC files using tez > --- > > Key: HIVE-13331 > URL: https://issues.apache.org/jira/browse/HIVE-13331 > Project: Hive > Issue Type: Bug > Environment: HDP 2.2 > Hive 0.14 with Tez as execution engine >Reporter: Ashish Shenoy >Assignee: Prasanth Jayachandran > > I hit this issue consistently when I try to concatenate the ORC files in a > hive partition using 'ALTER TABLE ... PARTITION(...) CONCATENATE'. In an > email thread on the hive users mailing list > [http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3c553a2a9e.70...@uib.no%3E], > I read that tez should be used as the execution engine for hive, so I > updated my hive configs to use tez as the exec engine. > Here's the stack trace when I use the Tez execution engine: > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > > File Merge FAILED -1 0 0 -1 0 0 > > VERTICES: 00/01 [>>--] 0% ELAPSED TIME: 1458666880.00 > s > > Status: Failed > Vertex failed, vertexName=File Merge, > vertexId=vertex_1455906569416_0009_1_00, diagnostics=[Vertex > vertex_1455906569416_0009_1_00 [File Merge] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [] initializer > failed, vertex=vertex_1455906569416_0009_1_00 [File Merge], > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452) > at > org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441) > at > org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295) > at > 
org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > ] > DAG failed due to vertex failure. failedVertices:1 killedVertices:0 > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.DDLTask > Please let me know if this has been fixed ? This seems like a very basic > thing for Hive to get wrong, so I am wondering if I am using the right > configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207886#comment-15207886 ] Wei Zheng commented on HIVE-11388: -- Thanks Eugene. 1. I see. It would be helpful to have a comment by the first stmt.executeQuery since it's not explicit. I didn't realize that in the first round :) 2. I mean we do need such logic to filter out compactions cleaned by other Cleaners. I'm saying we can have simpler code by directly using toClean. But I just realized that we need to extract id from CompactionInfo to have a convenient set, so never mind. 3. Agree. Btw what's the purpose of having column MT_KEY2? > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13262) LLAP: Remove log levels from DebugUtils
[ https://issues.apache.org/jira/browse/HIVE-13262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207871#comment-15207871 ] Hive QA commented on HIVE-13262: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12794566/HIVE-13262.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9836 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7341/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7341/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7341/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12794566 - PreCommit-HIVE-TRUNK-Build > LLAP: Remove log levels from DebugUtils > --- > > Key: HIVE-13262 > URL: https://issues.apache.org/jira/browse/HIVE-13262 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13262.1.patch, HIVE-13262.2.patch, > HIVE-13262.2.patch > > > DebugUtils has many hardcoded log levels. To enable logging we need to > recompile the code with the desired value. Instead, configure loggers for these > classes with log levels via log4j properties. Also use parametrized logging > in the IO elevator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
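[Editorial note] The two ideas in HIVE-13262 — log levels driven by configuration instead of recompilation, and parametrized logging — can be sketched with the JDK's own logger. Hive itself uses SLF4J/log4j `{}` placeholders; this stdlib stand-in only illustrates the deferral: when the level is disabled, the call returns before the message or its arguments are ever rendered.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogDemo {
    // An argument whose rendering is expensive; toString() counts how often
    // it is actually invoked.
    static class Expensive {
        static int renders = 0;
        @Override public String toString() { renders++; return "expensive"; }
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.INFO);  // in practice this comes from config, not code

        // FINE is disabled: the call short-circuits, so no formatting and no
        // toString() happens -- the point of parametrized logging.
        log.log(Level.FINE, "value: {0}", new Expensive());
        System.out.println(Expensive.renders); // 0
    }
}
```

With log4j/SLF4J the idiom is `LOG.debug("read {} bytes", n)`: identical shape, same deferred-formatting benefit, and the effective level is set in `log4j.properties` rather than in recompiled constants.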
[jira] [Updated] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13336: --- Attachment: HIVE-13336.1.patch > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-13336.1.patch > > > Transformation is necessary because isDeterministic is a class annotation > and is not dependent on the argument count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-13336: -- Assignee: Gopal V (was: Jason Dere) > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > > Transformation is necessary because -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13336: --- Description: Transformation is necessary because isDeterministic is a class annotation and is not dependent on the argument count. (was: Transformation is necessary because ) > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > > Transformation is necessary because isDeterministic is a class annotation > and is not dependent on the argument count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)
[ https://issues.apache.org/jira/browse/HIVE-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13336: --- Description: Transformation is necessary because > Transform unix_timestamp(args) into to_unix_timestamp(args) > --- > > Key: HIVE-13336 > URL: https://issues.apache.org/jira/browse/HIVE-13336 > Project: Hive > Issue Type: Improvement > Components: UDF >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Jason Dere > > Transformation is necessary because -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207820#comment-15207820 ] Sergey Shelukhin commented on HIVE-9660: [~prasanth_j] fyi > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP2.patch, HIVE-9660.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: HIVE-9660.patch Attempt #1. > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP2.patch, HIVE-9660.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207731#comment-15207731 ] Hive QA commented on HIVE-13149: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12794542/HIVE-13149.6.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9807 tests executed *Failed tests:* {noformat} TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file TestMiniTezCliDriver-auto_sortmerge_join_13.q-alter_merge_2_orc.q-vector_outer_join2.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_partition_diff_num_cols.q-tez_joins_explain.q-vector_decimal_aggregate.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7340/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7340/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7340/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing 
org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12794542 - PreCommit-HIVE-TRUNK-Build > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch, HIVE-13149.5.patch, HIVE-13149.6.patch > > > In SessionState class, currently we will always try to get a HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} > regardless of whether the connection will be used later or not. > When SessionState is accessed by the tasks in TaskRunner.java, most > of the tasks, other than some like StatsTask, don't need to access HMS. > Currently a new HMS connection will be established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, then the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
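[Editorial note] The gist of the HIVE-13149 description — don't open a metastore connection eagerly in `SessionState.start()` when most task threads never use it — is the standard lazy-initialization pattern. A hedged, single-threaded sketch (names are illustrative; Hive's real client interface is `IMetaStoreClient`, and the sketch omits the synchronization a parallel task runner would need):

```java
public class LazyClient {
    interface MetaStoreClient { }       // stand-in for Hive's IMetaStoreClient

    private MetaStoreClient client;     // NOT created in start(); stays null
    int connects = 0;                   // counts real connections, for illustration

    // Called only by tasks that genuinely need the metastore (e.g. StatsTask).
    MetaStoreClient get() {
        if (client == null) {           // first use: connect now
            connects++;
            client = new MetaStoreClient() { };
        }
        return client;                  // later uses: reuse the same connection
    }

    public static void main(String[] args) {
        LazyClient session = new LazyClient();
        // A task thread that never touches HMS costs zero connections:
        System.out.println(session.connects); // 0
        session.get();
        session.get();
        System.out.println(session.connects); // 1 -- connection is reused
    }
}
```

In HiveServer2's parallel mode this field would need to be synchronized or scoped per thread; the sketch only shows why deferring creation removes the "created but unused" connections the description complains about.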
[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for
[ https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10249: -- Status: Patch Available (was: Open) > ACID: show locks should show who the lock is waiting for > > > Key: HIVE-10249 > URL: https://issues.apache.org/jira/browse/HIVE-10249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-10249.patch > > > Instead of just showing state WAITING, we should include what the lock is > waiting for. It will make diagnostics easier. > It would also be useful to add QueryPlan.getQueryId() so it's easy to see > which query the lock belongs to. > # need to store this in HIVE_LOCKS (additional field); this has a perf hit to > do another update on failed attempt and to clear field on successful attempt. > (Actually on success, we update anyway). How exactly would this be > displayed? Each lock can block but we acquire all parts of external lock at > once. Since we stop at first one that blocked, we’d only update that one… > # This needs a matching Thrift change to pass to client: ShowLocksResponse > # Perhaps we can start updating this info after lock was in W state for some > time to reduce perf hit. > # This is mostly useful for “Why is my query stuck” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207696#comment-15207696 ] Eugene Koifman commented on HIVE-11388: --- 1. The purpose is to run Select For Update. So if the key is already there, the 1st "rs = stmt.executeQuery(sqlStmt);" will do it. 2. We do need this. Since you may have several Cleaner processes running, they will each accumulate state in these data structures. But you don't know which instance will end up actually cleaning files, so if you remove data from these structures you'll have a memory leak. 3. What would that confirm? If the counts are off at this point, it means the 2nd thread somehow ran ahead and thus it will see its counts being different. > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for
[ https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10249: -- Attachment: HIVE-10249.patch > ACID: show locks should show who the lock is waiting for > > > Key: HIVE-10249 > URL: https://issues.apache.org/jira/browse/HIVE-10249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-10249.patch > > > Instead of just showing state WAITING, we should include what the lock is > waiting for. It will make diagnostics easier. > It would also be useful to add QueryPlan.getQueryId() so it's easy to see > which query the lock belongs to. > # need to store this in HIVE_LOCKS (additional field); this has a perf hit to > do another update on failed attempt and to clear field on successful attempt. > (Actually on success, we update anyway). How exactly would this be > displayed? Each lock can block but we acquire all parts of external lock at > once. Since we stop at first one that blocked, we’d only update that one… > # This needs a matching Thrift change to pass to client: ShowLocksResponse > # Perhaps we can start updating this info after lock was in W state for some > time to reduce perf hit. > # This is mostly useful for “Why is my query stuck” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10600) optimize group by for GC
[ https://issues.apache.org/jira/browse/HIVE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-10600. - Resolution: Duplicate HIVE-12369 > optimize group by for GC > > > Key: HIVE-10600 > URL: https://issues.apache.org/jira/browse/HIVE-10600 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Matt McCline > > Quoting [~gopalv]: > {noformat} > So, something like a sum() GROUP BY will create a few hundred thousand > AbstractAggregationBuffer objects all of which will suddenly go out of > scope when the map.aggr flushes it down to the sort buffer. > That particular GC collection takes forever because the tiny buffers take > a lot of time to walk over and then they leave the memory space > fragmented, which requires a compaction pass (which btw, writes to a > page-interleaved NUMA zone). > And to make things worse, the pre-allocated sort buffers with absolutely > zero data in them take up most of the tenured regions causing these chunks > of memory to be visited more and more often as they are part of the Eden > space. > {noformat} > We need flat data structures to be GC friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
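[Editorial note] The "flat data structures" remedy quoted above replaces hundreds of thousands of short-lived per-key AbstractAggregationBuffer objects with a single primitive array: the collector then traces one large object instead of walking and compacting a fragmented sea of tiny ones. A minimal sketch of a flat sum() buffer for GROUP BY (illustrative only, not Hive's implementation):

```java
public class FlatAgg {
    // One primitive slot per group key instead of one aggregation-buffer
    // object per key: a single allocation, no per-key garbage when the
    // map-side aggregation flushes to the sort buffer.
    private final long[] sums;

    public FlatAgg(int groups) {
        sums = new long[groups];
    }

    // Accumulate a value into the slot assigned to a group key.
    public void add(int slot, long value) {
        sums[slot] += value;
    }

    public long get(int slot) {
        return sums[slot];
    }

    public static void main(String[] args) {
        FlatAgg agg = new FlatAgg(4);       // e.g. 4 distinct group-by keys
        agg.add(1, 10);
        agg.add(1, 5);
        agg.add(3, 7);
        System.out.println(agg.get(1));     // 15
        System.out.println(agg.get(3));     // 7
    }
}
```

The design trade-off is that slots must be assigned up front (or the array grown), but flushing the aggregation drops exactly one object instead of one per key.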
[jira] [Updated] (HIVE-10249) ACID: show locks should show who the lock is waiting for
[ https://issues.apache.org/jira/browse/HIVE-10249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10249: -- Priority: Critical (was: Major) > ACID: show locks should show who the lock is waiting for > > > Key: HIVE-10249 > URL: https://issues.apache.org/jira/browse/HIVE-10249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-10249.patch > > > Instead of just showing state WAITING, we should include what the lock is > waiting for. It will make diagnostics easier. > It would also be useful to add QueryPlan.getQueryId() so it's easy to see > which query the lock belongs to. > # need to store this in HIVE_LOCKS (additional field); this has a perf hit to > do another update on failed attempt and to clear field on successful attempt. > (Actually on success, we update anyway). How exactly would this be > displayed? Each lock can block but we acquire all parts of external lock at > once. Since we stop at first one that blocked, we’d only update that one… > # This needs a matching Thrift change to pass to client: ShowLocksResponse > # Perhaps we can start updating this info after lock was in W state for some > time to reduce perf hit. > # This is mostly useful for “Why is my query stuck” -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11386) Improve Vectorized GROUP BY Performance (Phase 1)
[ https://issues.apache.org/jira/browse/HIVE-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11386: Resolution: Duplicate Status: Resolved (was: Patch Available) HIVE-12369 > Improve Vectorized GROUP BY Performance (Phase 1) > - > > Key: HIVE-11386 > URL: https://issues.apache.org/jira/browse/HIVE-11386 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-11386.01.patch, HIVE-11386.02.patch > > > Improve vectorized GROUP BY performance, with an eye towards the new LLAP > memory management (dramatically reduce the number of Java object, allocate > very large objects, etc). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13334) stats state is not captured correctly
[ https://issues.apache.org/jira/browse/HIVE-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-13334: -- Assignee: Pengcheng Xiong > stats state is not captured correctly > - > > Key: HIVE-13334 > URL: https://issues.apache.org/jira/browse/HIVE-13334 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan >Assignee: Pengcheng Xiong > > As a result, StatsOptimizer gives incorrect results. Can be reproduced with > the following queries: > {code} > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13334) stats state is not captured correctly
[ https://issues.apache.org/jira/browse/HIVE-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207671#comment-15207671 ] Pengcheng Xiong commented on HIVE-13334: [~ashutoshc], sure. I have wanted to turn this on by default for quite a long time. :) > stats state is not captured correctly > - > > Key: HIVE-13334 > URL: https://issues.apache.org/jira/browse/HIVE-13334 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan > > As a result, StatsOptimizer gives incorrect results. Can be reproduced with > the following queries: > {code} > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13041) Backport to branch-1 HIVE-9862 Vectorized execution corrupts timestamp values
[ https://issues.apache.org/jira/browse/HIVE-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207660#comment-15207660 ] Matt McCline commented on HIVE-13041: - Very large change. Holding off for now. > Backport to branch-1 HIVE-9862 Vectorized execution corrupts timestamp values > - > > Key: HIVE-13041 > URL: https://issues.apache.org/jira/browse/HIVE-13041 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13041.1-branch1.patch, HIVE-13041.2-branch1.patch > > > Backport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13041) Backport to branch-1 HIVE-9862 Vectorized execution corrupts timestamp values
[ https://issues.apache.org/jira/browse/HIVE-13041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13041: Resolution: Won't Fix Status: Resolved (was: Patch Available) > Backport to branch-1 HIVE-9862 Vectorized execution corrupts timestamp values > - > > Key: HIVE-13041 > URL: https://issues.apache.org/jira/browse/HIVE-13041 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13041.1-branch1.patch, HIVE-13041.2-branch1.patch > > > Backport. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13334) stats state is not captured correctly
[ https://issues.apache.org/jira/browse/HIVE-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207636#comment-15207636 ] Ashutosh Chauhan commented on HIVE-13334: - [~pxiong] Can you take a look at this one? There might be different root causes for these failures. If so, lets create separate jira for each. > stats state is not captured correctly > - > > Key: HIVE-13334 > URL: https://issues.apache.org/jira/browse/HIVE-13334 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan > > As a result, StatsOptimizer gives incorrect results. Can be reproduced with > the following queries: > {code} > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13334) stats state is not captured correctly
[ https://issues.apache.org/jira/browse/HIVE-13334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13334: Description: As a result, StatsOptimizer gives incorrect results. Can be reproduced with the following queries: {code} mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true {code} was: As a result, StatsOptimizer gives incorrect results. Can be reproduced with the following queries: {code} mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true {code} [~pxiong] Can you take a look at this one? > stats state is not captured correctly > - > > Key: HIVE-13334 > URL: https://issues.apache.org/jira/browse/HIVE-13334 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer, Statistics >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan > > As a result, StatsOptimizer gives incorrect results. Can be reproduced with > the following queries: > {code} > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.-Dhive.compute.query.using.stats=true > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13333) StatsOptimizer throws ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207618#comment-15207618 ] Ashutosh Chauhan commented on HIVE-13333: - {code} Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaIntObjectInspector.get(JavaIntObjectInspector.java:40) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:239) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247) at org.apache.hadoop.hive.serde2.DelimitedJSONSerDe.serializeField(DelimitedJSONSerDe.java:72) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231) at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) at org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:71) at org.apache.hadoop.hive.ql.exec.DefaultFetchFormatter.convert(DefaultFetchFormatter.java:40) at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:99) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer at org.apache.hadoop.hive.ql.exec.ListSinkOperator.process(ListSinkOperator.java:102) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:415) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:145) {code} from query: {code} select f,a,e,b from (select count(*) as a, count(c_int) as b, sum(c_int) as c, avg(c_int) as d, max(c_int) as e, min(c_int) as f from cbo_t1) cbo_t1 {code} > StatsOptimizer throws ClassCastException > > > Key: HIVE-13333 > URL: https://issues.apache.org/jira/browse/HIVE-13333 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan > > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true repros the > issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
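The root cause is visible in the first stack frame: StatsOptimizer answers aggregates such as max(c_int) straight from metastore statistics, which hold the value as a Long, while JavaIntObjectInspector still expects an Integer for the int column. A minimal standalone sketch of that cast failure (the class and method below are illustrative stand-ins, not Hive's actual code):

```java
// Sketch of the HIVE-13333 failure mode: a stats-derived Long fed to an
// inspector that performs an unchecked cast to Integer.
public class StatsCastSketch {
    // Stand-in for JavaIntObjectInspector.get(Object): the unchecked cast
    // below is what throws at runtime.
    static int getInt(Object o) {
        return (Integer) o;  // ClassCastException when o is actually a Long
    }

    public static void main(String[] args) {
        Object fromStats = Long.valueOf(40);  // e.g. max(c_int) read from stats
        try {
            System.out.println(getInt(fromStats));
        } catch (ClassCastException e) {
            // Same exception type and message shape as the quoted trace
            System.out.println("ClassCastException: " + e.getMessage());
        }
    }
}
```

Running it reproduces the "java.lang.Long cannot be cast to java.lang.Integer" failure in isolation, without any Hive machinery.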
[jira] [Commented] (HIVE-13333) StatsOptimizer throws ClassCastException
[ https://issues.apache.org/jira/browse/HIVE-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207620#comment-15207620 ] Ashutosh Chauhan commented on HIVE-13333: - [~pxiong] Can you take a look at this one? > StatsOptimizer throws ClassCastException > > > Key: HIVE-13333 > URL: https://issues.apache.org/jira/browse/HIVE-13333 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 2.0.0 >Reporter: Ashutosh Chauhan > > mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true > -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true repros the > issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207615#comment-15207615 ] Wei Zheng commented on HIVE-11388: -- [~ekoifman] I have several questions regarding patch 7. 1. In the TxnHandler.acquireLock implementation, there's a {code}if (!rs.next()){code} block; after that, shouldn't there be an else block that deals with the case when there's an existing key in AUX_TABLE (thus roll back the select for update and retry)? 2. In Cleaner.run(), I'm not sure if we need currentToCleanSet, since we're essentially checking the existence of compactId2CompactInfoMap members in the toClean set. 3. In TestTxnHandler.testMutexAPI, we can add two more asserts after //now 2 and //now 3 to confirm. > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
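The mutex shape under discussion is insert-if-missing-then-retry on top of SELECT ... FOR UPDATE. A hedged sketch of that control flow, with the AUX_TABLE simulated by an in-memory set (a real TxnHandler runs the select inside a database transaction and the row lock held until commit is the mutex; all names here are illustrative, not Hive's actual code):

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the acquireLock flow questioned in the comment above.
public class MutexSketch {
    static final Set<String> AUX_TABLE = new HashSet<>();

    static boolean acquireLock(String key) {
        boolean rowExists = AUX_TABLE.contains(key);  // ~ SELECT ... FOR UPDATE
        if (!rowExists) {
            // No row yet: FOR UPDATE locked nothing, so insert the key,
            // abandon this attempt, and retry against the now-present row.
            AUX_TABLE.add(key);
            return acquireLock(key);
        } else {
            // Row exists: in the real code the row-level lock taken by
            // FOR UPDATE is now held; the sketch just reports success.
            // This is the explicit else branch the comment asks about.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(acquireLock("CheckLock"));
    }
}
```

The sketch is single-threaded and only illustrates the branch structure; it does not model concurrent metastores racing on the same key.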
[jira] [Commented] (HIVE-13332) support dumping all row indexes in ORC FileDump
[ https://issues.apache.org/jira/browse/HIVE-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207613#comment-15207613 ] Prasanth Jayachandran commented on HIVE-13332: -- LGTM, +1 > support dumping all row indexes in ORC FileDump > --- > > Key: HIVE-13332 > URL: https://issues.apache.org/jira/browse/HIVE-13332 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13332.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13261) Can not compute column stats for partition when schema evolves
[ https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207588#comment-15207588 ] Ashutosh Chauhan commented on HIVE-13261: - +1 > Can not compute column stats for partition when schema evolves > -- > > Key: HIVE-13261 > URL: https://issues.apache.org/jira/browse/HIVE-13261 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13261.01.patch > > > To repro > {code} > CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS > TEXTFILE; > insert into table partitioned1 partition(part=1) values(1, 'original'),(2, > 'original'), (3, 'original'),(4, 'original'); > -- Table-Non-Cascade ADD COLUMNS ... > alter table partitioned1 add columns(c int, d string); > insert into table partitioned1 partition(part=2) values(1, 'new', 10, > 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, > 'forty'); > insert into table partitioned1 partition(part=1) values(5, 'new', 100, > 'hundred'),(6, 'new', 200, 'two hundred'); > analyze table partitioned1 compute statistics for columns; > {code} > Error msg: > {code} > 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - > NoSuchObjectException(message:Column c for which stats gathering is requested > doesn't exist.) > at > org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13332) support dumping all row indexes in ORC FileDump
[ https://issues.apache.org/jira/browse/HIVE-13332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13332: Attachment: HIVE-13332.patch [~prasanth_j] can you take a look? most of the changes are in out files. > support dumping all row indexes in ORC FileDump > --- > > Key: HIVE-13332 > URL: https://issues.apache.org/jira/browse/HIVE-13332 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13332.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12960) Migrate Column Stats Extrapolation to HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207561#comment-15207561 ] Ashutosh Chauhan commented on HIVE-12960: - There is an incorrect import {{import antlr.SemanticException;}} It would be great if the aggregate computation actually happened on the hbase server, but I guess that's not possible without a co-processor. Looks good otherwise, +1 > Migrate Column Stats Extrapolation to HBaseStore > > > Key: HIVE-12960 > URL: https://issues.apache.org/jira/browse/HIVE-12960 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.1.0 > > Attachments: HIVE-12960.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse
[ https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13310: Resolution: Fixed Status: Resolved (was: Patch Available) > Vectorized Projection Comparison Number Column to Scalar broken for !noNulls > and selectedInUse > -- > > Key: HIVE-13310 > URL: https://issues.apache.org/jira/browse/HIVE-13310 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13310.01.patch, HIVE-13310.02.patch > > > LongColEqualLongScalar.java > LongColGreaterEqualLongScalar.java > LongColGreaterLongScalar.java > LongColLessEqualLongScalar.java > LongColLessLongScalar.java > LongColNotEqualLongScalar.java > LongScalarEqualLongColumn.java > LongScalarGreaterEqualLongColumn.java > LongScalarGreaterLongColumn.java > LongScalarLessEqualLongColumn.java > LongScalarLessLongColumn.java > LongScalarNotEqualLongColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse
[ https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207550#comment-15207550 ] Matt McCline commented on HIVE-13310: - Not a bug in branch-1. > Vectorized Projection Comparison Number Column to Scalar broken for !noNulls > and selectedInUse > -- > > Key: HIVE-13310 > URL: https://issues.apache.org/jira/browse/HIVE-13310 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13310.01.patch, HIVE-13310.02.patch > > > LongColEqualLongScalar.java > LongColGreaterEqualLongScalar.java > LongColGreaterLongScalar.java > LongColLessEqualLongScalar.java > LongColLessLongScalar.java > LongColNotEqualLongScalar.java > LongScalarEqualLongColumn.java > LongScalarGreaterEqualLongColumn.java > LongScalarGreaterLongColumn.java > LongScalarLessEqualLongColumn.java > LongScalarLessLongColumn.java > LongScalarNotEqualLongColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13318) Cache the result of getTable from metaStore
[ https://issues.apache.org/jira/browse/HIVE-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13318: --- Attachment: HIVE-13318.01.patch > Cache the result of getTable from metaStore > --- > > Key: HIVE-13318 > URL: https://issues.apache.org/jira/browse/HIVE-13318 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13318.01.patch > > > getTable by name from metaStore is called many times. We plan to cache it to > save calls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13318) Cache the result of getTable from metaStore
[ https://issues.apache.org/jira/browse/HIVE-13318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13318: --- Status: Patch Available (was: Open) > Cache the result of getTable from metaStore > --- > > Key: HIVE-13318 > URL: https://issues.apache.org/jira/browse/HIVE-13318 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13318.01.patch > > > getTable by name from metaStore is called many times. We plan to cache it to > save calls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10176) skip.header.line.count causes values to be skipped when performing insert values
[ https://issues.apache.org/jira/browse/HIVE-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207456#comment-15207456 ] Hive QA commented on HIVE-10176: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12794491/HIVE-10176.6.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7339/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7339/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7339/ Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] --- maven-jar-plugin:2.2:test-jar (default) @ hive-service-rpc --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-service-rpc --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/service-rpc/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT.pom [INFO] Installing /data/hive-ptest/working/apache-github-source-source/service-rpc/target/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-service-rpc/2.1.0-SNAPSHOT/hive-service-rpc-2.1.0-SNAPSHOT-tests.jar [INFO] [INFO] [INFO] Building Spark Remote Client 2.1.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ spark-client --- [INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/spark-client/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/spark-client (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ spark-client --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ spark-client --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ spark-client --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ spark-client --- [INFO] Compiling 28 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java: /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java uses or overrides a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java: Recompile with -Xlint:deprecation for details. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Some input files use unchecked or unsafe operations. [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Recompile with -Xlint:unchecked for details. 
[INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 16 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[jira] [Commented] (HIVE-13178) Enhance ORC Schema Evolution to handle more standard data type conversions
[ https://issues.apache.org/jira/browse/HIVE-13178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207443#comment-15207443 ] Hive QA commented on HIVE-13178: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12794725/HIVE-13178.06.patch {color:green}SUCCESS:{color} +1 due to 33 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9836 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-schema_evol_orc_nonvec_mapwork_table.q-insert_update_delete.q-selectDistinctStar.q-and-6-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_schema_evol_orc_nonvec_mapwork_part_other_incompatible {noformat} Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7338/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7338/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7338/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12794725 - PreCommit-HIVE-TRUNK-Build > Enhance ORC Schema Evolution to handle more standard data type conversions > -- > > Key: HIVE-13178 > URL: https://issues.apache.org/jira/browse/HIVE-13178 > Project: Hive > Issue Type: Bug > Components: Hive, ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13178.01.patch, HIVE-13178.02.patch, > HIVE-13178.03.patch, HIVE-13178.04.patch, HIVE-13178.05.patch, > HIVE-13178.06.patch > > > Currently, SHORT -> INT -> BIGINT is supported. > Handle the ORC data type conversions permitted by implicit conversion, as allowed by the > TypeInfoUtils.implicitConvertible method. >* STRING_GROUP -> DOUBLE >* STRING_GROUP -> DECIMAL >* DATE_GROUP -> STRING >* NUMERIC_GROUP -> STRING >* STRING_GROUP -> STRING_GROUP >* >* // Upward from "lower" type to "higher" numeric type: >* BYTE -> SHORT -> INT -> BIGINT -> FLOAT -> DOUBLE -> DECIMAL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
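The "upward" numeric ordering at the end of that description amounts to a ranked list: a conversion is implicit only from a lower-ranked type to a higher-ranked one. A hedged sketch of that idea for the numeric group only (this is an illustration of the rule, not Hive's actual TypeInfoUtils.implicitConvertible implementation, which covers more groups):

```java
import java.util.Arrays;
import java.util.List;

// Implicit numeric widening check:
// BYTE -> SHORT -> INT -> BIGINT -> FLOAT -> DOUBLE -> DECIMAL
public class WideningSketch {
    static final List<String> NUMERIC_ORDER =
        Arrays.asList("byte", "short", "int", "bigint", "float", "double", "decimal");

    static boolean canWiden(String from, String to) {
        int f = NUMERIC_ORDER.indexOf(from), t = NUMERIC_ORDER.indexOf(to);
        // Unknown types never widen implicitly; otherwise only lower -> higher.
        return f >= 0 && t >= 0 && f <= t;
    }

    public static void main(String[] args) {
        System.out.println(canWiden("short", "bigint"));  // true
        System.out.println(canWiden("double", "int"));    // false: would narrow
    }
}
```

Schema evolution in ORC then only has to honor the same direction: a reader schema may widen what the writer stored, never narrow it.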
[jira] [Comment Edited] (HIVE-13302) direct SQL: cast to date doesn't work on Oracle
[ https://issues.apache.org/jira/browse/HIVE-13302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207426#comment-15207426 ] Sergey Shelukhin edited comment on HIVE-13302 at 3/22/16 10:24 PM: --- Committed to master and branch-1. Verified the patch works on Oracle. was (Author: sershe): Committed to master and branch-1 > direct SQL: cast to date doesn't work on Oracle > --- > > Key: HIVE-13302 > URL: https://issues.apache.org/jira/browse/HIVE-13302 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13302.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13302) direct SQL: cast to date doesn't work on Oracle
[ https://issues.apache.org/jira/browse/HIVE-13302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13302: Resolution: Fixed Fix Version/s: 2.1.0 1.3.0 Status: Resolved (was: Patch Available) Committed to master and branch-1 > direct SQL: cast to date doesn't work on Oracle > --- > > Key: HIVE-13302 > URL: https://issues.apache.org/jira/browse/HIVE-13302 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13302.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE
[ https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207384#comment-15207384 ] Aaron Dossett commented on HIVE-11221: -- [~ashishen...@gmail.com] HDP 2.3.4 does include this fix backported to 1.2.1 (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/content/patch_hive.html). We recently upgraded to 2.3.4 and concatenation is working fine so far. > In Tez mode, alter table concatenate orc files can intermittently fail with > NPE > --- > > Key: HIVE-11221 > URL: https://issues.apache.org/jira/browse/HIVE-11221 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11221.1.patch > > > We are not waiting for input ready events which can trigger occasional NPE if > input is not actually ready. > Stacktrace: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) > at > org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:478) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.pushProjectionsAndFilters(HiveInputFormat.java:471) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:648) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:146) > at > org.apache.tez.mapreduce.lib.MRReaderMapred.(MRReaderMapred.java:73) > at > org.apache.tez.mapreduce.input.MRInput.initializeInternal(MRInput.java:483) > at > org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:108) > at > org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.getMRInput(MergeFileRecordProcessor.java:220) > at > org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.init(MergeFileRecordProcessor.java:72) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) > ... 13 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12049) Provide an option to write serialized thrift objects in final tasks
[ https://issues.apache.org/jira/browse/HIVE-12049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207370#comment-15207370 ] Ashutosh Chauhan commented on HIVE-12049: - Compiler related changes look good to me. > Provide an option to write serialized thrift objects in final tasks > --- > > Key: HIVE-12049 > URL: https://issues.apache.org/jira/browse/HIVE-12049 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Rohit Dholakia >Assignee: Rohit Dholakia > Attachments: HIVE-12049.1.patch, HIVE-12049.11.patch, > HIVE-12049.12.patch, HIVE-12049.13.patch, HIVE-12049.14.patch, > HIVE-12049.2.patch, HIVE-12049.3.patch, HIVE-12049.4.patch, > HIVE-12049.5.patch, HIVE-12049.6.patch, HIVE-12049.7.patch, HIVE-12049.9.patch > > > For each fetch request to HiveServer2, we pay the penalty of deserializing > the row objects and translating them into a different representation suitable > for the RPC transfer. In moderate to high concurrency scenarios, this can > result in significant CPU and memory wastage. By having each task write the > appropriate thrift objects to the output files, HiveServer2 can simply stream > a batch of rows on the wire without incurring any of the additional cost of > deserialization and translation. > This can be implemented by writing a new SerDe, which the FileSinkOperator > can use to write thrift formatted row batches to the output file. Using the > pluggable property of the {{hive.query.result.fileformat}}, we can set it to > use SequenceFile and write a batch of thrift formatted rows as a value blob. > The FetchTask can now simply read the blob and send it over the wire. On the > client side, the *DBC driver can read the blob and since it is already > formatted in the way it expects, it can continue building the ResultSet the > way it does in the current implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13331) Failures when concatenating ORC files using tez
[ https://issues.apache.org/jira/browse/HIVE-13331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashish Shenoy updated HIVE-13331: - Assignee: Prasanth Jayachandran > Failures when concatenating ORC files using tez > --- > > Key: HIVE-13331 > URL: https://issues.apache.org/jira/browse/HIVE-13331 > Project: Hive > Issue Type: Bug > Environment: HDP 2.2 > Hive 0.14 with Tez as execution engine >Reporter: Ashish Shenoy >Assignee: Prasanth Jayachandran > > I hit this issue consistently when I try to concatenate the ORC files in a > hive partition using 'ALTER TABLE ... PARTITION(...) CONCATENATE'. In an > email thread on the hive users mailing list > [http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3c553a2a9e.70...@uib.no%3E], > I read that tez should be used as the execution engine for hive, so I > updated my hive configs to use tez as the exec engine. > Here's the stack trace when I use the Tez execution engine: > > VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > > File Merge FAILED -1 0 0 -1 0 0 > > VERTICES: 00/01 [>>--] 0% ELAPSED TIME: 1458666880.00 > s > > Status: Failed > Vertex failed, vertexName=File Merge, > vertexId=vertex_1455906569416_0009_1_00, diagnostics=[Vertex > vertex_1455906569416_0009_1_00 [File Merge] killed/failed due > to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [] initializer > failed, vertex=vertex_1455906569416_0009_1_00 [File Merge], > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452) > at > org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441) > at > org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295) > at > org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124) > at > 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) > at > org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > ] > DAG failed due to vertex failure. failedVertices:1 killedVertices:0 > FAILED: Execution Error, return code 2 from > org.apache.hadoop.hive.ql.exec.DDLTask > Please let me know if this has been fixed? This seems like a very basic > thing for Hive to get wrong, so I am wondering if I am using the right > configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207355#comment-15207355 ] Ashutosh Chauhan commented on HIVE-11424: - I see. Then shall we always execute the Hive rule, irrespective of whether the Calcite rule ran or not? > Rule to transform OR clauses into IN clauses in CBO > --- > > Key: HIVE-11424 > URL: https://issues.apache.org/jira/browse/HIVE-11424 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, > HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, > HIVE-11424.05.patch, HIVE-11424.2.patch, HIVE-11424.patch > > > We create a rule that will transform OR clauses into IN clauses (when > possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
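The rewrite HIVE-11424 describes collapses a disjunction of equality predicates on one column into a single IN clause, e.g. {{WHERE a = 1 OR a = 2 OR a = 3}} becomes {{WHERE a IN (1, 2, 3)}}. A toy string-level sketch of the transformation (the real Hive/Calcite rules operate on expression trees; the class below is purely illustrative):

```java
import java.util.Arrays;
import java.util.List;

// Minimal illustration of the OR-to-IN rewrite on rendered SQL fragments.
public class OrToInSketch {
    // Given the column and the constants collected from "col = v" disjuncts,
    // emit the equivalent IN clause.
    static String orToIn(String column, List<String> values) {
        return column + " IN (" + String.join(", ", values) + ")";
    }

    public static void main(String[] args) {
        List<String> vals = Arrays.asList("1", "2", "3");
        System.out.println(orToIn("a", vals));  // a IN (1, 2, 3)
    }
}
```

The point of the rule is not cosmetic: a single IN over N constants is cheaper to evaluate and easier for later optimizations (e.g. pruning) to reason about than N chained OR comparisons.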
[jira] [Commented] (HIVE-13316) Upgrade to Calcite 1.7
[ https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207353#comment-15207353 ] Jesus Camacho Rodriguez commented on HIVE-13316: There seem to be problems with the metadata providers reimplementation in Calcite, or with the current way of using them in Hive, as the right method is not being triggered. I will need to look further into it. > Upgrade to Calcite 1.7 > -- > > Key: HIVE-13316 > URL: https://issues.apache.org/jira/browse/HIVE-13316 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-13316.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13250) Compute predicate conversions on the client, instead of per row group
[ https://issues.apache.org/jira/browse/HIVE-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-13250. - Resolution: Invalid I take that back. It's not safe even for an equality predicate, since that can lead to HIVE-12749 scenarios. Resolving this as invalid. > Compute predicate conversions on the client, instead of per row group > - > > Key: HIVE-13250 > URL: https://issues.apache.org/jira/browse/HIVE-13250 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Siddharth Seth >Assignee: Ashutosh Chauhan > Attachments: HIVE-13250.2.patch, HIVE-13250.2.patch, HIVE-13250.patch > > > When running a query of the form > select count from table where ts_field = "2016-01-23 00:00:00"; > or > select count from table where ts_field = 1453507200 > ts_field is of type TIMESTAMP > The predicate is converted to whatever format is appropriate for TIMESTAMP > processing on each and every row group. > It would be far more efficient to process this once on the client - or even > once per task. > The same applies to ORC split elimination as well - this is applied for each > stripe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13250) Compute predicate conversions on the client, instead of per row group
[ https://issues.apache.org/jira/browse/HIVE-13250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-13250: Status: Open (was: Patch Available) > Compute predicate conversions on the client, instead of per row group > - > > Key: HIVE-13250 > URL: https://issues.apache.org/jira/browse/HIVE-13250 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Siddharth Seth >Assignee: Ashutosh Chauhan > Attachments: HIVE-13250.2.patch, HIVE-13250.2.patch, HIVE-13250.patch > > > When running a query of the form > select count from table where ts_field = "2016-01-23 00:00:00"; > or > select count from table where ts_field = 1453507200 > ts_field is of type TIMESTAMP > The predicate is converted to whatever format is appropriate for TIMESTAMP > processing on each and every row group. > It would be far more efficient to process this once on the client - or even > once per task. > The same applies to ORC split elimination as well - this is applied for each > stripe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
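The convert-once idea proposed in the description (before the issue was resolved as invalid) can be sketched like this; the names (`PredicateCache`, `equalsLiteral`) are illustrative stand-ins, not Hive's actual evaluator code.

```java
import java.sql.Timestamp;
import java.util.function.Predicate;

// Illustrative sketch of the proposal: parse the predicate literal into the
// column's type once, then reuse the compiled predicate for every row group,
// instead of re-converting the literal per row group / per stripe.
public class PredicateCache {
    public static Predicate<Timestamp> equalsLiteral(String literal) {
        final Timestamp target = Timestamp.valueOf(literal); // parsed exactly once
        return ts -> ts != null && ts.equals(target);        // cheap per-row-group check
    }
}
```

The resolution comment above explains why even this equality case is unsafe in practice (HIVE-12749-style scenarios), so the sketch shows only what was proposed, not what was shipped.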
[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207341#comment-15207341 ] Jesus Camacho Rodriguez commented on HIVE-11424: Exactly, once we migrate partition condition remover, we could remove the new flag... But till then, it seems better to leave it as optional, so we do not regress in some cases. > Rule to transform OR clauses into IN clauses in CBO > --- > > Key: HIVE-11424 > URL: https://issues.apache.org/jira/browse/HIVE-11424 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, > HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, > HIVE-11424.05.patch, HIVE-11424.2.patch, HIVE-11424.patch > > > We create a rule that will transform OR clauses into IN clauses (when > possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207340#comment-15207340 ] Ashutosh Chauhan commented on HIVE-11424: - We are trying to migrate rules to Calcite. So, if we implement the partition condition remover in Calcite then we don't need to rely on Hive's rule. > Rule to transform OR clauses into IN clauses in CBO > --- > > Key: HIVE-11424 > URL: https://issues.apache.org/jira/browse/HIVE-11424 > Project: Hive > Issue Type: Bug >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, > HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.04.patch, > HIVE-11424.05.patch, HIVE-11424.2.patch, HIVE-11424.patch > > > We create a rule that will transform OR clauses into IN clauses (when > possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13294) AvroSerde leaks the connection in a case when reading schema from a url
[ https://issues.apache.org/jira/browse/HIVE-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207337#comment-15207337 ] Chaoyu Tang commented on HIVE-13294: Thanks [~leftylev] for reminding me of this! > AvroSerde leaks the connection in a case when reading schema from a url > --- > > Key: HIVE-13294 > URL: https://issues.apache.org/jira/browse/HIVE-13294 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 2.1.0, 2.0.1 > > Attachments: HIVE-13294.1.patch, HIVE-13294.patch > > > AvroSerde leaks the connection in a case when reading schema from url: > In > public static Schema determineSchemaOrThrowException { > ... > return AvroSerdeUtils.getSchemaFor(new URL(schemaString).openStream()); > ... > } > The opened inputStream is never closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
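The leak quoted in the description can be closed with try-with-resources. A minimal sketch, assuming simplified names: `readAll` stands in for `AvroSerdeUtils.getSchemaFor(InputStream)`, and `determineSchema` for the method quoted above; neither is the actual Hive code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class SchemaFetch {
    // Stand-in for AvroSerdeUtils.getSchemaFor(InputStream): just reads the bytes.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
        return out.toString("UTF-8");
    }

    // try-with-resources closes the URL stream even if reading throws,
    // which is exactly the leak the issue describes: the original code
    // opened the stream inline and never closed it.
    public static String determineSchema(String schemaUrl) throws IOException {
        try (InputStream in = new URL(schemaUrl).openStream()) {
            return readAll(in);
        } // stream closed here on all paths
    }
}
```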
[jira] [Updated] (HIVE-13327) SessionID added to HS2 threadname does not trim spaces
[ https://issues.apache.org/jira/browse/HIVE-13327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13327: -- Fix Version/s: 2.0.1 > SessionID added to HS2 threadname does not trim spaces > -- > > Key: HIVE-13327 > URL: https://issues.apache.org/jira/browse/HIVE-13327 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Prasanth Jayachandran > Fix For: 2.1.0, 2.0.1 > > Attachments: HIVE-13327.1.patch > > > HIVE-13153 introduced off-by-one in appending spaces to thread names. > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-13230) ORC Vectorized String reader doesn't handle NULLs correctly
[ https://issues.apache.org/jira/browse/HIVE-13230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-13230. -- Resolution: Duplicate Didn't realize this existed when I created HIVE-13330. Looks like the same issue. Closing this as a duplicate. > ORC Vectorized String reader doesn't handle NULLs correctly > --- > > Key: HIVE-13230 > URL: https://issues.apache.org/jira/browse/HIVE-13230 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > Wrong results produced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207312#comment-15207312 ] Vikram Dixit K commented on HIVE-13286: --- Committed to both master and branch-2.0. Thanks [~aihuaxu]! > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, > HIVE-13286.3.patch, HIVE-13286.4.patch > > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13286: -- Resolution: Fixed Target Version/s: 2.1.0, 2.0.1 Status: Resolved (was: Patch Available) > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, > HIVE-13286.3.patch, HIVE-13286.4.patch > > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13153) SessionID is appended to thread name twice
[ https://issues.apache.org/jira/browse/HIVE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13153: -- Fix Version/s: 2.0.1 > SessionID is appended to thread name twice > -- > > Key: HIVE-13153 > URL: https://issues.apache.org/jira/browse/HIVE-13153 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.1.0, 2.0.1 > > Attachments: HIVE-13153.1.patch, HIVE-13153.2.patch > > > HIVE-12249 added sessionId to thread name. In some cases the sessionId could > be appended twice. Example log line > {code} > DEBUG [6432ec22-9f66-4fa5-8770-488a9d3f0b61 > 6432ec22-9f66-4fa5-8770-488a9d3f0b61 main] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
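The duplicated-prefix log line above can be avoided by stripping any stale copy before prepending. A minimal sketch under assumed names (`ThreadNames`, `withSessionId`); this is not the actual HS2 thread-naming code, only the dedup pattern.

```java
// Illustrative only: prepend a session id to a thread name exactly once.
// If the name already starts with the id (a stale copy from a reused
// thread), drop it first so the id never appears twice.
public class ThreadNames {
    public static String withSessionId(String sessionId, String threadName) {
        String prefix = sessionId + " ";
        String base = threadName.startsWith(prefix)
                ? threadName.substring(prefix.length()) // drop stale copy
                : threadName;
        return prefix + base;
    }
}
```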
[jira] [Commented] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207299#comment-15207299 ] Prasanth Jayachandran commented on HIVE-13330: -- Addressed [~gopalv]'s comment to replace "".getBytes with a static final empty byte array. > ORC vectorized string dictionary reader does not differentiate null vs empty > string dictionary > -- > > Key: HIVE-13330 > URL: https://issues.apache.org/jira/browse/HIVE-13330 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0, 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13330.1.patch, HIVE-13330.2.patch > > > Vectorized string dictionary reader cannot differentiate between the case > where all dictionary entries are null vs a single entry with an empty string. This > causes wrong results when reading data out of such files. > {code:title=Vectorization On} > SET hive.vectorized.execution.enabled=true; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > NULL > {code} > {code:title=Vectorization Off} > SET hive.vectorized.execution.enabled=false; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > {code} > The input table testnullorc3 contains a varchar column vcol with a few empty > strings and a few nulls. For this table, the non-vectorized reader returns empty as > the first row but the vectorized reader returns NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
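The review comment above — replacing "".getBytes with a static final empty byte array — boils down to the following pattern. The class and method names (`DictEntry`, `entryBytes`) are illustrative, not the actual ORC TreeReader code; the point is that SQL NULL and the empty string stay distinct without per-row allocation.

```java
import java.nio.charset.StandardCharsets;

public class DictEntry {
    // One shared instance instead of allocating "".getBytes() per row.
    private static final byte[] EMPTY_BYTES = new byte[0];

    // null entry -> SQL NULL; zero-length entry -> empty string "".
    // Returning the shared EMPTY_BYTES keeps the two cases distinguishable
    // (null reference vs. non-null, zero-length array) at zero allocation cost.
    public static byte[] entryBytes(String entry) {
        if (entry == null) return null;           // SQL NULL
        if (entry.isEmpty()) return EMPTY_BYTES;  // "", no allocation
        return entry.getBytes(StandardCharsets.UTF_8);
    }
}
```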
[jira] [Updated] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13330: - Attachment: HIVE-13330.2.patch > ORC vectorized string dictionary reader does not differentiate null vs empty > string dictionary > -- > > Key: HIVE-13330 > URL: https://issues.apache.org/jira/browse/HIVE-13330 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0, 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13330.1.patch, HIVE-13330.2.patch > > > Vectorized string dictionary reader cannot differentiate between the case > where all dictionary entries are null vs a single entry with an empty string. This > causes wrong results when reading data out of such files. > {code:title=Vectorization On} > SET hive.vectorized.execution.enabled=true; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > NULL > {code} > {code:title=Vectorization Off} > SET hive.vectorized.execution.enabled=false; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > {code} > The input table testnullorc3 contains a varchar column vcol with a few empty > strings and a few nulls. For this table, the non-vectorized reader returns empty as > the first row but the vectorized reader returns NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13330: - Attachment: HIVE-13330.1.patch > ORC vectorized string dictionary reader does not differentiate null vs empty > string dictionary > -- > > Key: HIVE-13330 > URL: https://issues.apache.org/jira/browse/HIVE-13330 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0, 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13330.1.patch > > > Vectorized string dictionary reader cannot differentiate between the case > where all dictionary entries are null vs a single entry with an empty string. This > causes wrong results when reading data out of such files. > {code:title=Vectorization On} > SET hive.vectorized.execution.enabled=true; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > NULL > {code} > {code:title=Vectorization Off} > SET hive.vectorized.execution.enabled=false; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > {code} > The input table testnullorc3 contains a varchar column vcol with a few empty > strings and a few nulls. For this table, the non-vectorized reader returns empty as > the first row but the vectorized reader returns NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-13330: - Status: Patch Available (was: Open) > ORC vectorized string dictionary reader does not differentiate null vs empty > string dictionary > -- > > Key: HIVE-13330 > URL: https://issues.apache.org/jira/browse/HIVE-13330 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-13330.1.patch > > > Vectorized string dictionary reader cannot differentiate between the case > where all dictionary entries are null vs a single entry with an empty string. This > causes wrong results when reading data out of such files. > {code:title=Vectorization On} > SET hive.vectorized.execution.enabled=true; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > NULL > {code} > {code:title=Vectorization Off} > SET hive.vectorized.execution.enabled=false; > SET hive.fetch.task.conversion=none; > select vcol from testnullorc3 limit 1; > OK > {code} > The input table testnullorc3 contains a varchar column vcol with a few empty > strings and a few nulls. For this table, the non-vectorized reader returns empty as > the first row but the vectorized reader returns NULL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13294) AvroSerde leaks the connection in a case when reading schema from a url
[ https://issues.apache.org/jira/browse/HIVE-13294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207268#comment-15207268 ] Lefty Leverenz commented on HIVE-13294: --- Okay, now it's committed to master. Thanks. > AvroSerde leaks the connection in a case when reading schema from a url > --- > > Key: HIVE-13294 > URL: https://issues.apache.org/jira/browse/HIVE-13294 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Fix For: 2.1.0, 2.0.1 > > Attachments: HIVE-13294.1.patch, HIVE-13294.patch > > > AvroSerde leaks the connection in a case when reading schema from url: > In > public static Schema determineSchemaOrThrowException { > ... > return AvroSerdeUtils.getSchemaFor(new URL(schemaString).openStream()); > ... > } > The opened inputStream is never closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13322) LLAP: ZK registry throws at shutdown due to slf4j trying to initialize a log4j logger
[ https://issues.apache.org/jira/browse/HIVE-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13322: --- Resolution: Fixed Fix Version/s: 2.1.0 Release Note: LLAP: ZK registry throws at shutdown due to slf4j trying to initialize a log4j logger (Gopal V, reviewed by Prasanth Jayachandran) Status: Resolved (was: Patch Available) > LLAP: ZK registry throws at shutdown due to slf4j trying to initialize a > log4j logger > - > > Key: HIVE-13322 > URL: https://issues.apache.org/jira/browse/HIVE-13322 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Gopal V >Priority: Minor > Fix For: 2.1.0 > > Attachments: HIVE-13322.1.patch > > > {noformat} > 2016-03-08 23:56:34,883 Thread-5 FATAL Unable to register shutdown hook > because JVM is shutting down. java.lang.IllegalStateException: Cannot add new > shutdown hook as this is not started. Current state: STOPPED > at > org.apache.logging.log4j.core.util.DefaultShutdownCallbackRegistry.addShutdownCallback(DefaultShutdownCallbackRegistry.java:113) > at > org.apache.logging.log4j.core.impl.Log4jContextFactory.addShutdownCallback(Log4jContextFactory.java:271) > at > org.apache.logging.log4j.core.LoggerContext.setUpShutdownHook(LoggerContext.java:256) > at > org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:216) > at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:146) > at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:41) > at org.apache.logging.log4j.LogManager.getContext(LogManager.java:185) > at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:103) > at > org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43) > at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:42) > at > 
org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:305) > at > org.apache.curator.utils.CloseableUtils.(CloseableUtils.java:33) > at > org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.stop(LlapZookeeperRegistryImpl.java:584) > at > org.apache.hadoop.hive.llap.registry.impl.LlapRegistryService.serviceStop(LlapRegistryService.java:105) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.serviceStop(LlapDaemon.java:294) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:65) > at > org.apache.hadoop.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:183) > at > org.apache.hive.common.util.ShutdownHookManager$1.run(ShutdownHookManager.java:63) > {noformat} > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13107) LLAP: Rotate GC logs periodically to prevent full disks
[ https://issues.apache.org/jira/browse/HIVE-13107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207237#comment-15207237 ] Lefty Leverenz commented on HIVE-13107: --- Does this need to be documented in the wiki? (If so, please add a TODOC2.1 label.) It could go in a new subsection of Hive Logging: * [Getting Started -- Hive Logging | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-HiveLogging] > LLAP: Rotate GC logs periodically to prevent full disks > --- > > Key: HIVE-13107 > URL: https://issues.apache.org/jira/browse/HIVE-13107 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Fix For: 2.1.0 > > Attachments: HIVE-13107.1.patch, HIVE-13107.2.patch > > > STDOUT cannot be rotated easily, so log GC logs to a different file and > rotate periodically with -XX:+UseGCLogFileRotation > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13300) Hive on spark throws exception for multi-insert with join
[ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207228#comment-15207228 ] Szehon Ho commented on HIVE-13300: -- Thanks, I'll take a look and file if there are not. > Hive on spark throws exception for multi-insert with join > - > > Key: HIVE-13300 > URL: https://issues.apache.org/jira/browse/HIVE-13300 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-13300.2.patch, HIVE-13300.3.patch, HIVE-13300.patch > > > For certain multi-insert queries, Hive on Spark throws a deserialization > error. > {noformat} > create table status_updates(userid int,status string,ds string); > create table profiles(userid int,school string,gender int); > drop table school_summary; create table school_summary(school string,cnt int) > partitioned by (ds string); > drop table gender_summary; create table gender_summary(gender int,cnt int) > partitioned by (ds string); > insert into status_updates values (1, "status_1", "2016-03-16"); > insert into profiles values (1, "school_1", 0); > set hive.auto.convert.join=false; > set hive.execution.engine=spark; > FROM (SELECT a.status, b.school, b.gender > FROM status_updates a JOIN profiles b > ON (a.userid = b.userid and > a.ds='2009-03-20' ) > ) subq1 > INSERT OVERWRITE TABLE gender_summary > PARTITION(ds='2009-03-20') > SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender > INSERT OVERWRITE TABLE school_summary > PARTITION(ds='2009-03-20') > SELECT subq1.school, COUNT(1) GROUP BY subq1.school > {noformat} > Error: > {noformat} > 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost > task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable > to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > 
serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251) > ... 
12 more > Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249) > ... 12 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:237) > ... 13 more >
[jira] [Comment Edited] (HIVE-13300) Hive on spark throws exception for multi-insert with join
[ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207176#comment-15207176 ] Xuefu Zhang edited comment on HIVE-13300 at 3/22/16 8:24 PM: - +1. Just wondering whether these test failures are related or tracked in other jiras. was (Author: xuefuz): +1 > Hive on spark throws exception for multi-insert with join > - > > Key: HIVE-13300 > URL: https://issues.apache.org/jira/browse/HIVE-13300 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-13300.2.patch, HIVE-13300.3.patch, HIVE-13300.patch > > > For certain multi-insert queries, Hive on Spark throws a deserialization > error. > {noformat} > create table status_updates(userid int,status string,ds string); > create table profiles(userid int,school string,gender int); > drop table school_summary; create table school_summary(school string,cnt int) > partitioned by (ds string); > drop table gender_summary; create table gender_summary(gender int,cnt int) > partitioned by (ds string); > insert into status_updates values (1, "status_1", "2016-03-16"); > insert into profiles values (1, "school_1", 0); > set hive.auto.convert.join=false; > set hive.execution.engine=spark; > FROM (SELECT a.status, b.school, b.gender > FROM status_updates a JOIN profiles b > ON (a.userid = b.userid and > a.ds='2009-03-20' ) > ) subq1 > INSERT OVERWRITE TABLE gender_summary > PARTITION(ds='2009-03-20') > SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender > INSERT OVERWRITE TABLE school_summary > PARTITION(ds='2009-03-20') > SELECT subq1.school, COUNT(1) GROUP BY subq1.school > {noformat} > Error: > {noformat} > 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost > task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable > to deserialize reduce input key from x1x128x0x0 
with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251) > ... 
12 more > Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249) > ... 12 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288) > at >
[jira] [Commented] (HIVE-13307) LLAP: Slider package should contain permanent functions
[ https://issues.apache.org/jira/browse/HIVE-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207192#comment-15207192 ] Sergey Shelukhin commented on HIVE-13307: - +1, can you remove the getMSC code on commit? > LLAP: Slider package should contain permanent functions > --- > > Key: HIVE-13307 > URL: https://issues.apache.org/jira/browse/HIVE-13307 > Project: Hive > Issue Type: New Feature > Components: llap >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Gopal V > Labels: TODOC2.1 > Attachments: HIVE-13307.1.patch > > > This renames a previous configuration option > hive.llap.daemon.allow.permanent.fns -> > hive.llap.daemon.download.permanent.fns > and adds a new parameter for LlapDecider > hive.llap.allow.permanent.fns > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-9660 started by Sergey Shelukhin. -- > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP2.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work stopped] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-9660 stopped by Sergey Shelukhin. -- > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP2.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Status: Patch Available (was: Open) > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP2.patch, HIVE-9660.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13300) Hive on spark throws exception for multi-insert with join
[ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207176#comment-15207176 ] Xuefu Zhang commented on HIVE-13300: +1 > Hive on spark throws exception for multi-insert with join > - > > Key: HIVE-13300 > URL: https://issues.apache.org/jira/browse/HIVE-13300 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-13300.2.patch, HIVE-13300.3.patch, HIVE-13300.patch > > > For certain multi-insert queries, Hive on Spark throws a deserialization > error. > {noformat} > create table status_updates(userid int,status string,ds string); > create table profiles(userid int,school string,gender int); > drop table school_summary; create table school_summary(school string,cnt int) > partitioned by (ds string); > drop table gender_summary; create table gender_summary(gender int,cnt int) > partitioned by (ds string); > insert into status_updates values (1, "status_1", "2016-03-16"); > insert into profiles values (1, "school_1", 0); > set hive.auto.convert.join=false; > set hive.execution.engine=spark; > FROM (SELECT a.status, b.school, b.gender > FROM status_updates a JOIN profiles b > ON (a.userid = b.userid and > a.ds='2009-03-20' ) > ) subq1 > INSERT OVERWRITE TABLE gender_summary > PARTITION(ds='2009-03-20') > SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender > INSERT OVERWRITE TABLE school_summary > PARTITION(ds='2009-03-20') > SELECT subq1.school, COUNT(1) GROUP BY subq1.school > {noformat} > Error: > {noformat} > 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost > task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable > to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > 
serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251) > ... 
12 more > Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249) > ... 12 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:237) > ... 13 more > {noformat} -- This message was sent by
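[Editor's note] The HIVE-13300 trace above bottoms out in BinarySortableSerDe.deserializeInt running out of bytes mid-key. As a rough illustration only (a simplified Python sketch of the general sort-order-preserving int encoding, ignoring Hive's descending-sort and null-ordering options, and not Hive's actual code), an int key is one null-marker byte followed by the value big-endian with its sign bit flipped so that unsigned byte-wise comparison matches signed numeric order. Reading "x1x128x0x0" as the byte sequence 1, 128, 0, 0, the key is one byte short of a complete encoding of 0, which is exactly the EOF failure mode:

```python
import struct

def serialize_int(value):
    # 0x01 marker = non-null, then big-endian int with the sign bit
    # flipped so unsigned byte comparison matches signed numeric order.
    return b"\x01" + struct.pack(">I", (value ^ (1 << 31)) & 0xFFFFFFFF)

def deserialize_int(buf):
    if len(buf) >= 1 and buf[0] == 0:  # null marker
        return None
    if len(buf) < 5:                   # marker + 4 data bytes required
        raise EOFError("truncated reduce key: %r" % (buf,))
    (raw,) = struct.unpack(">I", buf[1:5])
    u = raw ^ (1 << 31)                # undo the sign-bit flip
    return u - (1 << 32) if u >= (1 << 31) else u

# encoded keys sort byte-wise in numeric order
assert serialize_int(-1) < serialize_int(0) < serialize_int(1)

# the key from the trace, 0x01 0x80 0x00 0x00, is one byte short of
# the 5-byte encoding of 0 and fails just like deserializeInt does
try:
    deserialize_int(b"\x01\x80\x00\x00")
except EOFError:
    pass
```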
[jira] [Updated] (HIVE-13107) LLAP: Rotate GC logs periodically to prevent full disks
[ https://issues.apache.org/jira/browse/HIVE-13107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13107: --- Resolution: Fixed Fix Version/s: 2.1.0 Release Note: LLAP: Rotate GC logs periodically to prevent full disks (Gopal V, reviewed by Prasanth Jayachandran) Status: Resolved (was: Patch Available) > LLAP: Rotate GC logs periodically to prevent full disks > --- > > Key: HIVE-13107 > URL: https://issues.apache.org/jira/browse/HIVE-13107 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Fix For: 2.1.0 > > Attachments: HIVE-13107.1.patch, HIVE-13107.2.patch > > > STDOUT cannot be rotated easily, so log GC logs to a different file and > rotate periodically with -XX:+UseGCLogFileRotation > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13107) LLAP: Rotate GC logs periodically to prevent full disks
[ https://issues.apache.org/jira/browse/HIVE-13107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13107: --- Attachment: HIVE-13107.2.patch Removing the extra verbose logging with the PrintGCApplicationConcurrentTime. > LLAP: Rotate GC logs periodically to prevent full disks > --- > > Key: HIVE-13107 > URL: https://issues.apache.org/jira/browse/HIVE-13107 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Attachments: HIVE-13107.1.patch, HIVE-13107.2.patch > > > STDOUT cannot be rotated easily, so log GC logs to a different file and > rotate periodically with -XX:+UseGCLogFileRotation > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
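[Editor's note] The rotation in HIVE-13107 relies on standard HotSpot flags. A hypothetical helper (the function name and default values are illustrative, not taken from the patch) that a launcher script might use to assemble them:

```python
def gc_log_opts(log_path, num_files=4, file_size_mb=20):
    """Build HotSpot GC-logging options that rotate the GC log
    instead of letting one file (or stdout) grow without bound."""
    return [
        "-Xloggc:%s" % log_path,                  # separate file, not stdout
        "-XX:+UseGCLogFileRotation",              # enable rotation
        "-XX:NumberOfGCLogFiles=%d" % num_files,  # keep a bounded set
        "-XX:GCLogFileSize=%dM" % file_size_mb,   # rotate at this size
    ]

print(" ".join(gc_log_opts("/var/log/llap/gc.log")))
# -> -Xloggc:/var/log/llap/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=20M
```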
[jira] [Updated] (HIVE-12606) HCatalog ORC Null values in fields results in NullPointer exception
[ https://issues.apache.org/jira/browse/HIVE-12606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-12606: Component/s: Hive > HCatalog ORC Null values in fields results in NullPointer exception > --- > > Key: HIVE-12606 > URL: https://issues.apache.org/jira/browse/HIVE-12606 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive >Affects Versions: 0.13.1 > Environment: Linux >Reporter: Z. S. > > When reading via HCatalog an ORC table that has null values in fields it > fails with the following exception: > 15/12/07 19:47:42 INFO mapred.Task: Using ResourceCalculatorProcessTree : > null > 15/12/07 19:47:42 INFO mapred.MapTask: Processing split: > org.apache.hive.hcatalog.mapreduce.HCatSplit@4c8c30bc > 15/12/07 19:47:42 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) > 15/12/07 19:47:42 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 > 15/12/07 19:47:42 INFO mapred.MapTask: soft limit at 83886080 > 15/12/07 19:47:42 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 > 15/12/07 19:47:42 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 > 15/12/07 19:47:42 INFO mapred.MapTask: Map output collector class = > org.apache.hadoop.mapred.MapTask$MapOutputBuffer > 15/12/07 19:47:42 INFO orc.ReaderImpl: Reading ORC rows from > hdfs://[REDACTED]/00_0 with {include: null, offset: 0, length: 1628} > 15/12/07 19:47:42 INFO mapred.MapTask: Ignoring exception during close for > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader@5096bee4 > java.lang.NullPointerException > at > org.apache.hive.hcatalog.mapreduce.HCatRecordReader.close(HCatRecordReader.java:223) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.close(MapTask.java:520) > at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1999) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at > 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 15/12/07 19:47:42 INFO mapred.MapTask: Starting flush of map output > 15/12/07 19:47:42 INFO mapred.LocalJobRunner: map task executor complete. > 15/12/07 19:47:42 INFO mapreduce.FileOutputCommitterContainer: Job failed. > Try cleaning up temporary directory > [hdfs://bd/user/hive/warehouse/test.db/billing_aolon_revenue_output_stream/_DYN0.44164173619220104]. > 15/12/07 19:47:42 INFO mapreduce.FileOutputCommitterContainer: Cancelling > delegation token for the job. > 15/12/07 19:47:42 WARN conf.Configuration: > file:/tmp/hadoop-/mapred/local/localRunner/job_local413328602_0001/job_local413328602_0001.xml:an > attempt to override final parameter: > mapreduce.job.end-notification.max.retry.interval; Ignoring. > 15/12/07 19:47:42 WARN conf.Configuration: > file:/tmp/hadoop-/mapred/local/localRunner/job_local413328602_0001/job_local413328602_0001.xml:an > attempt to override final parameter: > mapreduce.job.end-notification.max.attempts; Ignoring. > 15/12/07 19:47:42 INFO hive.metastore: Trying to connect to metastore with > URI thrift://bd:9083 > 15/12/07 19:47:42 INFO hive.metastore: Connected to metastore. 
> 15/12/07 19:47:42 WARN mapred.LocalJobRunner: job_local413328602_0001 > java.lang.Exception: java.lang.NullPointerException > at > org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringDictionaryTreeReader.startStripe(RecordReaderImpl.java:1545) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StringTreeReader.startStripe(RecordReaderImpl.java:1337) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.startStripe(RecordReaderImpl.java:1825) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2537) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:2950) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:2992) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:284) > at >
[jira] [Updated] (HIVE-13107) LLAP: Rotate GC logs periodically to prevent full disks
[ https://issues.apache.org/jira/browse/HIVE-13107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13107: --- Attachment: (was: HIVE-13107.2.patch) > LLAP: Rotate GC logs periodically to prevent full disks > --- > > Key: HIVE-13107 > URL: https://issues.apache.org/jira/browse/HIVE-13107 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Attachments: HIVE-13107.1.patch > > > STDOUT cannot be rotated easily, so log GC logs to a different file and > rotate periodically with -XX:+UseGCLogFileRotation > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13107) LLAP: Rotate GC logs periodically to prevent full disks
[ https://issues.apache.org/jira/browse/HIVE-13107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-13107: --- Attachment: HIVE-13107.2.patch > LLAP: Rotate GC logs periodically to prevent full disks > --- > > Key: HIVE-13107 > URL: https://issues.apache.org/jira/browse/HIVE-13107 > Project: Hive > Issue Type: Improvement > Components: llap >Affects Versions: 2.0.0, 2.1.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Trivial > Attachments: HIVE-13107.1.patch, HIVE-13107.2.patch > > > STDOUT cannot be rotated easily, so log GC logs to a different file and > rotate periodically with -XX:+UseGCLogFileRotation > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13300) Hive on spark throws exception for multi-insert with join
[ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207100#comment-15207100 ] Szehon Ho commented on HIVE-13300: -- Test failures do not look related (SparkCliDriver tests have been timing out a lot lately, need to investigate). [~xuefuz] [~csun] can you take another look at latest patch? > Hive on spark throws exception for multi-insert with join > - > > Key: HIVE-13300 > URL: https://issues.apache.org/jira/browse/HIVE-13300 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-13300.2.patch, HIVE-13300.3.patch, HIVE-13300.patch > > > For certain multi-insert queries, Hive on Spark throws a deserialization > error. > {noformat} > create table status_updates(userid int,status string,ds string); > create table profiles(userid int,school string,gender int); > drop table school_summary; create table school_summary(school string,cnt int) > partitioned by (ds string); > drop table gender_summary; create table gender_summary(gender int,cnt int) > partitioned by (ds string); > insert into status_updates values (1, "status_1", "2016-03-16"); > insert into profiles values (1, "school_1", 0); > set hive.auto.convert.join=false; > set hive.execution.engine=spark; > FROM (SELECT a.status, b.school, b.gender > FROM status_updates a JOIN profiles b > ON (a.userid = b.userid and > a.ds='2009-03-20' ) > ) subq1 > INSERT OVERWRITE TABLE gender_summary > PARTITION(ds='2009-03-20') > SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender > INSERT OVERWRITE TABLE school_summary > PARTITION(ds='2009-03-20') > SELECT subq1.school, COUNT(1) GROUP BY subq1.school > {noformat} > Error: > {noformat} > 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost > task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable > to 
deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:724) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error: Unable to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:251) > ... 
12 more > Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:241) > at > org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:249) > ... 12 more > Caused by: java.io.EOFException > at > org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:597) > at > org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:288) > at >
[jira] [Commented] (HIVE-13300) Hive on spark throws exception for multi-insert with join
[ https://issues.apache.org/jira/browse/HIVE-13300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207093#comment-15207093 ] Hive QA commented on HIVE-13300: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12794698/HIVE-13300.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9852 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7337/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7337/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7337/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12794698 - PreCommit-HIVE-TRUNK-Build > Hive on spark throws exception for multi-insert with join > - > > Key: HIVE-13300 > URL: https://issues.apache.org/jira/browse/HIVE-13300 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho > Attachments: HIVE-13300.2.patch, HIVE-13300.3.patch, HIVE-13300.patch > > > For certain multi-insert queries, Hive on Spark throws a deserialization > error. > {noformat} > create table status_updates(userid int,status string,ds string); > create table profiles(userid int,school string,gender int); > drop table school_summary; create table school_summary(school string,cnt int) > partitioned by (ds string); > drop table gender_summary; create table gender_summary(gender int,cnt int) > partitioned by (ds string); > insert into status_updates values (1, "status_1", "2016-03-16"); > insert into profiles values (1, "school_1", 0); > set hive.auto.convert.join=false; > set hive.execution.engine=spark; > FROM (SELECT a.status, b.school, b.gender > FROM status_updates a JOIN profiles b > ON (a.userid = b.userid and > a.ds='2009-03-20' ) > ) subq1 > INSERT OVERWRITE TABLE gender_summary > PARTITION(ds='2009-03-20') > SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender > INSERT OVERWRITE TABLE school_summary > PARTITION(ds='2009-03-20') > SELECT subq1.school, COUNT(1) GROUP BY subq1.school > {noformat} > Error: > {noformat} > 16/03/17 13:29:00 [task-result-getter-3]: WARN scheduler.TaskSetManager: Lost > task 0.0 in stage 2.0 (TID 3, localhost): java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable > to deserialize reduce input key from x1x128x0x0 with properties > {serialization.sort.order.null=a, columns=reducesinkkey0, > serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe, > serialization.sort.order=+, columns.types=int} > at > 
org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:279) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:49) > at > org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28) > at > org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95) > at > scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) > at org.apache.spark.scheduler.Task.run(Task.scala:89) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at >
[jira] [Commented] (HIVE-13115) MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null
[ https://issues.apache.org/jira/browse/HIVE-13115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207080#comment-15207080 ] Carl Steinbach commented on HIVE-13115: --- +1 > MetaStore Direct SQL getPartitions call fail when the columns schemas for a > partition are null > -- > > Key: HIVE-13115 > URL: https://issues.apache.org/jira/browse/HIVE-13115 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: DirectSql, MetaStore, ORM > Attachments: HIVE-13115.patch, HIVE-13115.reproduce.issue.patch > > > We are seeing the following exception in our MetaStore logs > {noformat} > 2016-02-11 00:00:19,002 DEBUG metastore.MetaStoreDirectSql > (MetaStoreDirectSql.java:timingTrace(602)) - Direct SQL query in 5.842372ms + > 1.066728ms, the query is [select "PARTITIONS"."PART_ID" from "PARTITIONS" > inner join "TBLS" on "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join > "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? 
order by > "PART_NAME" asc] > 2016-02-11 00:00:19,021 ERROR metastore.ObjectStore > (ObjectStore.java:handleDirectSqlError(2243)) - Direct SQL failed, falling > back to ORM > MetaException(message:Unexpected null for one of the IDs, SD 6437, column > null, serde 6437 for a non-view) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:360) > at > org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitions(MetaStoreDirectSql.java:224) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1563) > at > org.apache.hadoop.hive.metastore.ObjectStore$1.getSqlResult(ObjectStore.java:1559) > at > org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsInternal(ObjectStore.java:1570) > at > org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1553) > at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108) > at com.sun.proxy.$Proxy5.getPartitions(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:2526) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8747) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:8731) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:617) > at > 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:613) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1591) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:613) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} > This direct SQL call fails for every {{getPartitions}} call and then falls > back to ORM. > The query which fails is > {code} > select > PARTITIONS.PART_ID, SDS.SD_ID, SDS.CD_ID, > SERDES.SERDE_ID, PARTITIONS.CREATE_TIME, > PARTITIONS.LAST_ACCESS_TIME, SDS.INPUT_FORMAT, SDS.IS_COMPRESSED, > SDS.IS_STOREDASSUBDIRECTORIES, SDS.LOCATION, SDS.NUM_BUCKETS, > SDS.OUTPUT_FORMAT, SERDES.NAME, SERDES.SLIB > from PARTITIONS > left outer join SDS on PARTITIONS.SD_ID = SDS.SD_ID > left outer join SERDES on SDS.SERDE_ID = SERDES.SERDE_ID > where PART_ID in ( ? ) order by PART_NAME asc; > {code} > By looking at the source
[jira] [Updated] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin
[ https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-12616: - Attachment: HIVE-12616.3.patch Looks like the patch is reviewed, but no longer applies. I rebased and will checkin if test still pass. > NullPointerException when spark session is reused to run a mapjoin > -- > > Key: HIVE-12616 > URL: https://issues.apache.org/jira/browse/HIVE-12616 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.3.0 >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-12616.1.patch, HIVE-12616.2.patch, > HIVE-12616.3.patch, HIVE-12616.patch > > > The way to reproduce: > {noformat} > set hive.execution.engine=spark; > create table if not exists test(id int); > create table if not exists test1(id int); > insert into test values(1); > insert into test1 values(1); > select max(a.id) from test a ,test1 b > where a.id = b.id; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13283: Attachment: HIVE-13283.03.patch Addressing the comment about getBoolVar > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.01.patch, HIVE-13283.02.patch, > HIVE-13283.03.patch, HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207040#comment-15207040 ] Sergey Shelukhin commented on HIVE-13283: - Ah. nm, I see jobconf works correctly. Yeah the intent is to change the default, but only in the daemon > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.01.patch, HIVE-13283.02.patch, > HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207015#comment-15207015 ] Aihua Xu commented on HIVE-13286: - [~vikram.dixit] Those tests are not related. Sorry. Forgot to mention that. > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, > HIVE-13286.3.patch, HIVE-13286.4.patch > > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
[ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207016#comment-15207016 ] Sergey Shelukhin commented on HIVE-13226: - Is it possible to rename "Start" and "Finish" to something less confusing? DAG startup, DAG runtime? > Improve tez print summary to print query execution breakdown > > > Key: HIVE-13226 > URL: https://issues.apache.org/jira/browse/HIVE-13226 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.1.0 > > Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, > HIVE-13226.3.patch, sampleoutput.png > > > When tez print summary is enabled, a methods summary is printed which is > difficult to correlate with the actual execution time. We can improve that to > print the execution times in the sequence of operations that happen behind > the scenes. > Instead of printing the method names it will be useful to print something > like below > 1) Query Compilation time > 2) Query Submit to DAG Submit time > 3) DAG Submit to DAG Accept time > 4) DAG Accept to DAG Start time > 5) DAG Start to DAG End time > With this it will be easier to find out where the actual time is spent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
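[Editor's note] The breakdown proposed in HIVE-13226 is simply the set of deltas between consecutive lifecycle timestamps. As an illustration only (the event names below are hypothetical, not Hive's actual PerfLogger keys), the computation amounts to:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Computes the consecutive elapsed-time breakdown proposed in HIVE-13226
// from a sequence of lifecycle timestamps (milliseconds). Event names are
// illustrative only; Hive's real keys live elsewhere (e.g. PerfLogger).
public class DagTimingBreakdown {
    static LinkedHashMap<String, Long> breakdown(LinkedHashMap<String, Long> events) {
        LinkedHashMap<String, Long> out = new LinkedHashMap<>();
        Map.Entry<String, Long> prev = null;
        for (Map.Entry<String, Long> e : events.entrySet()) {
            if (prev != null) {
                // Interval name, e.g. "querySubmit -> dagSubmit".
                out.put(prev.getKey() + " -> " + e.getKey(),
                        e.getValue() - prev.getValue());
            }
            prev = e;
        }
        return out;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Long> ev = new LinkedHashMap<>();
        ev.put("compileStart", 0L);
        ev.put("querySubmit", 120L);
        ev.put("dagSubmit", 150L);
        ev.put("dagAccept", 200L);
        ev.put("dagStart", 400L);
        ev.put("dagEnd", 5400L);
        breakdown(ev).forEach((k, v) -> System.out.println(k + ": " + v + " ms"));
    }
}
```

With this shape, the five intervals from the description fall out directly once the six timestamps are recorded in order.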
[jira] [Updated] (HIVE-13297) Set default field separator instead of ^A
[ https://issues.apache.org/jira/browse/HIVE-13297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-13297: Assignee: (was: Yongzhi Chen) > Set default field separator instead of ^A > - > > Key: HIVE-13297 > URL: https://issues.apache.org/jira/browse/HIVE-13297 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Cristian > > By default, Hive tables are created with ^A as the field delimiter. It can be > changed by users defining the correct value in tblproperties or > serdeproperties. > The default field separator should be configurable, and perhaps other > defaults such as the line separator as well, in order to avoid specifying it > for each table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207012#comment-15207012 ] Sergey Shelukhin commented on HIVE-13283: - Hmm.. actually yeah, this doesn't work either, the client-side setting is now ignored. It would need to be propagated with the plan. > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.01.patch, HIVE-13283.02.patch, > HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions
[ https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13151: - Attachment: HIVE-13151.4.patch Upload patch 4 for test > Clean up UGI objects in FileSystem cache for transactions > - > > Key: HIVE-13151 > URL: https://issues.apache.org/jira/browse/HIVE-13151 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13151.1.patch, HIVE-13151.2.patch, > HIVE-13151.3.patch, HIVE-13151.4.patch > > > One issue with FileSystem.CACHE is that it does not clean itself. The key in > that cache includes UGI object. When new UGI objects are created and used > with the FileSystem api, new entries get added to the cache. > We need to manually clean up those UGI objects once they are no longer in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
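[Editor's note] The leak pattern described in HIVE-13151 is generic: a cache whose key embeds an identity-compared user object can only shrink when entries are removed explicitly. The sketch below is a self-contained analogue, not Hadoop's actual FileSystem.CACHE code; the explicit removal step plays the role that a cleanup call such as FileSystem.closeAllForUGI() plays in the real fix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Minimal analogue of a FileSystem-style cache whose key embeds a user
// identity object. Hadoop's real cache key is (scheme, authority, UGI);
// here we model only the identity part to show why it leaks.
public class UgiCacheSketch {
    // Stand-in for a UGI: uses default (identity) equals, so two instances
    // for the same user are distinct cache keys.
    static final class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    static final class Key {
        final String scheme;
        final Ugi ugi;
        Key(String scheme, Ugi ugi) { this.scheme = scheme; this.ugi = ugi; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return scheme.equals(k.scheme) && ugi == k.ugi; // identity compare
        }
        @Override public int hashCode() {
            return Objects.hash(scheme, System.identityHashCode(ugi));
        }
    }

    final Map<Key, Object> cache = new HashMap<>();

    Object get(String scheme, Ugi ugi) {
        // Every never-before-seen UGI instance adds a new entry.
        return cache.computeIfAbsent(new Key(scheme, ugi), k -> new Object());
    }

    // Explicit cleanup, analogous in spirit to FileSystem.closeAllForUGI(ugi).
    void closeAllFor(Ugi ugi) {
        cache.keySet().removeIf(k -> k.ugi == ugi);
    }

    public static void main(String[] args) {
        UgiCacheSketch c = new UgiCacheSketch();
        // Two UGI objects for the same user still create two entries: a leak.
        c.get("hdfs", new Ugi("hive"));
        c.get("hdfs", new Ugi("hive"));
        System.out.println("entries=" + c.cache.size());
        Ugi u = new Ugi("hive");
        c.get("hdfs", u);
        c.closeAllFor(u);
        System.out.println("after cleanup=" + c.cache.size());
    }
}
```

This is why the patch must track which UGI objects it created and remove their entries once each transaction is done with them.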
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206934#comment-15206934 ] Eugene Koifman commented on HIVE-11388: --- This patch also includes a 1-character fix to an issue introduced in HIVE-13013 (SQL stmt in TxnHandler.lockTransactionRecord()) > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206928#comment-15206928 ] Eugene Koifman commented on HIVE-11388: --- Fix for HIVE-12725 is included here > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions
[ https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206923#comment-15206923 ] Eugene Koifman commented on HIVE-13151: --- That is +1 modulo my comments above > Clean up UGI objects in FileSystem cache for transactions > - > > Key: HIVE-13151 > URL: https://issues.apache.org/jira/browse/HIVE-13151 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13151.1.patch, HIVE-13151.2.patch, > HIVE-13151.3.patch > > > One issue with FileSystem.CACHE is that it does not clean itself. The key in > that cache includes UGI object. When new UGI objects are created and used > with the FileSystem api, new entries get added to the cache. > We need to manually clean up those UGI objects once they are no longer in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13151) Clean up UGI objects in FileSystem cache for transactions
[ https://issues.apache.org/jira/browse/HIVE-13151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206920#comment-15206920 ] Eugene Koifman commented on HIVE-13151: --- I talked to Thejas and now I understand this better. [~wzheng] +1 on the patch > Clean up UGI objects in FileSystem cache for transactions > - > > Key: HIVE-13151 > URL: https://issues.apache.org/jira/browse/HIVE-13151 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13151.1.patch, HIVE-13151.2.patch, > HIVE-13151.3.patch > > > One issue with FileSystem.CACHE is that it does not clean itself. The key in > that cache includes UGI object. When new UGI objects are created and used > with the FileSystem api, new entries get added to the cache. > We need to manually clean up those UGI objects once they are no longer in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12612) beeline always exits with 0 status when reading query from standard input
[ https://issues.apache.org/jira/browse/HIVE-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-12612: -- Attachment: HIVE-12612.01.patch > beeline always exits with 0 status when reading query from standard input > - > > Key: HIVE-12612 > URL: https://issues.apache.org/jira/browse/HIVE-12612 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0 > Environment: CDH5.5.0 >Reporter: Paulo Sequeira >Assignee: Reuben Kuhnert >Priority: Minor > Attachments: HIVE-12612.01.patch > > > Similar to what was reported on HIVE-6978, but now it only happens when the > query is read from the standard input. For example, the following fails as > expected: > {code} > bash$ if beeline -u "jdbc:hive2://..." -e "boo;" ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Error: Error while compiling statement: FAILED: ParseException line 1:0 > cannot recognize input near 'boo' '' '' (state=42000,code=4) > Closing: 0: jdbc:hive2://... > Failed! > {code} > But the following does not: > {code} > bash$ if echo "boo;"|beeline -u "jdbc:hive2://..." ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 1.1.0-cdh5.5.0 by Apache Hive > 0: jdbc:hive2://...:8> Error: Error while compiling statement: FAILED: > ParseException line 1:0 cannot recognize input near 'boo' '' '' > (state=42000,code=4) > 0: jdbc:hive2://...:8> Closing: 0: jdbc:hive2://... > Ok?! > {code} > This was misleading our batch scripts to always believe that the execution of > the queries succeeded, when sometimes that was not the case. > h2. 
Workaround > We found we can work around the issue by always using the -e or the -f > parameters, and even reading the standard input through the /dev/stdin device > (this was useful because a lot of the scripts fed the queries from here > documents), like this: > {code:title=some-script.sh} > #!/bin/sh > set -o nounset -o errexit -o pipefail > # As beeline is failing to report an error status if reading the query > # to be executed from STDIN, check whether no -f or -e option is used > # and, in that case, pretend it has to read the query from a regular > # file using -f to read from /dev/stdin > function beeline_workaround_exit_status () { > for arg in "$@" > do if [ "$arg" = "-f" -o "$arg" = "-e" ] >then beeline -u "..." "$@" > return >fi > done > beeline -u "..." "$@" -f /dev/stdin > } > beeline_workaround_exit_status <<EOF > boo; > EOF > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12612) beeline always exits with 0 status when reading query from standard input
[ https://issues.apache.org/jira/browse/HIVE-12612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-12612: -- Status: Patch Available (was: Open) > beeline always exits with 0 status when reading query from standard input > - > > Key: HIVE-12612 > URL: https://issues.apache.org/jira/browse/HIVE-12612 > Project: Hive > Issue Type: Bug > Components: Beeline >Affects Versions: 1.1.0 > Environment: CDH5.5.0 >Reporter: Paulo Sequeira >Assignee: Reuben Kuhnert >Priority: Minor > Attachments: HIVE-12612.01.patch > > > Similar to what was reported on HIVE-6978, but now it only happens when the > query is read from the standard input. For example, the following fails as > expected: > {code} > bash$ if beeline -u "jdbc:hive2://..." -e "boo;" ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Error: Error while compiling statement: FAILED: ParseException line 1:0 > cannot recognize input near 'boo' '' '' (state=42000,code=4) > Closing: 0: jdbc:hive2://... > Failed! > {code} > But the following does not: > {code} > bash$ if echo "boo;"|beeline -u "jdbc:hive2://..." ; then echo "Ok?!" ; else > echo "Failed!" ; fi > Connecting to jdbc:hive2://... > Connected to: Apache Hive (version 1.1.0-cdh5.5.0) > Driver: Hive JDBC (version 1.1.0-cdh5.5.0) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Beeline version 1.1.0-cdh5.5.0 by Apache Hive > 0: jdbc:hive2://...:8> Error: Error while compiling statement: FAILED: > ParseException line 1:0 cannot recognize input near 'boo' '' '' > (state=42000,code=4) > 0: jdbc:hive2://...:8> Closing: 0: jdbc:hive2://... > Ok?! > {code} > This was misleading our batch scripts to always believe that the execution of > the queries succeeded, when sometimes that was not the case. > h2. 
Workaround > We found we can work around the issue by always using the -e or the -f > parameters, and even reading the standard input through the /dev/stdin device > (this was useful because a lot of the scripts fed the queries from here > documents), like this: > {code:title=some-script.sh} > #!/bin/sh > set -o nounset -o errexit -o pipefail > # As beeline is failing to report an error status if reading the query > # to be executed from STDIN, check whether no -f or -e option is used > # and, in that case, pretend it has to read the query from a regular > # file using -f to read from /dev/stdin > function beeline_workaround_exit_status () { > for arg in "$@" > do if [ "$arg" = "-f" -o "$arg" = "-e" ] >then beeline -u "..." "$@" > return >fi > done > beeline -u "..." "$@" -f /dev/stdin > } > beeline_workaround_exit_status <<EOF > boo; > EOF > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13303) spill to YARN directories, not tmp, when available
[ https://issues.apache.org/jira/browse/HIVE-13303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206881#comment-15206881 ] Sergey Shelukhin commented on HIVE-13303: - [~gopalv] [~sseth] can you please review? > spill to YARN directories, not tmp, when available > -- > > Key: HIVE-13303 > URL: https://issues.apache.org/jira/browse/HIVE-13303 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13303.patch > > > RowContainer::setupWriter, HybridHashTableContainer::spillPartition, > (KeyValueContainer|ObjectContainer)::setupOutput, > VectorMapJoinRowBytesContainer::setupOutputFileStreams create files in tmp. > Maybe some other code does it too, those are the ones I see on the execution > path. When there are multiple YARN output directories and multiple tasks > running on a machine, it's better to use the YARN directories. The only > question is cleanup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13329) Hive query id should not be allowed to be modified by users.
[ https://issues.apache.org/jira/browse/HIVE-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13329: -- External issue ID: HIVE-13286 (was: HIVE-13296) > Hive query id should not be allowed to be modified by users. > > > Key: HIVE-13329 > URL: https://issues.apache.org/jira/browse/HIVE-13329 > Project: Hive > Issue Type: Bug >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206870#comment-15206870 ] Vikram Dixit K commented on HIVE-13286: --- [~aihuaxu] Are the test failures related? Otherwise let me know and I can commit the patch to master and branch-2. I will raise a follow-on jira for disallowing the user to set this configuration. > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > Attachments: HIVE-13286.1.patch, HIVE-13286.2.patch, > HIVE-13286.3.patch, HIVE-13286.4.patch > > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12439) CompactionTxnHandler.markCleaned() and TxnHandler.openTxns() misc improvements
[ https://issues.apache.org/jira/browse/HIVE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206852#comment-15206852 ] Eugene Koifman commented on HIVE-12439: --- [~leftylev] The new props only apply to direct SQL from Metastore to Metastore DB. > CompactionTxnHandler.markCleaned() and TxnHandler.openTxns() misc improvements > -- > > Key: HIVE-12439 > URL: https://issues.apache.org/jira/browse/HIVE-12439 > Project: Hive > Issue Type: Improvement > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng > Labels: TODOC1.3, TODOC2.1 > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12439.1.patch, HIVE-12439.2.patch, > HIVE-12439.3.patch > > > # add a safeguard to make sure IN clause is not too large; break up by txn id > to delete from TXN_COMPONENTS where tc_txnid in ... > # TxnHandler.openTxns() - use 1 insert with many rows in values() clause, > rather than 1 DB roundtrip per row -- This message was sent by Atlassian JIRA (v6.3.4#6332)
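[Editor's note] The first item in HIVE-12439, bounding the size of the IN clause, amounts to chunking the txn-id list into fixed-size DELETE statements. A hedged sketch follows; the table and column names are taken from the issue description, the batch size is arbitrary, and this is not the actual TxnHandler code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Splits a potentially huge txn-id list into bounded
// "delete ... where TC_TXNID in (...)" statements, as suggested for
// CompactionTxnHandler.markCleaned(). The batch size here is arbitrary;
// the real limit would come from configuration or DB constraints.
public class InClauseBatcher {
    static List<String> deleteStatements(List<Long> txnIds, int batchSize) {
        List<String> stmts = new ArrayList<>();
        for (int i = 0; i < txnIds.size(); i += batchSize) {
            StringJoiner in = new StringJoiner(",", "(", ")");
            for (Long id : txnIds.subList(i, Math.min(i + batchSize, txnIds.size()))) {
                in.add(id.toString());
            }
            stmts.add("delete from TXN_COMPONENTS where TC_TXNID in " + in);
        }
        return stmts;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 7; i++) ids.add(i);
        // Seven ids with a batch size of three yield three DELETE statements.
        deleteStatements(ids, 3).forEach(System.out::println);
    }
}
```

The second item (one multi-row VALUES insert instead of one round trip per row) is the same idea in reverse: accumulate rows client-side and emit a single bounded statement.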
[jira] [Commented] (HIVE-11388) Allow ACID Compactor components to run in multiple metastores
[ https://issues.apache.org/jira/browse/HIVE-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206848#comment-15206848 ] Eugene Koifman commented on HIVE-11388: --- This change makes use of JDBC Connections, and thus the connection pool may need to be larger. Pool size is currently hardcoded. Should fix HIVE-12592. > Allow ACID Compactor components to run in multiple metastores > - > > Key: HIVE-11388 > URL: https://issues.apache.org/jira/browse/HIVE-11388 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-11388.2.patch, HIVE-11388.4.patch, > HIVE-11388.5.patch, HIVE-11388.6.patch, HIVE-11388.7.patch, HIVE-11388.patch > > > (this description is no longer accurate; see further comments) > org.apache.hadoop.hive.ql.txn.compactor.Initiator is a thread that runs > inside the metastore service to manage compactions of ACID tables. There > should be exactly 1 instance of this thread (even with multiple Thrift > services). > This is documented in > https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration > but not enforced. > Should add enforcement, since more than 1 Initiator could cause concurrent > attempts to compact the same table/partition - which will not work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12592) Expose connection pool tuning props in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-12592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206844#comment-15206844 ] Eugene Koifman commented on HIVE-12592: --- I don't think it's sufficient. If you look at TxnHandler.setupJdbcConnectionPool() - it explicitly sets some parameters for BoneCP which I imagine will override whatever is in bonecp-config.xml. So to make this work properly we likely need to add a "base" bonecp-config.xml to hive JAR that contains TxnHandler or make it available in some other way > Expose connection pool tuning props in TxnHandler > - > > Key: HIVE-12592 > URL: https://issues.apache.org/jira/browse/HIVE-12592 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Chetna Chaudhari > > BoneCP allows various pool tuning options like connection timeout, num > connections, etc > There should be a config based way to set these -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11221) In Tez mode, alter table concatenate orc files can intermittently fail with NPE
[ https://issues.apache.org/jira/browse/HIVE-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206833#comment-15206833 ] ashish shenoy commented on HIVE-11221: -- I hit this issue consistently as well; here's the stack trace when I use the Tez execution engine: VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED File Merge FAILED -1 0 0 -1 0 0 VERTICES: 00/01 [>>--] 0% ELAPSED TIME: 1458666880.00 s Status: Failed Vertex failed, vertexName=File Merge, vertexId=vertex_1455906569416_0009_1_00, diagnostics=[Vertex vertex_1455906569416_0009_1_00 [File Merge] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [] initializer failed, vertex=vertex_1455906569416_0009_1_00 [File Merge], java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295) at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) ] DAG failed due to vertex failure. failedVertices:1 killedVertices:0 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask We are still on Hive 0.14, and are planning to move to HDP 2.4 since we have observed hive to be very unstable, unpredictable and hence unreliable for merging ORC files as well as many other basic sql queries that presto successfully completes. Since 1.3.0 is not in HDP 2.4, is installing a custom hive jar the only option at this point to mitigate this issue ? How will ambari behave with a custom installation of hive ? > In Tez mode, alter table concatenate orc files can intermittently fail with > NPE > --- > > Key: HIVE-11221 > URL: https://issues.apache.org/jira/browse/HIVE-11221 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11221.1.patch > > > We are not waiting for input ready events which can trigger occasional NPE if > input is not actually ready. 
> Stacktrace: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) > at > org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) >
[jira] [Commented] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse
[ https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206809#comment-15206809 ] Matt McCline commented on HIVE-13310: - Committed to master. Classes are generated by GenVectorCode on branch-1 -- investigating. > Vectorized Projection Comparison Number Column to Scalar broken for !noNulls > and selectedInUse > -- > > Key: HIVE-13310 > URL: https://issues.apache.org/jira/browse/HIVE-13310 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13310.01.patch, HIVE-13310.02.patch > > > LongColEqualLongScalar.java > LongColGreaterEqualLongScalar.java > LongColGreaterLongScalar.java > LongColLessEqualLongScalar.java > LongColLessLongScalar.java > LongColNotEqualLongScalar.java > LongScalarEqualLongColumn.java > LongScalarGreaterEqualLongColumn.java > LongScalarGreaterLongColumn.java > LongScalarLessEqualLongColumn.java > LongScalarLessLongColumn.java > LongScalarNotEqualLongColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse
[ https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13310: Fix Version/s: (was: 1.3.0) > Vectorized Projection Comparison Number Column to Scalar broken for !noNulls > and selectedInUse > -- > > Key: HIVE-13310 > URL: https://issues.apache.org/jira/browse/HIVE-13310 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-13310.01.patch, HIVE-13310.02.patch > > > LongColEqualLongScalar.java > LongColGreaterEqualLongScalar.java > LongColGreaterLongScalar.java > LongColLessEqualLongScalar.java > LongColLessLongScalar.java > LongColNotEqualLongScalar.java > LongScalarEqualLongColumn.java > LongScalarGreaterEqualLongColumn.java > LongScalarGreaterLongColumn.java > LongScalarLessEqualLongColumn.java > LongScalarLessLongColumn.java > LongScalarNotEqualLongColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13310) Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse
[ https://issues.apache.org/jira/browse/HIVE-13310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206780#comment-15206780 ] Matt McCline commented on HIVE-13310: - Failures are unrelated. > Vectorized Projection Comparison Number Column to Scalar broken for !noNulls > and selectedInUse > -- > > Key: HIVE-13310 > URL: https://issues.apache.org/jira/browse/HIVE-13310 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-13310.01.patch, HIVE-13310.02.patch > > > LongColEqualLongScalar.java > LongColGreaterEqualLongScalar.java > LongColGreaterLongScalar.java > LongColLessEqualLongScalar.java > LongColLessLongScalar.java > LongColNotEqualLongScalar.java > LongScalarEqualLongColumn.java > LongScalarGreaterEqualLongColumn.java > LongScalarGreaterLongColumn.java > LongScalarLessEqualLongColumn.java > LongScalarLessLongColumn.java > LongScalarNotEqualLongColumn.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)