Review Request 62684: HIVE-17634:Estimate the column stats even not retrieve columns from metastore(hive.stats.fetch.column.stats as false)

2017-09-28 Thread kelly zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62684/
---

Review request for hive and Vineet Garg.


Bugs: HIVE-17634
https://issues.apache.org/jira/browse/HIVE-17634


Repository: hive-git


Description
---

Estimate the column stats even when they are not retrieved from the
metastore (hive.stats.fetch.column.stats set to false).


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java fde8c53 


Diff: https://reviews.apache.org/r/62684/diff/1/


Testing
---


Thanks,

kelly zhang



[jira] [Created] (HIVE-17650) DDLTask.handleRemoveMm() assumes locks not present

2017-09-28 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17650:
-

 Summary: DDLTask.handleRemoveMm() assumes locks not present
 Key: HIVE-17650
 URL: https://issues.apache.org/jira/browse/HIVE-17650
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Reporter: Eugene Koifman


This moves every file in the table from under delta_x_x/ to the root of the 
table/partition.
How would this work for bucketed tables?  Will it create bucket_x_copy_N files?
This could create thousands of copy_N files, which will likely break something.

The comments in the method assume locks are present. This would imply that 
the appropriate Read/WriteEntity objects have already been created, and I doubt 
this is the case for a table property change.

It seems like this kind of op should require an Exclusive lock at table level 
to prevent concurrent inserts (into new delta_x_x/).





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-17649) Export/Import: Move export data write to a task

2017-09-28 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-17649:
---

 Summary: Export/Import: Move export data write to a task
 Key: HIVE-17649
 URL: https://issues.apache.org/jira/browse/HIVE-17649
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Vaibhav Gumashta








[jira] [Created] (HIVE-17648) Delegation token can't be passed through from Hive to HBase when required by UDF

2017-09-28 Thread iBenny (JIRA)
iBenny created HIVE-17648:
-

 Summary: Delegation token can't be passed through from Hive to 
HBase when required by UDF
 Key: HIVE-17648
 URL: https://issues.apache.org/jira/browse/HIVE-17648
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Affects Versions: 1.1.1
Reporter: iBenny


When using the Hive CLI to query a Hive table with a UDF that needs to access 
HBase with Kerberos, the delegation token can't be passed through to HBase from 
the Hive CLI. Usually, if we access HBase directly from the Hive CLI, the CLI 
obtains the delegation token before the job starts. But in this case the Hive 
CLI doesn't know that the job needs an HBase delegation token until the UDF is 
running in the map task, and the map job then fails with:
GSS initiate failed [Caused by GSSException: No valid credentials provided 
(Mechanism level: Failed to find any Kerberos ...)]





[jira] [Created] (HIVE-17647) DDLTask.generateAddMmTasks(Table tbl) should not start transactions

2017-09-28 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17647:
-

 Summary: DDLTask.generateAddMmTasks(Table tbl) should not start 
transactions
 Key: HIVE-17647
 URL: https://issues.apache.org/jira/browse/HIVE-17647
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Reporter: Eugene Koifman


This method has 
{noformat}
  if (txnManager.isTxnOpen()) {
    mmWriteId = txnManager.getCurrentTxnId();
  } else {
    mmWriteId = txnManager.openTxn(new Context(conf), conf.getUser());
    txnManager.commitTxn();
  }
{noformat}
This should throw if there is no open transaction.  It should never open one.
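
A minimal sketch of the stricter behavior suggested above. The TxnManager 
interface here is a hypothetical stand-in that exposes only the two calls 
quoted in the snippet (isTxnOpen and getCurrentTxnId); it is not the real 
HiveTxnManager, and the exception type is an illustrative choice.

```java
/** Hypothetical stand-in for the transaction manager; not the real HiveTxnManager. */
interface TxnManager {
  boolean isTxnOpen();

  long getCurrentTxnId();
}

final class MmWriteIds {
  private MmWriteIds() {}

  /**
   * Returns the id of the already-open transaction and fails fast
   * instead of opening (and immediately committing) a new one.
   */
  static long requireOpenTxnId(TxnManager txnManager) {
    if (!txnManager.isTxnOpen()) {
      // Assumption: an unchecked exception is acceptable at this call site.
      throw new IllegalStateException("Expected an open transaction");
    }
    return txnManager.getCurrentTxnId();
  }
}
```

With this shape the caller, rather than the DDL task, stays responsible for 
opening the transaction and acquiring locks.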

In general the logic seems suspect.  It looks like the intent is to move all 
existing files into a delta_x_x/ when a plain table is converted to an MM 
table.  This seems like something that needs to be done under an Exclusive lock 
to prevent concurrent Insert operations from writing data under the 
table/partition root.  But this is too late to acquire locks, which should be 
done from Driver.acquireLocks()  (or else we need a deadlock detector, since 
acquiring them here would break the all-or-nothing lock acquisition semantics 
currently required w/o a deadlock detector).





[jira] [Created] (HIVE-17646) MetaStoreUtils.isToInsertOnlyTable(Map props) is not needed

2017-09-28 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17646:
-

 Summary: MetaStoreUtils.isToInsertOnlyTable(Map 
props) is not needed
 Key: HIVE-17646
 URL: https://issues.apache.org/jira/browse/HIVE-17646
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Reporter: Eugene Koifman


TransactionValidationListener is where all the logic to verify
"transactional" & "transactional_properties" should be





[jira] [Created] (HIVE-17645) This conflicts with HIVE-17482 (Spark/Acid integration)

2017-09-28 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-17645:
-

 Summary: This conflicts with HIVE-17482 (Spark/Acid integration)
 Key: HIVE-17645
 URL: https://issues.apache.org/jira/browse/HIVE-17645
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Affects Versions: 3.0.0
Reporter: Eugene Koifman


MM code introduces 
{noformat}
HiveTxnManager txnManager = SessionState.get().getTxnMgr()
{noformat}

in a number of places.  HIVE-17482 adds a mode where a TransactionManager not 
associated with the session should be used.  This will need to be addressed.





[jira] [Created] (HIVE-17644) directSQL errors out on key constraints until the DB is initialized

2017-09-28 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17644:
---

 Summary: directSQL errors out on key constraints until the DB is 
initialized
 Key: HIVE-17644
 URL: https://issues.apache.org/jira/browse/HIVE-17644
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


{noformat}
2017-09-28T17:22:32,370  WARN [pool-6-thread-16] metastore.MetaStoreDirectSql: 
Failed to execute [SELECT "DBS"."NAME", "TBLS"."TBL_NAME", 
"COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", 
"KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY"  
from "TBLS"  INNER  join "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = 
"KEY_CONSTRAINTS"."PARENT_TBL_ID"  INNER join "DBS" ON "TBLS"."DB_ID" = 
"DBS"."DB_ID"  INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = 
"KEY_CONSTRAINTS"."PARENT_CD_ID" AND  "COLUMNS_V2"."INTEGER_IDX" = 
"KEY_CONSTRAINTS"."PARENT_INTEGER_IDX"  WHERE 
"KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND 
"TBLS"."TBL_NAME" = ?] with parameters 
[concatenatetable_org_apache_hadoop_hive_ql_parse_testreplicationscenarios_1506644534106,
 unptned]
javax.jdo.JDODataStoreException: Error executing SQL query "SELECT 
"DBS"."NAME", "TBLS"."TBL_NAME", 
"COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", 
"KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY"  
from "TBLS"  INNER  join "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = 
"KEY_CONSTRAINTS"."PARENT_TBL_ID"  INNER join "DBS" ON "TBLS"."DB_ID" = 
"DBS"."DB_ID"  INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = 
"KEY_CONSTRAINTS"."PARENT_CD_ID" AND  "COLUMNS_V2"."INTEGER_IDX" = 
"KEY_CONSTRAINTS"."PARENT_INTEGER_IDX"  WHERE 
"KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND 
"TBLS"."TBL_NAME" = ?".
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
 ~[datanucleus-api-jdo-4.2.4.jar:?]
at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) 
~[datanucleus-api-jdo-4.2.4.jar:?]
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) 
~[datanucleus-api-jdo-4.2.4.jar:?]
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1922)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPrimaryKeys(MetaStoreDirectSql.java:2105)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$12.getSqlResult(ObjectStore.java:8891)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$12.getSqlResult(ObjectStore.java:8887)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3060)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeysInternal(ObjectStore.java:8899)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeys(ObjectStore.java:8875)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_45]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_45]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_45]
at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_45]
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) 
[hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy31.getPrimaryKeys(Unknown Source) [?:?]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:7252)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_45]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_45]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_45]
at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_45]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
 [hive-metastore-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy33.get_primary_keys(Unknown Source) [?:?]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_primary_keys.getResult(ThriftHiveMetastore.java:13596)
 [hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
at 
org.

[jira] [Created] (HIVE-17643) recent WM changes broke reopen due to spurious overloads

2017-09-28 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17643:
---

 Summary: recent WM changes broke reopen due to spurious overloads
 Key: HIVE-17643
 URL: https://issues.apache.org/jira/browse/HIVE-17643
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Created] (HIVE-17642) union test OOMs

2017-09-28 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17642:
---

 Summary: union test OOMs
 Key: HIVE-17642
 URL: https://issues.apache.org/jira/browse/HIVE-17642
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


Noticed this while investigating some spurious failures due to a combination of 
errors.
The reason unionDistinct (or whatever) test fails sporadically is OOM:
{noformat}
2017-09-28T01:07:25,804 ERROR [3d4e3f44-40c5-431a-b3de-801d60c1c579 main] 
ql.Driver: FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 61, 
vertexId=vertex_1506585924598_0001_53_01, diagnostics=[Vertex 
vertex_1506585924598_0001_53_01 [Map 61] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: src initializer failed, 
vertex=vertex_1506585924598_0001_53_01 [Map 61], java.lang.OutOfMemoryError: GC 
overhead limit exceeded
at 
java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1019)
at 
java.util.concurrent.ConcurrentHashMap.putAll(ConcurrentHashMap.java:1084)
at 
java.util.concurrent.ConcurrentHashMap.<init>(ConcurrentHashMap.java:852)
at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:721)
at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:442)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:591)
at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:196)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
]Invalid event V_INTERNAL_ERROR on Vertex vertex_1506585924598_0001_53_00 [Map 
60]
2017-09-28T01:07:25,804 DEBUG [3d4e3f44-40c5-431a-b3de-801d60c1c579 main] 
ql.Driver: Shutting down query 


SELECT count(1) FROM (
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT

  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT

  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT

  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT

  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src UNION DISTINCT
  SELECT key, value FROM src) src
{noformat}

Perhaps we should make it smaller.






Review Request 62680: HIVE-17637: Move WorkloadManager to service module

2017-09-28 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62680/
---

Review request for hive and Sergey Shelukhin.


Bugs: HIVE-17637
https://issues.apache.org/jira/browse/HIVE-17637


Repository: hive-git


Description
---

HIVE-17637: Move WorkloadManager to service module


Diffs
-

  common/src/java/org/apache/hive/wm/ISessionPoolManager.java PRE-CREATION 
  common/src/java/org/apache/hive/wm/ISessionRestart.java PRE-CREATION 
  common/src/java/org/apache/hive/wm/IWorkloadManager.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SessionExpirationTracker.java 
da93a3a791ee952db5fd44276744ac8cf4ed234e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java 
538d7454b70aa973fdacd7197d1c9cea92dc99db 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolSession.java 
4488c12eb9c432c173580bb75df3113818c93113 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 
29d6fe65833a019a383cb50299124f6480e520b8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java 
00501eef93cda5b386d1db3429aec5050ab6fffc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java 
288d705f8373fb44c6b9c2d28701628dab5e8a71 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/SampleTezSessionState.java 
59efd43be674879bcb19ddae9f0ef38e0d8ef4e0 
  ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestWorkloadManager.java 
7adf895077ccb199c8450fe6d07af70c5435b5e8 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 
5cb973ca9563990b66f8516c4b296fd3873e9875 
  service/src/java/org/apache/hive/service/wm/tez/WorkloadManager.java 
PRE-CREATION 


Diff: https://reviews.apache.org/r/62680/diff/1/


Testing
---


Thanks,

Prasanth_J



[jira] [Created] (HIVE-17641) Visibility issue of Task.done cause Driver skip stages in parallel execution

2017-09-28 Thread Zhiyuan Yang (JIRA)
Zhiyuan Yang created HIVE-17641:
---

 Summary: Visibility issue of Task.done cause Driver skip stages in 
parallel execution
 Key: HIVE-17641
 URL: https://issues.apache.org/jira/browse/HIVE-17641
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Zhiyuan Yang
Assignee: Zhiyuan Yang


Task.done is not volatile. In parallel execution, the TaskRunner thread sets 
this value, and the Driver thread reads it when it determines whether a 
child task is runnable.

DriverContext.java
{code}
public static boolean isLaunchable(Task tsk) {
  return !tsk.getQueued() && !tsk.getInitialized() && tsk.isRunnable();
{code}
Task.java
{code}
public boolean isRunnable() {
  boolean isrunnable = true;
  if (parentTasks != null) {
    for (Task parent : parentTasks) {
      if (!parent.done()) {
{code}

This happens without any synchronization, so a child can be seen as not 
runnable even after all of its parents have finished.

To make it worse, the Driver considers the query successful when there is no 
running or runnable task, so a query may finish without executing some stages.
Driver.java
{code}
while (!destroyed && driverCxt.isRunning()) {
{code}
DriverContext.java
{code}
public synchronized boolean isRunning() {
  return !shutdown && (!running.isEmpty() || !runnable.isEmpty());
{code}
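
One possible fix for the visibility problem described above is to declare the 
flag volatile. The sketch below is a simplified illustration, not the actual 
Task or Driver code: SketchTask and VisibilityDemo are hypothetical names, and 
only the cross-thread visibility of the flag is modeled.

```java
/** Hypothetical, simplified task; only the visibility of the flag is modeled. */
final class SketchTask {
  // volatile makes a write by the runner thread visible to later reads
  // by the driver thread; a plain boolean gives no such guarantee.
  private volatile boolean done = false;

  void setDone() { done = true; }

  boolean done() { return done; }
}

final class VisibilityDemo {
  private VisibilityDemo() {}

  /** Marks the task done on another thread and polls the flag from the caller. */
  static boolean runAndObserve() {
    SketchTask parent = new SketchTask();
    Thread runner = new Thread(parent::setDone, "task-runner");
    runner.start();
    while (!parent.done()) {
      Thread.yield(); // the loop ends because the volatile read sees the write
    }
    try {
      runner.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return parent.done();
  }
}
```

An AtomicBoolean or synchronized accessors would give the same visibility 
guarantee; volatile is just the smallest change for a single flag.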





[jira] [Created] (HIVE-17640) Comparison of date return null if only time part is provided in string.

2017-09-28 Thread Yongzhi Chen (JIRA)
Yongzhi Chen created HIVE-17640:
---

 Summary: Comparison of date return null if only time part is 
provided in string.
 Key: HIVE-17640
 URL: https://issues.apache.org/jira/browse/HIVE-17640
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen
 Fix For: 2.1.0


Reproduce:
select '2017-01-01 00:00:00' < current_date;
INFO  : OK
...
1 row selected (18.324 seconds)
...
 NULL





[jira] [Created] (HIVE-17639) don't reuse planner context when re-parsing the query

2017-09-28 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-17639:
---

 Summary: don't reuse planner context when re-parsing the query
 Key: HIVE-17639
 URL: https://issues.apache.org/jira/browse/HIVE-17639
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Created] (HIVE-17638) SparkDynamicPartitionPruner loads all partition metadata into memory

2017-09-28 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-17638:
---

 Summary: SparkDynamicPartitionPruner loads all partition metadata 
into memory
 Key: HIVE-17638
 URL: https://issues.apache.org/jira/browse/HIVE-17638
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Sahil Takiar


The {{SparkDynamicPartitionPruner}} first loads the contents of each partition 
pruning file into memory, and then prunes all the partitions from the 
{{MapWork}}. This can cause increased memory pressure on the HoS Remote Driver 
because it requires loading all the partition metadata into memory. It would be 
more efficient if pruning of partitions was done while scanning the files, so 
that all the partition metadata doesn't need to be buffered in memory.





[jira] [Created] (HIVE-17637) Move WorkloadManager to service module

2017-09-28 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-17637:


 Summary: Move WorkloadManager to service module
 Key: HIVE-17637
 URL: https://issues.apache.org/jira/browse/HIVE-17637
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


There are some rules that have to be applied at the HS2 service level 
(connections per user, elapsed time, etc.). WM is in the ql module, which makes 
it difficult to interact with the service module.





[jira] [Created] (HIVE-17636) Add multiple_agg.q test for blobstores

2017-09-28 Thread Ran Gu (JIRA)
Ran Gu created HIVE-17636:
-

 Summary: Add multiple_agg.q test for blobstores
 Key: HIVE-17636
 URL: https://issues.apache.org/jira/browse/HIVE-17636
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Ran Gu
Assignee: Ran Gu
 Fix For: 3.0.0, 2.4.0


This patch introduces multiple_agg.q regression tests into the hive-blobstore 
qtest module.





[jira] [Created] (HIVE-17635) Add unit tests to CompactionTxnHandler and use PreparedStatements for queries

2017-09-28 Thread Andrew Sherman (JIRA)
Andrew Sherman created HIVE-17635:
-

 Summary: Add unit tests to CompactionTxnHandler and use 
PreparedStatements for queries
 Key: HIVE-17635
 URL: https://issues.apache.org/jira/browse/HIVE-17635
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Andrew Sherman
Assignee: Andrew Sherman


It is better for JDBC code that runs against the HMS database to use 
PreparedStatements. Convert the CompactionTxnHandler queries to use 
PreparedStatement, add tests to TestCompactionTxnHandler that cover these 
queries, and improve code coverage.
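
As a rough illustration of the conversion described above, the sketch below 
binds values through placeholders instead of concatenating them into the SQL 
text. The class name, the query, and the table and column names are 
assumptions made for this example; they are not taken from 
CompactionTxnHandler.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper; the query and its table/column names are illustrative only. */
final class CompactionQuerySketch {
  private CompactionQuerySketch() {}

  // Placeholders keep caller-supplied names out of the SQL text.
  static final String COUNT_SQL =
      "SELECT COUNT(*) FROM COMPACTION_QUEUE WHERE CQ_DATABASE = ? AND CQ_TABLE = ?";

  /** Binds both parameters by position (JDBC parameter indexes start at 1). */
  static void bind(PreparedStatement ps, String dbName, String tableName)
      throws SQLException {
    ps.setString(1, dbName);
    ps.setString(2, tableName);
  }

  /** Runs the count query; the statement and result set are closed automatically. */
  static long countQueued(Connection conn, String dbName, String tableName)
      throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(COUNT_SQL)) {
      bind(ps, dbName, tableName);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getLong(1) : 0L;
      }
    }
  }
}
```

Keeping the binding in its own method makes it easy to unit test without a 
live database, for example with a stub PreparedStatement.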





[jira] [Created] (HIVE-17634) Use properties from HiveConf in RelOptHiveTable#updateColStats

2017-09-28 Thread liyunzhang_intel (JIRA)
liyunzhang_intel created HIVE-17634:
---

 Summary: Use properties from HiveConf in 
RelOptHiveTable#updateColStats
 Key: HIVE-17634
 URL: https://issues.apache.org/jira/browse/HIVE-17634
 Project: Hive
  Issue Type: Bug
Reporter: liyunzhang_intel


In 
[RelOptHiveTable#updateColStats|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L309],
 we set {{fetchColStats}} and {{fetchPartStats}} to true when calling 
{{StatsUtils.collectStatistics}}:
{code}
if (!hiveTblMetadata.isPartitioned()) {
  // 2.1 Handle the case for unpartitioned table.
  try {
    Statistics stats = StatsUtils.collectStatistics(hiveConf, null,
        hiveTblMetadata, hiveNonPartitionCols, nonPartColNamesThatRqrStats,
        colStatsCached, nonPartColNamesThatRqrStats, true, true);
    ...
{code}

This causes column statistics to be queried from the metastore even when we set 
{{hive.stats.fetch.column.stats}} and {{hive.stats.fetch.partition.stats}} to 
false in HiveConf.  If we set these two properties to false, we should not 
fetch any column statistics from the metastore.  I suggest taking the values 
from HiveConf instead of hard-coding true.
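
A minimal sketch of the suggestion, assuming the two flags are read from 
configuration and then passed along in place of the hard-coded true values. 
java.util.Properties stands in for HiveConf here, StatsFetchFlags is a 
hypothetical helper, and the false defaults are placeholders rather than 
Hive's actual defaults.

```java
import java.util.Properties;

/** Hypothetical holder for the two flags; Properties stands in for HiveConf. */
final class StatsFetchFlags {
  final boolean fetchColStats;
  final boolean fetchPartStats;

  private StatsFetchFlags(boolean fetchColStats, boolean fetchPartStats) {
    this.fetchColStats = fetchColStats;
    this.fetchPartStats = fetchPartStats;
  }

  /** Reads both flags from configuration; the "false" defaults are assumptions. */
  static StatsFetchFlags fromConf(Properties conf) {
    return new StatsFetchFlags(
        Boolean.parseBoolean(conf.getProperty("hive.stats.fetch.column.stats", "false")),
        Boolean.parseBoolean(conf.getProperty("hive.stats.fetch.partition.stats", "false")));
  }
}
```

The call site would then pass flags.fetchColStats and flags.fetchPartStats as 
the last two arguments instead of true, true.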





Re: Review Request 62442: HIVE-17569: Compare filtered output files in BeeLine tests

2017-09-28 Thread Peter Vary

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62442/#review186541
---


Ship it!




Ship It!

- Peter Vary


On Sept. 27, 2017, 3:11 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62442/
> ---
> 
> (Updated Sept. 27, 2017, 3:11 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-17569
> https://issues.apache.org/jira/browse/HIVE-17569
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Introduce a new property "test.beeline.compare.portable" with the default 
> value false and if this property is set to true, the result of the commands 
> "EXPLAIN", "DESCRIBE EXTENDED" and "DESCRIBE FORMATTED" will be filtered out 
> from the out files before comparing them in BeeLine tests.
> 
> 
> Diffs
> -
> 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreBeeLineDriver.java
>  9dfc253 
>   itests/util/src/main/java/org/apache/hive/beeline/QFile.java e70ac38 
> 
> 
> Diff: https://reviews.apache.org/r/62442/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



[jira] [Created] (HIVE-17633) Make it possible to override the query results directory in TestBeeLineDriver

2017-09-28 Thread Peter Vary (JIRA)
Peter Vary created HIVE-17633:
-

 Summary: Make it possible to override the query results directory 
in TestBeeLineDriver
 Key: HIVE-17633
 URL: https://issues.apache.org/jira/browse/HIVE-17633
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 3.0.0
Reporter: Peter Vary
Assignee: Peter Vary


It would be good to have the possibility to override where the 
TestBeeLineDriver looks for the golden files





[jira] [Created] (HIVE-17632) Build Hive with JDK9

2017-09-28 Thread Jerry Chen (JIRA)
Jerry Chen created HIVE-17632:
-

 Summary: Build Hive with JDK9
 Key: HIVE-17632
 URL: https://issues.apache.org/jira/browse/HIVE-17632
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 3.0.0
Reporter: Jerry Chen


JDK 9 was released recently with a lot of improvements, such as support for 
AVX-512, which can bring performance benefits on Skylake servers.
We expect that users will soon try JDK 9 and build Hadoop on it. Currently it's 
not clear what issues users will hit when building Hive on JDK 9. This JIRA can 
serve as the umbrella JIRA to track all these issues.

http://jdk.java.net/9/



