[jira] [Created] (HIVE-24742) Support router path or view fs path in Hive table location

2021-02-04 Thread Aihua Xu (Jira)
Aihua Xu created HIVE-24742:
---

 Summary: Support router path or view fs path in Hive table location
 Key: HIVE-24742
 URL: https://issues.apache.org/jira/browse/HIVE-24742
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.1.2
Reporter: Aihua Xu
Assignee: Aihua Xu


In 
[FileUtils.java|https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L747],
 the equalsFileSystem function checks the base URI to determine whether the source and 
destination are on the same cluster, and then decides whether to copy or move the data. That 
check does not work for viewfs or router-based file systems, since viewfs://ns-default/a 
and viewfs://ns-default/b may resolve to different physical clusters.

The HDFS FileSystem API provides a resolvePath() function that resolves a path to its 
physical location. We can support viewfs and router through that function.
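To illustrate the problem with a base-URI comparison: the following minimal stand-in for the equalsFileSystem check (a simplification using only java.net.URI, not the real FileUtils code) treats the two viewfs paths above as the same file system even though they may map to different physical clusters.

```java
import java.net.URI;
import java.util.Objects;

class FsCompare {
    // Simplified stand-in for FileUtils.equalsFileSystem: two paths are
    // considered to be on the same file system when their scheme and
    // authority match. This is exactly the check that viewfs defeats,
    // because one viewfs authority can mount several physical clusters.
    static boolean sameBaseFileSystem(URI a, URI b) {
        return Objects.equals(a.getScheme(), b.getScheme())
            && Objects.equals(a.getAuthority(), b.getAuthority());
    }
}
```

Under this check, viewfs://ns-default/a and viewfs://ns-default/b compare equal, which is why resolving to the physical path first is needed.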





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24171) Support HDFS reads from observer NameNodes

2020-09-15 Thread Aihua Xu (Jira)
Aihua Xu created HIVE-24171:
---

 Summary: Support HDFS reads from observer NameNodes
 Key: HIVE-24171
 URL: https://issues.apache.org/jira/browse/HIVE-24171
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


HDFS-12943 introduces consistent reads from observer NameNodes, which can 
boost read performance and reduce the load on active NameNodes.

To take advantage of this feature, clients are required to make an msync() 
call after writing files or before reading them, since observer 
NameNodes may serve stale data for a small window. 

Hive needs to make msync() calls to HDFS in several places, e.g., 1) after 
generating the plan files (map.xml and reduce.xml) so they can be read later 
by executors; 2) after intermediate files are generated so they can be 
consumed by later stages or by HS2. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-21122) Support Yarn resource profile in Hive

2019-01-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-21122:
---

 Summary: Support Yarn resource profile in Hive
 Key: HIVE-21122
 URL: https://issues.apache.org/jira/browse/HIVE-21122
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu


Resource profiles are a new feature in YARN 3.1.0 (see YARN-3926). They 
allow YARN to allocate other resources such as GPU/FPGA, in addition to 
memory and vcores. This would be a useful feature to support in Hive. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20861) Pass queryId as the client CallerContext to Spark

2018-11-02 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20861:
---

 Summary: Pass queryId as the client CallerContext to Spark 
 Key: HIVE-20861
 URL: https://issues.apache.org/jira/browse/HIVE-20861
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu


SPARK-16759 exposes a way for the client to pass a CallerContext such 
as the queryId. For better debuggability, Hive should pass the queryId to Spark.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20745) qtest-druid build is failing

2018-10-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20745:
---

 Summary: qtest-druid build is failing
 Key: HIVE-20745
 URL: https://issues.apache.org/jira/browse/HIVE-20745
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 4.0.0
Reporter: Aihua Xu


The qtest-druid build throws the following errors. It seems we are missing the 
avro dependency in pom.xml.

{noformat}
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[9,31]
 package org.apache.avro.message does not exist
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[10,31]
 package org.apache.avro.message does not exist
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[11,31]
 package org.apache.avro.message does not exist
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[22,24]
 cannot find symbol
[ERROR]   symbol:   class BinaryMessageEncoder
[ERROR]   location: class org.apache.hive.kafka.Wikipedia
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[25,24]
 cannot find symbol
[ERROR]   symbol:   class BinaryMessageDecoder
[ERROR]   location: class org.apache.hive.kafka.Wikipedia
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[31,17]
 cannot find symbol
[ERROR]   symbol:   class BinaryMessageDecoder
[ERROR]   location: class org.apache.hive.kafka.Wikipedia
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[39,63]
 cannot find symbol
[ERROR]   symbol:   class SchemaStore
[ERROR]   location: class org.apache.hive.kafka.Wikipedia
[ERROR] 
/Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[39,17]
 cannot find symbol
[ERROR]   symbol:   class BinaryMessageDecoder
[ERROR]   location: class org.apache.hive.kafka.Wikipedia
[ERROR] -> [Help 1]
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20345) Drop database may hang by the change in HIVE-11258

2018-08-08 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20345:
---

 Summary: Drop database may hang by the change in HIVE-11258
 Key: HIVE-20345
 URL: https://issues.apache.org/jira/browse/HIVE-20345
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 2.0.0, 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu


In the drop_database_core function of HiveMetaStore.java, HIVE-11258 
incorrectly updates startIndex from endIndex inside the {{if (tables != null && 
!tables.isEmpty())}} statement. If the tables get deleted before the 
getTableObjectsByName() call, the returned table list is empty and startIndex 
never gets updated, so the loop can spin forever.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20331) Query with union all, lateral view and Join fails with "cannot find parent in the child operator"

2018-08-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20331:
---

 Summary: Query with union all, lateral view and Join fails with 
"cannot find parent in the child operator"
 Key: HIVE-20331
 URL: https://issues.apache.org/jira/browse/HIVE-20331
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


The following query with UNION ALL, LATERAL VIEW and JOIN fails during 
execution with the exception below.
{noformat}
create table t1(col1 int);
SELECT 1 AS `col1`
FROM t1
UNION ALL
  SELECT 2 AS `col1`
  FROM
(SELECT col1
 FROM t1
) x1
JOIN
  (SELECT col1
  FROM
(SELECT 
  Row_Number() over (PARTITION BY col1 ORDER BY col1) AS `col1`
FROM t1
) x2 lateral VIEW explode(map(10,1))`mapObj` AS `col2`, `col3`
  ) `expdObj`  
{noformat}

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive internal 
error: cannot find parent in the child operator!
at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:509)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
{noformat}

After debugging, it seems the issue is in the GenMRFileSink1 class, where we 
set an incorrect aliasToWork on the MapWork.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20079) Populate more accurate rawDataSize for parquet format

2018-07-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20079:
---

 Summary: Populate more accurate rawDataSize for parquet format
 Key: HIVE-20079
 URL: https://issues.apache.org/jira/browse/HIVE-20079
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Run the following queries and you will see that rawDataSize for the table is 
incorrectly reported as 4 (which is just the number of fields). We need to 
populate the correct data size so the data can be split properly.
{noformat}
SET hive.stats.autogather=true;
CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
DESC FORMATTED parquet_stats;
{noformat}

{noformat}
Table Parameters:
COLUMN_STATS_ACCURATE   true
numFiles1
numRows 2
rawDataSize 4
totalSize   373
transient_lastDdlTime   1530660523
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20053) Separate Hive Security from SessionState

2018-07-02 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20053:
---

 Summary: Separate Hive Security from SessionState
 Key: HIVE-20053
 URL: https://issues.apache.org/jira/browse/HIVE-20053
 Project: Hive
  Issue Type: Improvement
  Components: Security
Affects Versions: 3.0.0
Reporter: Aihua Xu


Right now the Hive security classes are associated with SessionState. When 
HiveServer2 starts, the service session initializes them, and later each 
session has to reinitialize them. Since this security configuration is at the 
service level, we should move the security info out of SessionState and make it 
a singleton so it is initialized only once.

Also, since SessionState.setupAuth(), which sets up authentication and 
authorization, is not synchronized, we can run into concurrency issues if 
queries or metastore operations run within the same session. 
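The "initialize once at the service level" part can be done with the initialization-on-demand holder idiom, which the JVM guarantees to be thread-safe without explicit synchronization. A minimal sketch (HiveSecurityContext and its contents are hypothetical placeholder names, not Hive's actual classes):

```java
// Hypothetical service-level security holder; the class and method names
// are placeholders, not Hive's actual API.
class HiveSecurityContext {
    private static int initCount = 0;  // counts constructor runs, for the demo

    private HiveSecurityContext() {
        initCount++;                   // runs exactly once per JVM
        // ... set up the authenticator and authorizer here ...
    }

    // The JVM initializes Holder (and thus INSTANCE) lazily and atomically
    // on first access, so no synchronized block is needed.
    private static class Holder {
        static final HiveSecurityContext INSTANCE = new HiveSecurityContext();
    }

    static HiveSecurityContext get() {
        return Holder.INSTANCE;
    }

    static int initializations() {
        return initCount;
    }
}
```

Every session then shares one instance via HiveSecurityContext.get(), instead of reinitializing per session.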



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20037) Print root cause exception's toString() rather than getMessage()

2018-06-29 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20037:
---

 Summary: Print root cause exception's toString() rather than 
getMessage()
 Key: HIVE-20037
 URL: https://issues.apache.org/jira/browse/HIVE-20037
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


When we run a Hive-on-Spark job and it fails, we print the exception's 
getMessage() rather than its toString(). For some exceptions, e.g., the 
java.lang.NoClassDefFoundError below, this loses the exception type 
information. 

{noformat}
Failed to execute Spark task Stage-1, with exception 
'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client 
for Spark session cf054497-b073-4327-a315-68c867ce3434: 
org/apache/spark/SparkConf)'
{noformat}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20027) TestRuntimeStats.testCleanup is flaky

2018-06-28 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-20027:
---

 Summary: TestRuntimeStats.testCleanup is flaky
 Key: HIVE-20027
 URL: https://issues.apache.org/jira/browse/HIVE-20027
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu


{noformat}
int deleted = objStore.deleteRuntimeStats(1);
assertEquals(1, deleted);
{noformat}

testCleanup can fail if a GC pause occurs before the deleteRuntimeStats() 
call, in which case 2 stats get deleted rather than one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string

2018-06-19 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19948:
---

 Summary: HiveCli is not splitting the command by semicolon 
properly if quotes are inside the string 
 Key: HIVE-19948
 URL: https://issues.apache.org/jira/browse/HIVE-19948
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 2.2.0
Reporter: Aihua Xu


HIVE-15297 tries to split the command by taking semicolons inside strings into 
account, but it doesn't consider the case where quote characters can also appear 
inside a string. 

The following command {{insert into escape1 partition (ds='1', part='3') 
values ("abc' ");}} fails with 
{noformat}
18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 extraneous 
input ';' expecting EOF near ''
org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input ';' 
expecting EOF near ''
at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}
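The failure above happens because the splitter sees the single quote inside the double-quoted string as opening a new string. A quote-aware splitter needs to remember which quote character opened the current string and split only on semicolons seen outside any string. A minimal sketch of that approach (an illustration, not the patch for this issue; backslash escapes are ignored for brevity):

```java
import java.util.ArrayList;
import java.util.List;

class CommandSplitter {
    // Split a command line on semicolons that appear outside of single- or
    // double-quoted strings. A quote of the other kind inside a string is
    // treated as a literal character, which is the case HIVE-15297 missed.
    static List<String> split(String line) {
        List<String> commands = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0;                      // 0 = not inside a string
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;               // closing quote of current string
                }
            } else if (c == '\'' || c == '"') {
                quote = c;                   // opening quote
            } else if (c == ';') {
                commands.add(current.toString());
                current.setLength(0);
                continue;                    // drop the separator itself
            }
            current.append(c);
        }
        if (current.length() > 0) {
            commands.add(current.toString());
        }
        return commands;
    }
}
```

With this state machine, the insert statement above is kept as one command despite the stray single quote inside the double-quoted value.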




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works

2018-06-18 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19936:
---

 Summary: explain on a query failing in secure cluster whereas 
query itself works
 Key: HIVE-19936
 URL: https://issues.apache.org/jira/browse/HIVE-19936
 Project: Hive
  Issue Type: Bug
  Components: Hooks
Reporter: Aihua Xu


On a secured cluster with Sentry integrated, run the following queries:

{noformat}
create table foobar (id int) partitioned by (val int);
explain alter table foobar add partition (val=50);
{noformat}

The explain query fails with the following error, while the query itself 
works with no issue.

{noformat}
Error while compiling statement: FAILED: SemanticException No valid privileges
Required privilege( Table) not available in output privileges
The required privileges: (state=42000,code=4)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19899) Support stored as JsonFile

2018-06-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19899:
---

 Summary: Support stored as JsonFile 
 Key: HIVE-19899
 URL: https://issues.apache.org/jira/browse/HIVE-19899
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


This is to add "stored as jsonfile" support for the JSON file format.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19835) Flaky test: TestWorkloadManager.testAsyncSessionInitFailures

2018-06-08 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19835:
---

 Summary: Flaky test: 
TestWorkloadManager.testAsyncSessionInitFailures
 Key: HIVE-19835
 URL: https://issues.apache.org/jira/browse/HIVE-19835
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Affects Versions: 4.0.0
Reporter: Aihua Xu


This test sometimes fails with the following error; it appears to be flaky.

{noformat}
Error Message
expected:<0> but was:<1>
Stacktrace
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAsyncSessionInitFailures(TestWorkloadManager.java:1138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19747) "GRANT ALL TO USER" failed with NullPointerException

2018-05-31 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19747:
---

 Summary: "GRANT ALL TO USER" failed with NullPointerException
 Key: HIVE-19747
 URL: https://issues.apache.org/jira/browse/HIVE-19747
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Affects Versions: 2.1.0
Reporter: Aihua Xu


If you issue the command 'grant all to user abc', you will see the following 
NullPointerException. It seems the type in hivePrivObject is not initialized.

{noformat}
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:265)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.checkRequiredPrivileges(GrantPrivAuthUtils.java:64)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.authorize(GrantPrivAuthUtils.java:50)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.grantPrivileges(SQLStdHiveAccessController.java:179)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.grantPrivileges(SQLStdHiveAccessControllerWrapper.java:70)
at 
org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.grantPrivileges(HiveAuthorizerImpl.java:48)
at 
org.apache.hadoop.hive.ql.exec.DDLTask.grantOrRevokePrivileges(DDLTask.java:1123
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19496) Check untar folder

2018-05-10 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19496:
---

 Summary: Check untar folder
 Key: HIVE-19496
 URL: https://issues.apache.org/jira/browse/HIVE-19496
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Aihua Xu
Assignee: Aihua Xu


We need to check the untar folder.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19328) Some error messages like "table not found" are printing to STDERR

2018-04-26 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19328:
---

 Summary: Some error messages like "table not found" are printing 
to STDERR
 Key: HIVE-19328
 URL: https://issues.apache.org/jira/browse/HIVE-19328
 Project: Hive
  Issue Type: Sub-task
  Components: Logging
Affects Versions: 3.0.0
Reporter: Aihua Xu


In the Driver class, we print exceptions to the log file and to the 
console through LogHelper: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L730

I can see the following exception in stderr:

{noformat}
FAILED: SemanticException [Error 10001]: Table not found default.sample_07
{noformat}

If it's from HiveCli, printing to the console makes sense; but if it's 
Beeline talking to HS2, such a log should go to the HS2 log and the Beeline console. 
So we should differentiate these two scenarios.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19320) MapRedLocalTask is printing child log to stderr and stdout

2018-04-26 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19320:
---

 Summary: MapRedLocalTask is printing child log to stderr and stdout
 Key: HIVE-19320
 URL: https://issues.apache.org/jira/browse/HIVE-19320
 Project: Hive
  Issue Type: Sub-task
  Components: Logging
Affects Versions: 3.0.0
Reporter: Aihua Xu


At the line below, the local child MR task prints its logs to stderr and stdout: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java#L341

stderr/stdout should capture the service's own runtime log rather than the query 
execution output; the query output should instead go to the HS2 log and be 
propagated to the Beeline console. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19318) Improve Hive logging

2018-04-26 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19318:
---

 Summary: Improve Hive logging
 Key: HIVE-19318
 URL: https://issues.apache.org/jira/browse/HIVE-19318
 Project: Hive
  Issue Type: Improvement
  Components: Logging
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Use this JIRA to track some potential improvements to Hive logging. I have 
noticed that some log entries have an incorrect log level, or do not show up in 
the correct places, e.g., printing to STDERR/STDOUT rather than the HS2 log 
file. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19223) Migrate negative test cases to use hive.cli.errors.ignore

2018-04-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19223:
---

 Summary: Migrate negative test cases to use hive.cli.errors.ignore 
 Key: HIVE-19223
 URL: https://issues.apache.org/jira/browse/HIVE-19223
 Project: Hive
  Issue Type: Improvement
  Components: Test
Affects Versions: 3.0.0
Reporter: Aihua Xu


Migrate the negative test cases to use the hive.cli.errors.ignore property so 
that multiple negative tests can be grouped together. That will save test 
resources and execution time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19222) TestNegativeCliDriver tests are failing due to "java.lang.OutOfMemoryError: GC overhead limit exceeded"

2018-04-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19222:
---

 Summary: TestNegativeCliDriver tests are failing due to 
"java.lang.OutOfMemoryError: GC overhead limit exceeded"
 Key: HIVE-19222
 URL: https://issues.apache.org/jira/browse/HIVE-19222
 Project: Hive
  Issue Type: Sub-task
Reporter: Aihua Xu


TestNegativeCliDriver tests have been failing with OOM recently; the cause is 
unclear. I will try increasing the memory to test it out.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail

2018-04-13 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19204:
---

 Summary: Detailed errors from some tasks are not displayed to the 
client because the tasks don't set exception when they fail
 Key: HIVE-19204
 URL: https://issues.apache.org/jira/browse/HIVE-19204
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


In TaskRunner.java, if a task has its exception set, the task result will 
carry that exception, and Driver.java will pick up the details and display them 
to the client. But some tasks don't set the exception when they fail, so the 
client won't see the details unless they check the HS2 log.
  
{noformat}
  public void runSequential() {
int exitVal = -101;
try {
  exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
} catch (Throwable t) {
  if (tsk.getException() == null) {
tsk.setException(t);
  }
  LOG.error("Error in executeTask", t);
}
result.setExitVal(exitVal);
if (tsk.getException() != null) {
  result.setTaskError(tsk.getException());
}
  }
 {noformat}
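The fix is for each failing task to record its error before returning a non-zero exit code, so the logic above can propagate it. A minimal self-contained sketch of that contract (Task, TaskResult, and CopyTask here are simplified stand-ins for Hive's actual classes, and the failure is simulated):

```java
// Simplified stand-ins for Hive's task classes, for illustration only.
class TaskResult {
    private int exitVal;
    private Throwable taskError;
    void setExitVal(int v) { exitVal = v; }
    int getExitVal() { return exitVal; }
    void setTaskError(Throwable t) { taskError = t; }
    Throwable getTaskError() { return taskError; }
}

abstract class Task {
    private Throwable exception;
    Throwable getException() { return exception; }
    void setException(Throwable t) { exception = t; }
    abstract int executeTask();
}

// A task that records its failure cause instead of only returning an error code.
class CopyTask extends Task {
    @Override
    int executeTask() {
        try {
            throw new RuntimeException("destination is not writable"); // simulated failure
        } catch (Exception e) {
            setException(e); // without this line the client only sees the exit code
            return 1;
        }
    }
}
```

With the exception recorded, runSequential() copies it into the TaskResult and the client sees the root cause instead of a bare exit code.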



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19040) get_partitions_by_expr() implementation in HiveMetaStore causes backward incompatibility easily

2018-03-23 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19040:
---

 Summary: get_partitions_by_expr() implementation  in HiveMetaStore 
causes backward incompatibility easily
 Key: HIVE-19040
 URL: https://issues.apache.org/jira/browse/HIVE-19040
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 2.0.0
Reporter: Aihua Xu


In the HiveMetaStore implementation of {{public PartitionsByExprResult 
get_partitions_by_expr(PartitionsByExprRequest req) throws TException}}, an 
expression is serialized into a byte array on the client side and passed 
through PartitionsByExprRequest. HMS then deserializes it back into the 
expression and filters the partitions with it.

Such a partition filtering expression can contain various UDFs. If one of those 
UDFs changes between Hive versions, an HS2 on the older version will serialize 
the expression in the old format, which cannot be deserialized by an HMS on the 
newer version. One example: the GenericUDFIn class added {{transient}} to the 
field constantInSet, which causes exactly this incompatibility.

One approach I'm thinking of: instead of converting the expression object to a 
byte array, we could pass the expression string directly. 


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2018-03-21 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19018:
---

 Summary: beeline -e now requires semicolon even when used with 
query from command line
 Key: HIVE-19018
 URL: https://issues.apache.org/jira/browse/HIVE-19018
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, the Beeline 
console will wait for you to enter ';'. It's a regression from the old 
behavior. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19010) Improve column stats update

2018-03-21 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-19010:
---

 Summary: Improve column stats update 
 Key: HIVE-19010
 URL: https://issues.apache.org/jira/browse/HIVE-19010
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


I'm seeing that the column stats update can be inefficient. Use the subtasks of 
this JIRA to track the improvements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18986) Table rename will run java.lang.StackOverflowError in dataNucleus if the table contains large number of columns

2018-03-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18986:
---

 Summary: Table rename will run java.lang.StackOverflowError in 
dataNucleus if the table contains large number of columns
 Key: HIVE-18986
 URL: https://issues.apache.org/jira/browse/HIVE-18986
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Aihua Xu
Assignee: Aihua Xu
 Fix For: 3.0.0


If the table contains a lot of columns, e.g., 5k, a simple table rename fails 
with the following stack trace. The issue is that DataNucleus can't handle a query 
with a long chain of colName='c1' && colName='c2' predicates.

 

{noformat}
2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
{noformat}
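One common mitigation for this kind of deep SQLText recursion (an assumption on my part, not necessarily the fix committed for this issue) is to batch the column names, issuing several short queries instead of one query with thousands of predicates. A minimal sketch of the batching helper:

```java
import java.util.ArrayList;
import java.util.List;

class ColumnBatcher {
    // Split a long list of column names into fixed-size batches so each
    // generated metastore query carries at most batchSize predicates,
    // keeping DataNucleus' SQL-text recursion depth bounded.
    static List<List<String>> partition(List<String> colNames, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < colNames.size(); i += batchSize) {
            batches.add(new ArrayList<>(
                colNames.subList(i, Math.min(i + batchSize, colNames.size()))));
        }
        return batches;
    }
}
```

A 5k-column table would then generate ten queries of at most 512 predicates each rather than one query with 5k predicates.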

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18887) Improve preserving column stats for alter table commands

2018-03-06 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18887:
---

 Summary: Improve preserving column stats for alter table commands
 Key: HIVE-18887
 URL: https://issues.apache.org/jira/browse/HIVE-18887
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


We are trying to preserve column stats for certain alter table commands, but 
the current generic approach, which compares the old columns against the 
new columns and updates all of them, may not be efficient. E.g., if we 
just rename the table, we should only need to update the name itself. The COL_STATS 
table currently contains DB_Name and Table_Name columns; if it didn't have these 
columns, certain commands would not need to update it at all. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18550) Keep the hbase table name property as hbase.table.name

2018-01-25 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18550:
---

 Summary: Keep the hbase table name property as hbase.table.name
 Key: HIVE-18550
 URL: https://issues.apache.org/jira/browse/HIVE-18550
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Handler
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


For HBase 2.0 support, I made some changes to the hbase table name property 
in HIVE-18366 and HIVE-18202. After reviewing the logic, it seems the change is 
not necessary, since hbase.table.name is internal to the Hive HBase handler. We just 
need to map hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name 
for HiveHFileOutputFormat. 
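That mapping amounts to copying one key to the other before handing the configuration to the output format. A minimal sketch using plain java.util.Properties (the real code operates on a Hadoop Configuration; only the two property strings come from the text above, the class and method names are placeholders):

```java
import java.util.Properties;

class HBaseTableNameMapper {
    static final String HIVE_KEY = "hbase.table.name";
    static final String HFILE_OUTPUT_KEY =
        "hbase.mapreduce.hfileoutputformat.table.name";

    // Keep hbase.table.name as the user-facing property and translate it
    // to the HBase 2.0 key only where HiveHFileOutputFormat needs it.
    static void remap(Properties props) {
        String tableName = props.getProperty(HIVE_KEY);
        if (tableName != null && props.getProperty(HFILE_OUTPUT_KEY) == null) {
            props.setProperty(HFILE_OUTPUT_KEY, tableName);
        }
    }
}
```

Users keep writing hbase.table.name; the handler supplies the new key internally, so no table DDL has to change.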



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property

2018-01-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18366:
---

 Summary: Update HBaseSerDe to use 
hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the 
table name property
 Key: HIVE-18366
 URL: https://issues.apache.org/jira/browse/HIVE-18366
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Handler
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


HBase 2.0 changes the table name property to 
hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat uses 
the new property name while HiveHBaseTableOutputFormat does not. If we create the 
table as follows, HiveHBaseTableOutputFormat is used, which still reads the old 
property hbase.table.name.

{noformat}
create table hbase_table2(key int, val string) stored by 
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties 
('hbase.columns.mapping' = ':key,cf:val') tblproperties 
('hbase.mapreduce.hfileoutputformat.table.name' = 'positive_hbase_handler_bulk')
{noformat}





[jira] [Created] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc

2017-12-21 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18327:
---

 Summary: Remove the unnecessary HiveConf dependency for MiniHiveKdc
 Key: HIVE-18327
 URL: https://issues.apache.org/jira/browse/HIVE-18327
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 3.0.0
Reporter: Aihua Xu


MiniHiveKdc takes HiveConf as an input parameter although it isn't needed. 
Remove the unnecessary HiveConf dependency.





[jira] [Created] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader

2017-12-20 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18323:
---

 Summary: Vectorization: add the support of timestamp in 
VectorizedPrimitiveColumnReader
 Key: HIVE-18323
 URL: https://issues.apache.org/jira/browse/HIVE-18323
 Project: Hive
  Issue Type: Improvement
  Components: Vectorization
Affects Versions: 3.0.0
Reporter: Aihua Xu


{noformat}
CREATE TABLE `t1`(
  `ts` timestamp,
  `s1` string)
STORED AS PARQUET;

set hive.vectorized.execution.enabled=true;
SELECT * from t1 SORT BY s1;
{noformat}

This query will throw an exception since timestamp is not supported here yet.

{noformat}
Caused by: java.io.IOException: java.io.IOException: Unsupported type: optional 
int96 ts
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
{noformat}
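For context, Parquet's int96 timestamp packs 8 little-endian bytes of nanos-of-day followed by 4 little-endian bytes of Julian day. A decoding sketch (illustrative, not the VectorizedPrimitiveColumnReader implementation):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.sql.Timestamp;

public class Int96Timestamp {
    private static final int JULIAN_EPOCH_OFFSET_DAYS = 2440588; // Julian day of 1970-01-01
    private static final long MILLIS_PER_DAY = 86_400_000L;
    private static final long NANOS_PER_MILLI = 1_000_000L;

    // Decode a Parquet int96 value: 8 LE bytes of nanos within the day,
    // then 4 LE bytes holding the Julian day number.
    static Timestamp fromInt96(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        int julianDay = buf.getInt();
        long millis = (julianDay - JULIAN_EPOCH_OFFSET_DAYS) * MILLIS_PER_DAY
                + nanosOfDay / NANOS_PER_MILLI;
        Timestamp ts = new Timestamp(millis);
        ts.setNanos((int) (nanosOfDay % 1_000_000_000L)); // restore full nano precision
        return ts;
    }
}
```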





[jira] [Created] (HIVE-18202) Automatically migrate hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name for hbase-based table

2017-12-01 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18202:
---

 Summary: Automatically migrate hbase.table.name to 
hbase.mapreduce.hfileoutputformat.table.name for hbase-based table
 Key: HIVE-18202
 URL: https://issues.apache.org/jira/browse/HIVE-18202
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Handler
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The property name for HBase table mapping changed from hbase.table.name to 
hbase.mapreduce.hfileoutputformat.table.name in HBase 2.

We can include such an upgrade for existing HBase-based tables in the DB upgrade 
script to automatically change these values.

For the new tables, the query will be like:

create table hbase_table(key int, val string) stored by 
'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties 
('hbase.columns.mapping' = ':key,cf:val') tblproperties 
('hbase.mapreduce.hfileoutputformat.table.name' = 'positive_hbase_handler_bulk')
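The metadata upgrade step above could look roughly like this, sketched over a plain table-parameters map (illustrative, not the actual upgrade script):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TblPropsMigration {
    // Rename the legacy key in a table's parameters, keeping any value
    // that is already set under the new key.
    static Map<String, String> migrate(Map<String, String> params) {
        Map<String, String> out = new LinkedHashMap<>(params);
        String old = out.remove("hbase.table.name");
        if (old != null) {
            out.putIfAbsent("hbase.mapreduce.hfileoutputformat.table.name", old);
        }
        return out;
    }
}
```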






[jira] [Created] (HIVE-18023) Redact the expression in lineage info

2017-11-08 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18023:
---

 Summary: Redact the expression in lineage info
 Key: HIVE-18023
 URL: https://issues.apache.org/jira/browse/HIVE-18023
 Project: Hive
  Issue Type: Improvement
  Components: Logging
Affects Versions: 2.1.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Trivial


The query redactor redacts the query text itself, but the expression shown in 
the lineage info is not redacted, which may still expose sensitive info. The 
following query

{{select customers.id, customers.name from customers where 
customers.addresses['shipping'].zip_code ='1234-5678-1234-5678';}} will produce 
a log entry in the lineage. The expression should also be redacted.

{noformat}
[HiveServer2-Background-Pool: Thread-43]: 
{"version":"1.0","user":"hive","timestamp":1510179280,"duration":40747,"jobIds":["job_1510150684172_0006"],"engine":"mr","database":"default","hash":"a2b4721a0935e3770d81649d24ab1cd4","queryText":"select
 customers.id, customers.name from customers where 
customers.addresses['shipping'].zip_code 
='---'","edges":[{"sources":[2],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[1],"edgeType":"PROJECTION"},{"sources":[],"targets":[0,1],"expression":"(addresses['shipping'].zip_code
 = 
'1234-5678-1234-5678')","edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"customers.id"},{"id":1,"vertexType":"COLUMN","vertexId":"customers.name"},{"id":2,"vertexType":"COLUMN","vertexId":"default.customers.id"},{"id":3,"vertexType":"COLUMN","vertexId":"default.customers.name"}]}
{noformat}





[jira] [Created] (HIVE-18009) Multiple lateral view query is slow on hive on spark

2017-11-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-18009:
---

 Summary: Multiple lateral view query is slow on hive on spark
 Key: HIVE-18009
 URL: https://issues.apache.org/jira/browse/HIVE-18009
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


When running a query with multiple lateral views, HoS spends a long time in 
compilation. GenSparkUtils has an inefficient implementation of getChildOperator 
when the operator tree has a diamond hierarchy (as lateral views do), since a 
node may be visited multiple times.

{noformat}
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at 
org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.j
{noformat}
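A visited-set traversal avoids the exponential revisiting shown in the stack above. A minimal sketch (Operator here is a stand-in, not Hive's operator class):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class Operator {
    final List<Operator> children = new ArrayList<>();
    final String name;
    Operator(String name) { this.name = name; }
}

public class ChildCollector {
    // Collect leaf operators, tracking visited nodes so diamond-shaped DAGs
    // (e.g. lateral views) are traversed once per node instead of once per path.
    static List<Operator> getLeaves(Operator root) {
        List<Operator> leaves = new ArrayList<>();
        Deque<Operator> stack = new ArrayDeque<>();
        Set<Operator> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        stack.push(root);
        while (!stack.isEmpty()) {
            Operator op = stack.pop();
            if (!visited.add(op)) continue;   // already seen via another parent
            if (op.children.isEmpty()) {
                leaves.add(op);
            } else {
                op.children.forEach(stack::push);
            }
        }
        return leaves;
    }
}
```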

[jira] [Created] (HIVE-17999) Remove hadoop3 hack in TestJdbcWithLocalClusterSpark and TestMultiSessionsHS2WithLocalClusterSpark after Spark supports Hadoop3

2017-11-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17999:
---

 Summary: Remove hadoop3 hack in TestJdbcWithLocalClusterSpark and 
TestMultiSessionsHS2WithLocalClusterSpark after Spark supports Hadoop3
 Key: HIVE-17999
 URL: https://issues.apache.org/jira/browse/HIVE-17999
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu


Currently Spark doesn't support Hadoop 3 (that support is itself blocked on Hive 
supporting Hadoop 3), so Hive uses a workaround to get the HoS tests to pass (see 
TestJdbcWithLocalClusterSpark and TestMultiSessionsHS2WithLocalClusterSpark). 

SPARK-18673 tracks enabling Hadoop 3 support. After that work is done, we should 
upgrade the Spark version dependency and remove the hack from these two tests.





[jira] [Created] (HIVE-17870) Update NoDeleteRollingFileAppender to use Log4j2 api

2017-10-20 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17870:
---

 Summary: Update NoDeleteRollingFileAppender to use Log4j2 api
 Key: HIVE-17870
 URL: https://issues.apache.org/jira/browse/HIVE-17870
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Aihua Xu


NoDeleteRollingFileAppender still uses the Log4j 1 API. Since Hive has already 
moved to Log4j 2, we should update this appender to use the Log4j 2 API as well.





[jira] [Created] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar

2017-10-10 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17762:
---

 Summary: Exclude older jackson-annotation.jar from druid-handler 
shaded jar
 Key: HIVE-17762
 URL: https://issues.apache.org/jira/browse/HIVE-17762
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, 
but older versions are still brought in by transitive dependencies. 





[jira] [Created] (HIVE-17699) Skip calling authValidator.checkPrivileges when there is nothing to get authorized

2017-10-04 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17699:
---

 Summary: Skip calling authValidator.checkPrivileges when there is 
nothing to get authorized
 Key: HIVE-17699
 URL: https://issues.apache.org/jira/browse/HIVE-17699
 Project: Hive
  Issue Type: Improvement
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


For a command like "drop database if exists db1;" where database db1 doesn't 
exist, there is nothing to authorize. 
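The short-circuit could be as simple as checking the entity lists before calling authValidator.checkPrivileges. A sketch with illustrative names:

```java
import java.util.List;

public class AuthShortCircuit {
    // Skip the authorization call entirely when both the input and output
    // entity lists are empty (e.g. "drop database if exists" on a missing db).
    static boolean needsAuthorization(List<?> inputs, List<?> outputs) {
        return !(inputs.isEmpty() && outputs.isEmpty());
    }
}
```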





[jira] [Created] (HIVE-17679) http-generic-click-jacking for WebHcat server

2017-10-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17679:
---

 Summary: http-generic-click-jacking for WebHcat server
 Key: HIVE-17679
 URL: https://issues.apache.org/jira/browse/HIVE-17679
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


The web UIs do not include the "X-Frame-Options" header to prevent the pages 
from being framed by another site.
References:
https://www.owasp.org/index.php/Clickjacking
https://www.owasp.org/index.php/Clickjacking_Defense_Cheat_Sheet
https://developer.mozilla.org/en-US/docs/Web/HTTP/X-Frame-Options
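The defense is a response header. A minimal sketch of adding it to a header map (the real fix would go into WebHCat's servlet/filter setup; names here are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClickjackingHeaders {
    // Add the anti-framing header a web UI should send on every response.
    // "SAMEORIGIN" allows framing only from the same origin; "DENY" blocks all.
    static Map<String, String> withFrameProtection(Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>(headers);
        out.put("X-Frame-Options", "SAMEORIGIN");
        return out;
    }
}
```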





[jira] [Created] (HIVE-17624) MapredLocalTask running in separate JVM could throw ClassNotFoundException

2017-09-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17624:
---

 Summary: MapredLocalTask running in separate JVM could throw 
ClassNotFoundException 
 Key: HIVE-17624
 URL: https://issues.apache.org/jira/browse/HIVE-17624
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


{noformat}
set hive.auto.convert.join=true;
set hive.auto.convert.join.use.nonstaged=false;

add jar hive-hcatalog-core.jar;

drop table if exists t1;
CREATE TABLE t1 (a string, b string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';

LOAD DATA LOCAL INPATH "data/files/sample.json" INTO TABLE t1;
select * from t1 l join t1 r on l.a=r.a;
{noformat}

The join will use a MapJoin, which runs a MapredLocalTask in a separate JVM to 
load the table into a hashmap. But Hive doesn't pass the added jars to that 
JVM's classpath, so the following exception is thrown.

{noformat}
org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception 
java.lang.ClassNotFoundException: 
org.apache.hive.hcatalog.data.JsonSerDejava.lang.RuntimeException: 
java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
at 
org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:72)
at 
org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:92)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:564)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:127)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:462)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:390)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:370)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:756)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: 
org.apache.hive.hcatalog.data.JsonSerDe
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at 
org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:69)
... 15 more

at 
org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:586)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:127)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:462)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:390)
at 
org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:370)
at 
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:756)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
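A possible fix is to append the session's added jars when building the child JVM's classpath. A sketch with illustrative names (not Hive's actual MapredLocalTask launch code):

```java
import java.io.File;
import java.util.List;

public class ChildJvmClasspath {
    // Append paths registered via "ADD JAR" to the classpath handed to the
    // spawned MapredLocalTask JVM, joined with the platform path separator.
    static String buildClasspath(String baseClasspath, List<String> addedJars) {
        StringBuilder cp = new StringBuilder(baseClasspath);
        for (String jar : addedJars) {
            cp.append(File.pathSeparator).append(jar);
        }
        return cp.toString();
    }
}
```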





[jira] [Created] (HIVE-17619) Exclude avatica-core.jar since avatica.jar is included

2017-09-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17619:
---

 Summary: Exclude avatica-core.jar since avatica.jar is included
 Key: HIVE-17619
 URL: https://issues.apache.org/jira/browse/HIVE-17619
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


avatica.jar is included in the project, but it has a dependency on 
avatica-core.jar, which is pulled into the project as well. 

If avatica-core.jar appears on the classpath in front of avatica.jar, Hive can 
run into missing classes that are shaded inside avatica.jar.





[jira] [Created] (HIVE-17583) Fix test failure TestAccumuloCliDriver caused from the accumulo version upgrade

2017-09-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17583:
---

 Summary: Fix test failure TestAccumuloCliDriver caused from the 
accumulo version upgrade
 Key: HIVE-17583
 URL: https://issues.apache.org/jira/browse/HIVE-17583
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu








[jira] [Created] (HIVE-17376) Upgrade snappy version to 1.1.4

2017-08-23 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17376:
---

 Summary: Upgrade snappy version to 1.1.4
 Key: HIVE-17376
 URL: https://issues.apache.org/jira/browse/HIVE-17376
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Upgrade the snappy-java version to 1.1.4. The older version has some issues, 
such as a memory leak (https://github.com/xerial/snappy-java/issues/91).





[jira] [Created] (HIVE-17373) Upgrade some dependency versions

2017-08-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17373:
---

 Summary: Upgrade some dependency versions
 Key: HIVE-17373
 URL: https://issues.apache.org/jira/browse/HIVE-17373
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Upgrade some libraries: log4j to 2.8.2, accumulo to 1.8.1, and 
commons-httpclient to 3.1. 





[jira] [Created] (HIVE-17357) Similar to HIVE-17336, plugin jars are not properly added

2017-08-18 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17357:
---

 Summary: Similar to HIVE-17336, plugin jars are not properly added
 Key: HIVE-17357
 URL: https://issues.apache.org/jira/browse/HIVE-17357
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


I forgot to include the same change for LocalHiveSparkClient.java in 
HIVE-17336. We need to make the same change in the LocalHiveSparkClient class to 
include plugin jars. Maybe we should introduce a common base class for 
LocalHiveSparkClient and RemoteHiveSparkClient to share such common logic.





[jira] [Created] (HIVE-17353) The ResultSets are not accessible if running multiple queries within the same HiveStatement

2017-08-17 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17353:
---

 Summary: The ResultSets are not accessible if running multiple 
queries within the same HiveStatement 
 Key: HIVE-17353
 URL: https://issues.apache.org/jira/browse/HIVE-17353
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The following queries would fail,
{noformat}
ResultSet rs1 =
stmt.executeQuery("select * from testMultipleResultSets1");
ResultSet rs2 =
stmt.executeQuery("select * from testMultipleResultSets2");
rs1.next();
rs2.next();
{noformat}

with the exception:
{noformat}
[HiveServer2-Handler-Pool: Thread-208]: Error fetching results: 
org.apache.hive.service.cli.HiveSQLException: Invalid OperationHandle: 
OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=8a1c4fe5-e80b-4d9a-b673-78d92b3baaa8]
at 
org.apache.hive.service.cli.operation.OperationManager.getOperation(OperationManager.java:177)
at 
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:462)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:691)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
{noformat}





[jira] [Created] (HIVE-17336) Missing class 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' from Hive on Spark when inserting into hbase based table

2017-08-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17336:
---

 Summary: Missing class 
'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' from Hive on Spark 
when inserting into hbase based table
 Key: HIVE-17336
 URL: https://issues.apache.org/jira/browse/HIVE-17336
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


When inserting into an HBase-based table from Hive on Spark, the following 
exception is thrown 
{noformat}
Error while processing statement: FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: 
org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.TableDesc)
tableInfo (org.apache.hadoop.hive.ql.plan.FileSinkDesc)
conf (org.apache.hadoop.hive.ql.exec.FileSinkOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
 at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
 at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:183)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:326)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:314)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:759)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObjectOrNull(SerializationUtilities.java:201)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:178)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:178)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
 at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
 at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
 at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
 at 
org.apache.hadoop.hive.ql.exec.SerializationUt
{noformat}

[jira] [Created] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty table fails with NPE

2017-08-08 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17272:
---

 Summary: when hive.vectorized.execution.enabled is true, query on 
empty table fails with NPE
 Key: HIVE-17272
 URL: https://issues.apache.org/jira/browse/HIVE-17272
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


{noformat}
set hive.vectorized.execution.enabled=true;
CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int);
select * from tab t1 join tab t2 where t1.x=t2.x;
{noformat}

The query fails with the following exception.
{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474)
 ~[hive-exec-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) 
~[hive-exec-2.3.0.jar:2.3.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_101]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_101]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_101]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_101]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) 
~[hadoop-common-2.6.0.jar:?]
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) 
~[hadoop-common-2.6.0.jar:?]
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) 
~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268)
 ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_101]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101]
{noformat}





[jira] [Created] (HIVE-17155) findConfFile() in HiveConf.java has some issues with the conf path

2017-07-21 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17155:
---

 Summary: findConfFile() in HiveConf.java has some issues with the 
conf path
 Key: HIVE-17155
 URL: https://issues.apache.org/jira/browse/HIVE-17155
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


In the findConfFile() function of HiveConf.java there are a couple of issues: 
File.pathSeparator, which is ":", is used as the separator rather than "/", and 
new File(jarUri).getParentFile() gets the "$hive_home/lib" folder when we 
actually want "$hive_home".
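A minimal sketch of the intended resolution (ConfPathSketch and hiveHomeFromJar are illustrative names, not the actual HiveConf code), assuming the jar sits under $hive_home/lib:

```java
import java.io.File;

// Illustrative sketch only: the jar lives in $hive_home/lib, so reaching
// $hive_home requires going up two levels, not the single getParentFile()
// call the current code uses.
public class ConfPathSketch {
  public static File hiveHomeFromJar(File jarFile) {
    // jarFile: e.g. $hive_home/lib/hive-common.jar
    // one getParentFile()  -> $hive_home/lib
    // two getParentFile()s -> $hive_home
    return jarFile.getParentFile().getParentFile();
  }
}
```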





[jira] [Created] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext

2017-07-05 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-17048:
---

 Summary: Pass HiveOperation info to HiveSemanticAnalyzerHook 
through HiveSemanticAnalyzerHookContext
 Key: HIVE-17048
 URL: https://issues.apache.org/jira/browse/HIVE-17048
 Project: Hive
  Issue Type: Improvement
  Components: Hooks
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently Hive passes the following info to HiveSemanticAnalyzerHook through 
HiveSemanticAnalyzerHookContext (see 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553).
 But the operation type (HiveOperation) is also needed in some cases, e.g., 
when integrating with Sentry. 

{noformat}
hookCtx.setConf(conf);
hookCtx.setUserName(userName);
hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
hookCtx.setCommand(command);
{noformat}
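A hedged sketch of the proposed extension (HookContextSketch, setHiveOperation, and getHiveOperation are assumed names for illustration, not existing Hive API):

```java
// Hypothetical sketch of the hook context with the proposed addition;
// only the hiveOperation field is new relative to the setters shown above.
public class HookContextSketch {
  private String command;
  private String hiveOperation;  // e.g. "QUERY", "CREATETABLE" (assumed values)

  public void setCommand(String command) { this.command = command; }
  public String getCommand() { return command; }

  // Proposed: let hooks such as a Sentry integration see the operation type.
  public void setHiveOperation(String op) { this.hiveOperation = op; }
  public String getHiveOperation() { return hiveOperation; }
}
```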





[jira] [Created] (HIVE-16911) Upgrade groovy version to 2.4.11

2017-06-15 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16911:
---

 Summary: Upgrade groovy version to 2.4.11
 Key: HIVE-16911
 URL: https://issues.apache.org/jira/browse/HIVE-16911
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Hive currently uses Groovy 2.4.4, which has a security issue 
(https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 
2.4.8 or later. 





[jira] [Created] (HIVE-16902) investigate "failed to remove operation log" errors

2017-06-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16902:
---

 Summary: investigate "failed to remove operation log" errors
 Key: HIVE-16902
 URL: https://issues.apache.org/jira/browse/HIVE-16902
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


When we call {{set a=3;}} from beeline, the following exception is thrown. 
{noformat}
[HiveServer2-Handler-Pool: Thread-46]: Failed to remove corresponding log file 
of operation: OperationHandle [opType=GET_TABLES, 
getHandleIdentifier()=50f58d7b-f935-4590-922f-de7051a34658]
java.io.FileNotFoundException: File does not exist: 
/var/log/hive/operation_logs/7f613077-e29d-484a-96e1-43c81f9c0999/hive_20170531101400_28d52b7d-ffb9-4815-8c6c-662319628915
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2275)
at 
org.apache.hadoop.hive.ql.session.OperationLog$LogFile.remove(OperationLog.java:122)
at 
org.apache.hadoop.hive.ql.session.OperationLog.close(OperationLog.java:90)
at 
org.apache.hive.service.cli.operation.Operation.cleanupOperationLog(Operation.java:287)
at 
org.apache.hive.service.cli.operation.MetadataOperation.close(MetadataOperation.java:58)
at 
org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:273)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:822)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy38.closeOperation(Unknown Source)
at 
org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:475)
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:671)
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1677)
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1662)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{noformat}





[jira] [Created] (HIVE-16884) Replace the deprecated HBaseInterface with Table

2017-06-12 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16884:
---

 Summary: Replace the deprecated HBaseInterface with Table  
 Key: HIVE-16884
 URL: https://issues.apache.org/jira/browse/HIVE-16884
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


HBaseInterface has been deprecated and will be removed in HBase 2.0 by 
HBASE-13395. Replace it with the new 
{{org.apache.hadoop.hbase.client.Table}}.





[jira] [Created] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531

2017-06-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16849:
---

 Summary: Upgrade jetty version to 9.4.6.v20170531
 Key: HIVE-16849
 URL: https://issues.apache.org/jira/browse/HIVE-16849
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.0.0
Reporter: Aihua Xu


From HIVE-16846, the TestJdbcWithMiniHS2#testHttpHeaderSize test case is 
returning HTTP error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 
(REQUEST_HEADER_FIELDS_TOO_LARGE_431), while 431 seems more accurate and the 
newer version of Jetty fixed this issue.

{noformat}
// This should fail with the given HTTP response code 413 in the error message,
// since the header is larger than the configured header size
userName = StringUtils.leftPad("*", 2000);
try {
  conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
} catch (Exception e) {
  assertTrue("Header exception thrown", e != null);
  assertTrue(e.getMessage().contains("HTTP Response code: 413"));
} finally {
  if (conn != null) {
    conn.close();
  }
}
{noformat}





[jira] [Created] (HIVE-16846) TestJdbcWithMiniHS2#testHttpHeaderSize test case is not testing in HTTP mode

2017-06-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16846:
---

 Summary: TestJdbcWithMiniHS2#testHttpHeaderSize test case is not 
testing in HTTP mode
 Key: HIVE-16846
 URL: https://issues.apache.org/jira/browse/HIVE-16846
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The TestJdbcWithMiniHS2#testHttpHeaderSize test case is actually testing binary 
mode, so the request/response header sizes are not checked. 

We need to build MiniHS2 using withHTTPTransport() to start it in HTTP mode. 





[jira] [Created] (HIVE-16769) Possible hive service startup failure due to the existence of /tmp/stderr

2017-05-26 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16769:
---

 Summary: Possible hive service startup failure due to the existence of 
/tmp/stderr
 Key: HIVE-16769
 URL: https://issues.apache.org/jira/browse/HIVE-16769
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


HIVE-12497 redirects the ignorable errors from "hadoop version", "hbase 
mapredcp", and the hadoop jars to /tmp/${USER}/stderr. 

In some cases ${USER} is not set, so the file becomes /tmp/stderr. If such a 
file preexists with different permissions, it will cause the service startup to 
fail.

I just tried the script without outputting to the stderr file, and I don't see 
the error {{"ERROR StatusLogger No log4j2 configuration file found. Using 
default configuration: logging only errors to the console."}} any more.

I think we can remove this redirect to avoid a possible startup failure.





[jira] [Created] (HIVE-16682) Check if the console message from the hive schema tool needs to print to logging file

2017-05-16 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16682:
---

 Summary: Check if the console message from the hive schema tool 
needs to print to logging file
 Key: HIVE-16682
 URL: https://issues.apache.org/jira/browse/HIVE-16682
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Aihua Xu
Priority: Minor


From HiveSchemaTool, most of the messages are printed to the console and some 
of them are printed to the log. Evaluate whether the console messages make 
sense to print to the log as well, and what would be the best way to print 
them to avoid duplication in case LOG is configured to write to the console. 







[jira] [Created] (HIVE-16647) Improve the validation output to make the output to stderr and stdout more consistent

2017-05-11 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16647:
---

 Summary: Improve the validation output to make the output to 
stderr and stdout more consistent
 Key: HIVE-16647
 URL: https://issues.apache.org/jira/browse/HIVE-16647
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


Some output is printed to stderr or stdout inconsistently. Here are some 
examples; update to make them more consistent.

*  Version table validation
** When the version table is missing, the err msg goes to stderr
** When the version table is not valid, the err msg goes to stdout with a 
message like "Failed in schema version validation: ..."
*  Metastore/schema table validation
** When the version table contains the wrong version or there are no rows in 
the version table, the err msg goes to stderr
** When there are diffs between the schema and metastore tables, the err msg 
goes to stdout






[jira] [Created] (HIVE-16528) Exclude older version of beanutils from dependent jars in test pom.xml

2017-04-25 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16528:
---

 Summary: Exclude older version of beanutils from dependent jars in 
test pom.xml 
 Key: HIVE-16528
 URL: https://issues.apache.org/jira/browse/HIVE-16528
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The test build is picking up the older beanutils jars, which causes test 
failures when Hadoop is upgraded to alpha2.





[jira] [Created] (HIVE-16455) ADD JAR command leaks JAR Files

2017-04-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16455:
---

 Summary: ADD JAR command leaks JAR Files
 Key: HIVE-16455
 URL: https://issues.apache.org/jira/browse/HIVE-16455
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Aihua Xu
Assignee: Aihua Xu


HiveServer2 is leaking file handles when the ADD JAR statement is used and the 
added JAR file is not used in the query itself.

{noformat}
beeline> !connect jdbc:hive2://localhost:1 admin
0: jdbc:hive2://localhost:1> create table test_leak (a int);
0: jdbc:hive2://localhost:1> insert into test_leak Values (1);

-- Exit beeline terminal; Find PID of HiveServer2

[root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
0
[root@host-10-17-80-111 ~]# beeline -u jdbc:hive2://localhost:1/default -n 
admin

And run the command "ADD JAR hdfs:///tmp/hive-contrib.jar; select * from 
test_leak"
[root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
1

java29588 hive  391u   REG  252,3125987  2099944 
/tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted)
java29588 hive  392u   REG  252,3125987  2099946 
/tmp/eb3184ad-7f15-4a77-a10d-87717ae634d1_resources/hive-contrib.jar (deleted)
java29588 hive  393r   REG  252,3125987  2099825 
/tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted)
java29588 hive  394r   REG  252,3125987  2099833 
/tmp/5153dd4a-a606-4f53-b02c-d606e7e56985_resources/hive-contrib.jar (deleted)
java29588 hive  395r   REG  252,3125987  2099827 
/tmp/ff3cdb05-917f-43c0-830a-b293bf397a23_resources/hive-contrib.jar (deleted)
java29588 hive  396r   REG  252,3125987  2099822 
/tmp/60531b66-5985-421e-8eb5-eeac31fdf964_resources/hive-contrib.jar (deleted)
java29588 hive  397r   REG  252,3125987  2099831 
/tmp/78878921-455c-438c-9735-447566ed8381_resources/hive-contrib.jar (deleted)
java29588 hive  399r   REG  252,3125987  2099835 
/tmp/0e5d7990-30cc-4248-9058-587f7f1ff211_resources/hive-contrib.jar (deleted)
{noformat}
You can see that the session directory (and therefore anything in it) is set 
to be deleted only on exit.
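The leak pattern can be illustrated with a standalone sketch (illustrative code only, not HiveServer2's): a file registered via File.deleteOnExit() lingers until the JVM exits, whereas deleting it when the session closes releases it immediately.

```java
import java.io.File;

// Illustrative sketch: clean up a downloaded resource file as soon as the
// session that added it is closed, instead of deferring removal to JVM
// exit via File.deleteOnExit().
public class ResourceCleanupSketch {
  public static boolean cleanupNow(File resource) {
    // delete() returns false if the file does not exist or cannot be removed
    return resource.delete();
  }
}
```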





[jira] [Created] (HIVE-16450) Some metastore operations are not retried even with desired underlining exceptions

2017-04-14 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16450:
---

 Summary: Some metastore operations are not retried even with 
desired underlining exceptions
 Key: HIVE-16450
 URL: https://issues.apache.org/jira/browse/HIVE-16450
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


In the RetryingHMSHandler class, we expect the operations to be retried when 
the cause of the MetaException is a JDOException or NucleusException.
{noformat}
if (e.getCause() instanceof MetaException && e.getCause().getCause() != null) {
  if (e.getCause().getCause() instanceof javax.jdo.JDOException ||
      e.getCause().getCause() instanceof NucleusException) {
    // The JDOException or the NucleusException may be wrapped further in a MetaException
    caughtException = e.getCause().getCause();
  }
}
{noformat}

In ObjectStore, however, many places throw new MetaException(msg) without the 
cause, so we miss retrying in some cases. E.g., with the following 
JDOException, we should retry but it's ignored.

{noformat}
2017-04-04 17:28:21,602 ERROR metastore.ObjectStore 
(ObjectStore.java:getMTableColumnStatistics(6555)) - Error retrieving 
statistics via jdo
javax.jdo.JDOException: Exception thrown when executing query
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6546)
at 
org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171)
at 
org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6606)
at 
org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6595)
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2633)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6594)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6588)
at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
at com.sun.proxy.$Proxy0.getTableColumnStatistics(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTableUpdateTableColumnStats(HiveAlterHandler.java:787)
at 
org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:247)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3809)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3779)
at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy3.alter_table_with_environment_context(Unknown 
Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9617)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9601)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{noformat}
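A minimal sketch of a possible fix (MetaExceptionSketch stands in for the thrift-generated MetaException, whose constructor takes only a message string): attach the underlying exception via initCause() so the cause-inspection logic in RetryingHMSHandler can see it.

```java
// Sketch only: MetaExceptionSketch models Hive's MetaException. Wrapping
// preserves the JDOException as the cause so the
// e.getCause().getCause() instanceof checks can trigger a retry.
public class MetaExceptionSketch extends Exception {
  public MetaExceptionSketch(String msg) { super(msg); }

  public static MetaExceptionSketch wrap(String msg, Throwable cause) {
    MetaExceptionSketch me = new MetaExceptionSketch(msg);
    me.initCause(cause);  // keep the original failure attached
    return me;
  }
}
```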




[jira] [Created] (HIVE-16439) Exclude older v2 version of jackson lib from pom.xml

2017-04-13 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16439:
---

 Summary: Exclude older v2 version of jackson lib from pom.xml 
 Key: HIVE-16439
 URL: https://issues.apache.org/jira/browse/HIVE-16439
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Aihua Xu
Assignee: Aihua Xu


There are multiple versions of jackson libs included in the dependent jars like 
spark-client and metrics-json. That causes older versions of jackson libs to be 
used.   

We need to exclude them from the dependencies and use the explicit one 
(currently 2.6.5).

{noformat}
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-databind</artifactId>
</dependency>
{noformat}
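A sketch of how the exclusion could look in the test pom.xml (the spark-client coordinates and version shown are illustrative assumptions, not the actual entries):

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>spark-client</artifactId>
  <version>${project.version}</version>
  <exclusions>
    <!-- drop the transitive older jackson so the explicit 2.6.5 wins -->
    <exclusion>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```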






[jira] [Created] (HIVE-16400) Fix the MDC reference to use slf4j rather than log4j

2017-04-06 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16400:
---

 Summary: Fix the MDC reference to use slf4j rather than log4j
 Key: HIVE-16400
 URL: https://issues.apache.org/jira/browse/HIVE-16400
 Project: Hive
  Issue Type: Sub-task
  Components: Logging
Affects Versions: 3.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The MDC reference in LogUtils is using the Log4j version, but we should use 
the slf4j version.





[jira] [Created] (HIVE-16281) Upgrade master branch to JDK8

2017-03-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16281:
---

 Summary: Upgrade master branch to JDK8
 Key: HIVE-16281
 URL: https://issues.apache.org/jira/browse/HIVE-16281
 Project: Hive
  Issue Type: New Feature
  Components: Hive
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


This is to track the JDK 8 upgrade work for the master branch.

Here are threads for the discussion:
https://lists.apache.org/thread.html/83d8235bc9547cc94a0d689580f20db4b946876b6d0369e31ea12b51@1460158490@%3Cdev.hive.apache.org%3E

https://lists.apache.org/thread.html/dcd57844ceac7faf8975a00d5b8b1825ab5544d94734734aedc3840e@%3Cdev.hive.apache.org%3E

JDK 7 has reached the end of public updates, and some newer versions of 
dependent libraries like Jetty require a newer JDK. It seems reasonable to 
upgrade to JDK 8 in 2.x.





[jira] [Created] (HIVE-16061) Some of console output is not printed to the beeline console

2017-02-28 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-16061:
---

 Summary: Some of console output is not printed to the beeline 
console
 Key: HIVE-16061
 URL: https://issues.apache.org/jira/browse/HIVE-16061
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu


Run a hiveserver2 instance "hive --service hiveserver2".
Then from another console, connect to hiveserver2 "beeline -u 
"jdbc:hive2://localhost:1"

When you run an MR job like "select t1.key from src t1 join src t2 on 
t1.key=t2.key", some of the console logs, such as the MR job info, are not 
printed to the beeline console; they are only printed to the hiveserver2 
console.







[jira] [Created] (HIVE-15823) Investigate parquet filtering for complex data types decimal, date and timestamp

2017-02-06 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15823:
---

 Summary: Investigate parquet filtering for complex data types 
decimal, date and timestamp
 Key: HIVE-15823
 URL: https://issues.apache.org/jira/browse/HIVE-15823
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Follow up on HIVE-15782. Currently, if a filtering condition involves a 
decimal, date, or timestamp data type, Hive does not push the filter down to 
the Parquet file. Investigate this to improve performance.





[jira] [Created] (HIVE-15805) Some minor improvement on the validation tool

2017-02-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15805:
---

 Summary: Some minor improvement on the validation tool
 Key: HIVE-15805
 URL: https://issues.apache.org/jira/browse/HIVE-15805
 Project: Hive
  Issue Type: Sub-task
  Components: Database/Schema
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


To correct some typos and make the output neat.





[jira] [Created] (HIVE-15782) query on parquet table returns incorrect result when hive.optimize.index.filter is set to true

2017-02-01 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15782:
---

 Summary: query on parquet table returns incorrect result when 
hive.optimize.index.filter is set to true 
 Key: HIVE-15782
 URL: https://issues.apache.org/jira/browse/HIVE-15782
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


When hive.optimize.index.filter is set to true, the parquet table is filtered 
using the parquet column index. 

{noformat}
set hive.optimize.index.filter=true;
CREATE TABLE t1 (
  name string,
  dec decimal(5,0)
) stored as parquet;

insert into table t1 values('Jim', 3);
insert into table t1 values('Tom', 5);

select * from t1 where (name = 'Jim' or dec = 5);
{noformat}

Only one row {{Jim, 3}} is returned, but both should be returned. 





[jira] [Created] (HIVE-15617) Improve the avg performance for Range based window

2017-01-13 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15617:
---

 Summary: Improve the avg performance for Range based window
 Key: HIVE-15617
 URL: https://issues.apache.org/jira/browse/HIVE-15617
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Affects Versions: 1.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Similar to HIVE-15520, we need to improve the performance for avg().





[jira] [Created] (HIVE-15520) Improve the Range based window to add streaming support

2016-12-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15520:
---

 Summary: Improve the Range based window to add streaming support 
 Key: HIVE-15520
 URL: https://issues.apache.org/jira/browse/HIVE-15520
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently streaming processing is not supported for range-based windowing, so 
sum(x) over (partition by y order by z) has O(n^2) running time. 

Investigate the possibility of streaming support.
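The streaming idea can be illustrated with a standalone sketch (not Hive's PTF implementation; class and method names are made up): with the partition already sorted on the ordering column, the default RANGE frame's running sum can be produced in one pass by absorbing each peer group (rows with equal ordering value) before emitting, instead of rescanning the frame for every row.

```java
import java.util.Arrays;

// Sketch of streaming evaluation for sum(x) over (order by z) with the
// default RANGE frame (unbounded preceding to current row): all peers
// (equal z) share one result, so a single pass gives O(n) instead of the
// O(n^2) per-row frame rescan.
public class RangeWindowSketch {
  public static long[] rangeRunningSum(long[] z, long[] x) {
    long[] out = new long[x.length];
    long running = 0;
    int i = 0;
    while (i < x.length) {
      int j = i;
      while (j < x.length && z[j] == z[i]) {  // absorb the whole peer group
        running += x[j];
        j++;
      }
      Arrays.fill(out, i, j, running);  // every peer gets the group's sum
      i = j;
    }
    return out;
  }
}
```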





[jira] [Created] (HIVE-15518) Update the comment to match what it's doing in WindowSpec

2016-12-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15518:
---

 Summary: Update the comment to match what it's doing in WindowSpec
 Key: HIVE-15518
 URL: https://issues.apache.org/jira/browse/HIVE-15518
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Trivial


{noformat}
  /*
   * - A Window Frame that has only the /start/boundary, then it is interpreted 
as:
 BETWEEN  AND CURRENT ROW
   * - A Window Specification with an Order Specification and no Window
   *   Frame is interpreted as:
 ROW BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
   * - A Window Specification with no Order and no Window Frame is interpreted 
as:
 ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
   */
{noformat}

The comment in WindowSpec above doesn't really match what the code actually 
does. Correct the comment to reduce the confusion.





[jira] [Created] (HIVE-15500) fix the test failure dbtxnmgr_showlocks

2016-12-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15500:
---

 Summary: fix the test failure dbtxnmgr_showlocks
 Key: HIVE-15500
 URL: https://issues.apache.org/jira/browse/HIVE-15500
 Project: Hive
  Issue Type: Test
  Components: Test
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Trivial
 Attachments: HIVE-15500.1.patch







[jira] [Created] (HIVE-15498) sum() over (order by c) should default the windowing spec to RangeBoundarySpec

2016-12-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15498:
---

 Summary: sum() over (order by c) should default the windowing spec 
to RangeBoundarySpec
 Key: HIVE-15498
 URL: https://issues.apache.org/jira/browse/HIVE-15498
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 2.1.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Currently {{sum() over (partition by a)}} without order by defaults the 
windowing to RangeBoundarySpec, while {{sum() over (partition by a order by 
c)}} defaults to ValueBoundarySpec.

From the comment 
{noformat}
  /*
   * - A Window Frame that has only the /start/boundary, then it is interpreted 
as:
 BETWEEN  AND CURRENT ROW
   * - A Window Specification with an Order Specification and no Window
   *   Frame is interpreted as:
 ROW BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
   * - A Window Specification with no Order and no Window Frame is interpreted 
as:
 ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
   */
{noformat}
We were trying to set it as "row between". 






[jira] [Created] (HIVE-15476) ObjectStore.getMTableColumnStatistics() should check if colNames is empty

2016-12-20 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15476:
---

 Summary: ObjectStore.getMTableColumnStatistics() should check if 
colNames is empty
 Key: HIVE-15476
 URL: https://issues.apache.org/jira/browse/HIVE-15476
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


See the following exception in the log. Can't find out which exact query causes 
it though.

{noformat}
[pool-4-thread-31]: Exception thrown
Method/Identifier expected at character 37 in "tableName == t1 && dbName == t2 
&& ()"
org.datanucleus.store.query.QueryCompilerSyntaxException: Method/Identifier 
expected at character 37 in "tableName == t1 && dbName == t2 && ()"
at 
org.datanucleus.query.compiler.JDOQLParser.processPrimary(JDOQLParser.java:810)
at 
org.datanucleus.query.compiler.JDOQLParser.processUnaryExpression(JDOQLParser.java:656)
at 
org.datanucleus.query.compiler.JDOQLParser.processMultiplicativeExpression(JDOQLParser.java:582)
at 
org.datanucleus.query.compiler.JDOQLParser.processAdditiveExpression(JDOQLParser.java:553)
at 
org.datanucleus.query.compiler.JDOQLParser.processRelationalExpression(JDOQLParser.java:467)
at 
org.datanucleus.query.compiler.JDOQLParser.processAndExpression(JDOQLParser.java:450)
at 
org.datanucleus.query.compiler.JDOQLParser.processExclusiveOrExpression(JDOQLParser.java:436)
at 
org.datanucleus.query.compiler.JDOQLParser.processInclusiveOrExpression(JDOQLParser.java:422)
at 
org.datanucleus.query.compiler.JDOQLParser.processConditionalAndExpression(JDOQLParser.java:408)
at 
org.datanucleus.query.compiler.JDOQLParser.processConditionalOrExpression(JDOQLParser.java:389)
at 
org.datanucleus.query.compiler.JDOQLParser.processExpression(JDOQLParser.java:378)
at 
org.datanucleus.query.compiler.JDOQLParser.processPrimary(JDOQLParser.java:785)
at 
org.datanucleus.query.compiler.JDOQLParser.processUnaryExpression(JDOQLParser.java:656)
at 
org.datanucleus.query.compiler.JDOQLParser.processMultiplicativeExpression(JDOQLParser.java:582)
at 
org.datanucleus.query.compiler.JDOQLParser.processAdditiveExpression(JDOQLParser.java:553)
at 
org.datanucleus.query.compiler.JDOQLParser.processRelationalExpression(JDOQLParser.java:467)
at 
org.datanucleus.query.compiler.JDOQLParser.processAndExpression(JDOQLParser.java:450)
at org.datanucleus.query.compiler.JDOQLParser.processExclusiveOrExpression(JDOQLParser.java:436)
at org.datanucleus.query.compiler.JDOQLParser.processInclusiveOrExpression(JDOQLParser.java:422)
at org.datanucleus.query.compiler.JDOQLParser.processConditionalAndExpression(JDOQLParser.java:412)
at org.datanucleus.query.compiler.JDOQLParser.processConditionalOrExpression(JDOQLParser.java:389)
at org.datanucleus.query.compiler.JDOQLParser.processExpression(JDOQLParser.java:378)
at org.datanucleus.query.compiler.JDOQLParser.parse(JDOQLParser.java:99)
at org.datanucleus.query.compiler.JavaQueryCompiler.compileFilter(JavaQueryCompiler.java:467)
at org.datanucleus.query.compiler.JDOQLCompiler.compile(JDOQLCompiler.java:113)
at org.datanucleus.store.query.AbstractJDOQLQuery.compileInternal(AbstractJDOQLQuery.java:367)
at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:240)
at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:312)
at org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6505)
at org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171)
at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6566)
at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6555)
at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2629)
at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6554)
at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6548)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy12.getTableColumnStatistics(Unknown Source)

[jira] [Created] (HIVE-15464) "show create table" doesn't show skewed info

2016-12-19 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15464:
---

 Summary: "show create table" doesn't show skewed info
 Key: HIVE-15464
 URL: https://issues.apache.org/jira/browse/HIVE-15464
 Project: Hive
  Issue Type: Improvement
  Components: Query Planning
Reporter: Aihua Xu
Priority: Trivial


After you create a table like {{create table table1 (x int) skewed by (x) on 
(1,5,6);}} and then run {{show create table table1}}, the output doesn't 
include the skewed info. It would be better to include it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15418) "select 'abc'" will throw 'Cannot find path in conf'

2016-12-12 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15418:
---

 Summary: "select 'abc'" will throw 'Cannot find path in conf'
 Key: HIVE-15418
 URL: https://issues.apache.org/jira/browse/HIVE-15418
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu


Here is the stack trace. It seems to be a regression, since the same query 
worked in earlier versions.

{noformat}
2016-12-09T16:32:37,577 ERROR [56fa1999-ffbe-42c0-bb91-61211cd62476 main] CliDriver: Failed with exception java.io.IOException:java.io.IOException: Cannot find path in conf
java.io.IOException: java.io.IOException: Cannot find path in conf
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2191)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:777)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Cannot find path in conf
at org.apache.hadoop.hive.ql.io.NullRowsInputFormat.getSplits(NullRowsInputFormat.java:165)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:372)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:304)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
... 15 more
{noformat}





[jira] [Created] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent

2016-12-08 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15392:
---

 Summary: Refactoring the validate function of HiveSchemaTool to 
make the output consistent
 Key: HIVE-15392
 URL: https://issues.apache.org/jira/browse/HIVE-15392
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor
 Attachments: HIVE-15392.1.patch

The output of the validate command is inconsistent (see below). Make it consistent.

{noformat}
Starting metastore validationValidating schema version
Succeeded in schema version validation.
Validating sequence number for SEQUENCE_TABLE
Metastore connection URL:
jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:   APP
Validating tables in the schema for version 2.2.0
Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 
tables
Schema table validation successful
Metastore connection URL:
jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:   APP
Metastore connection URL:
jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:   APP
Metastore connection URL:
jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:   APP
Validating columns for incorrect NULL values
Metastore connection URL:
jdbc:derby:;databaseName=metastore_db;create=true
Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver
Metastore connection User:   APP
Done with metastore validationschemaTool completed
{noformat}
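
One way to make the output consistent, sketched below as a hypothetical helper (not HiveSchemaTool's actual API): route every validation step through a single reporting method, so each check prints a uniform pair of lines.

```java
// Hypothetical reporting helper -- a sketch of uniform output, not HiveSchemaTool code.
public class ValidationReport {
    static String report(String checkName, boolean passed) {
        // Every check emits exactly one "Validating ..." line and one result line.
        return "Validating " + checkName + System.lineSeparator()
             + (passed ? "[SUCCESS] " : "[FAIL] ") + checkName;
    }

    public static void main(String[] args) {
        System.out.println(report("schema version", true));
        System.out.println(report("sequence number for SEQUENCE_TABLE", true));
        System.out.println(report("columns for incorrect NULL values", false));
    }
}
```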





[jira] [Created] (HIVE-15383) Add additional info to 'desc function extended' output

2016-12-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15383:
---

 Summary: Add additional info to 'desc function extended' output
 Key: HIVE-15383
 URL: https://issues.apache.org/jira/browse/HIVE-15383
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Trivial


Add additional info to the output of 'desc function extended'. Listing the 
function's resources would help users check which jars are referenced.





[jira] [Created] (HIVE-15346) Remove "values temp table" from input list

2016-12-02 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15346:
---

 Summary: Remove "values temp table" from input list
 Key: HIVE-15346
 URL: https://issues.apache.org/jira/browse/HIVE-15346
 Project: Hive
  Issue Type: Sub-task
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu








[jira] [Created] (HIVE-15321) Change to read as long for HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15321:
---

 Summary: Change to read as long for 
HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE
 Key: HIVE-15321
 URL: https://issues.apache.org/jira/browse/HIVE-15321
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.1.0, 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Follow-up on HIVE-11240, which changed the type of this setting from int to 
long, while we still read it with {{conf.getIntVar()}}.

It seems we should use {{conf.getLongVar()}} instead.
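
A minimal sketch of the difference, using a plain Map as a stand-in for HiveConf (the class and key names below are illustrative, not Hive's real API): a value above Integer.MAX_VALUE is readable with a long getter but makes an int getter fail.

```java
// Stand-in for a HiveConf-style config; demonstrates why a >2GB byte-size
// setting must be read as a long, not an int.
public class MaxMessageSize {
    static final String KEY = "hive.metastore.server.max.message.size";

    // Mimics an int getter: parses the stored string as a 32-bit int.
    static int getIntVar(java.util.Map<String, String> conf) {
        return Integer.parseInt(conf.get(KEY));   // throws for values > 2147483647
    }

    // Mimics a long getter: parses the same string as a 64-bit long.
    static long getLongVar(java.util.Map<String, String> conf) {
        return Long.parseLong(conf.get(KEY));
    }

    public static void main(String[] args) {
        java.util.Map<String, String> conf = new java.util.HashMap<>();
        conf.put(KEY, "4294967296");              // 4 GB, legal once the type is long
        System.out.println(getLongVar(conf));     // reads fine as long
        try {
            getIntVar(conf);                      // NumberFormatException: out of int range
        } catch (NumberFormatException e) {
            System.out.println("int getter fails: " + e.getMessage());
        }
    }
}
```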





[jira] [Created] (HIVE-15317) Query "insert into table values()" creates the tmp table under the current database

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15317:
---

 Summary: Query "insert into table values()" creates the tmp table 
under the current database
 Key: HIVE-15317
 URL: https://issues.apache.org/jira/browse/HIVE-15317
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The current implementation of "insert into db1.table1 values()" creates a tmp 
table under the current database, while table1 may not be in the current 
database.

e.g.,

{noformat}
use default;
create database db1;
create table db1.table1(x int);
insert into db1.table1 values(3);
{noformat}

It will create the tmp table under the default database. If authorization is 
turned on and the current user only has access to db1 but not to the default 
database, this will cause an access failure.

We may need to rethink the approach for the implementation.
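
One possible direction, sketched as a hypothetical helper (the {{values__tmp__table__1}} name is illustrative, not necessarily Hive's internal naming): derive the temp table's database from the qualified target table instead of the session's current database.

```java
// Hypothetical sketch, not Hive's implementation: pick the temp table's
// database from the insert target rather than the session database.
public class ValuesTempTable {
    // targetTable may be "db1.table1" or just "table1"; currentDb is the session database.
    static String tempTableName(String targetTable, String currentDb) {
        int dot = targetTable.indexOf('.');
        String db = (dot >= 0) ? targetTable.substring(0, dot) : currentDb;
        return db + ".values__tmp__table__1";     // illustrative temp table name
    }

    public static void main(String[] args) {
        // insert into db1.table1 values(3) while the current database is "default":
        System.out.println(tempTableName("db1.table1", "default")); // lands in db1
        System.out.println(tempTableName("table1", "default"));     // unqualified: session db
    }
}
```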







[jira] [Created] (HIVE-15318) Query "insert into table values()" creates the tmp table under the current database

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15318:
---

 Summary: Query "insert into table values()" creates the tmp table 
under the current database
 Key: HIVE-15318
 URL: https://issues.apache.org/jira/browse/HIVE-15318
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The current implementation of "insert into db1.table1 values()" creates a tmp 
table under the current database, while table1 may not be in the current 
database.

e.g.,

{noformat}
use default;
create database db1;
create table db1.table1(x int);
insert into db1.table1 values(3);
{noformat}

It will create the tmp table under the default database. If authorization is 
turned on and the current user only has access to db1 but not to the default 
database, this will cause an access failure.

We may need to rethink the approach for the implementation.







[jira] [Created] (HIVE-15275) "beeline -f " will throw NPE

2016-11-23 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15275:
---

 Summary: "beeline -f " will throw NPE 
 Key: HIVE-15275
 URL: https://issues.apache.org/jira/browse/HIVE-15275
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Execute {{"beeline -f "}} and the command will throw the following NPE 
exception.

{noformat}
2016-11-23T13:34:54,367 WARN [Thread-1] org.apache.hadoop.util.ShutdownHookManager - ShutdownHook '' failed, java.lang.NullPointerException
java.lang.NullPointerException
at org.apache.hive.beeline.BeeLine$1.run(BeeLine.java:1247) ~[hive-beeline-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.7.3.jar:?]
{noformat}
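
A guess at the pattern behind the NPE, as a standalone sketch (the field name is illustrative; the failing frame is BeeLine$1.run): with {{beeline -f}} the interactive console state may never be initialized, so the shutdown hook needs a null guard before touching it.

```java
// Illustrative stand-in for BeeLine's shutdown hook; not the real class.
public class ShutdownHookGuard {
    Object consoleHistory;   // only set in interactive mode; stays null for "beeline -f"

    // Buggy hook body: unconditional dereference.
    String closeUnsafe() {
        return consoleHistory.toString();            // NPE when run non-interactively
    }

    // Fixed hook body: guard before touching state that may never have been set up.
    String closeSafe() {
        return consoleHistory == null ? "nothing to flush" : consoleHistory.toString();
    }

    public static void main(String[] args) {
        ShutdownHookGuard b = new ShutdownHookGuard();   // simulates -f mode: field stays null
        System.out.println(b.closeSafe());
        try {
            b.closeUnsafe();
        } catch (NullPointerException e) {
            System.out.println("hook would fail: NullPointerException");
        }
    }
}
```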





[jira] [Created] (HIVE-15263) Detect the values for incorrect NULL values

2016-11-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15263:
---

 Summary: Detect the values for incorrect NULL values
 Key: HIVE-15263
 URL: https://issues.apache.org/jira/browse/HIVE-15263
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


We have seen incorrect NULL values for SD_ID in TBLS for Hive tables. The 
column itself must stay nullable, since SD_ID is legitimately NULL for Hive 
views, so the check needs to exclude views.
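
The detection rule can be stated compactly. The sketch below is a simplified stand-in; the real check would be a SQL query over TBLS, e.g. filtering out rows whose TBL_TYPE marks them as views.

```java
// Simplified model of the check; not metastore code. A NULL SD_ID is
// acceptable only for view rows.
public class NullSdIdCheck {
    static boolean isIncorrect(String tblType, Long sdId) {
        // NULL SD_ID is legitimate for views; for any other table type, flag it.
        return sdId == null && !"VIRTUAL_VIEW".equals(tblType);
    }

    public static void main(String[] args) {
        System.out.println(isIncorrect("VIRTUAL_VIEW", null));   // views have no storage descriptor
        System.out.println(isIncorrect("MANAGED_TABLE", null));  // this row should be flagged
        System.out.println(isIncorrect("MANAGED_TABLE", 42L));   // healthy row
    }
}
```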





[jira] [Created] (HIVE-15231) query on view results fails with table not found error if view is created with subquery alias (CTE).

2016-11-17 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15231:
---

 Summary: query on view results fails with table not found error if 
view is created with subquery alias (CTE).
 Key: HIVE-15231
 URL: https://issues.apache.org/jira/browse/HIVE-15231
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 1.3.0
Reporter: Aihua Xu
Assignee: Aihua Xu


HIVE-10698 fixed one issue of querying a view with a CTE, but it seems to break 
another case where an alias is given for the CTE.

{noformat}
use bugtest;
create table basetb(id int, name string);
create view testv1 as
with subtb as (select id, name from bugtest.basetb)
select id from subtb a;

use castest;
hive> explain select * from bugtest.testv1;
FAILED: SemanticException Line 2:21 Table not found 'subtb' in definition of VIEW testv1 [
with subtb as (select `basetb`.`id`, `basetb`.`name` from `bugtest`.`basetb`)
select `a`.`id` from `bugtest`.`subtb` `a`
] used as testv1 at Line 1:14
{noformat}






[jira] [Created] (HIVE-15207) Implement a capability to detect incorrect sequence numbers

2016-11-15 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15207:
---

 Summary: Implement a capability to detect incorrect sequence 
numbers
 Key: HIVE-15207
 URL: https://issues.apache.org/jira/browse/HIVE-15207
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Aihua Xu
Assignee: Aihua Xu


We have seen the next sequence number in SEQUENCE_TABLE smaller than max(id) 
for certain tables. It seems to be caused by a thread-safety issue in HMS, and 
it is not clear whether that has been fully fixed. Add a check to detect such 
corruption.
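
The invariant to check is simple. Below is a standalone sketch (assumption: the two ids would actually come from SQL against SEQUENCE_TABLE and the corresponding metastore table): an entry is healthy only if NEXT_VAL is strictly greater than the maximum id already in use.

```java
// Sketch of the proposed check; ids are faked in-memory here, but would come
// from e.g. SELECT NEXT_VAL FROM SEQUENCE_TABLE vs. SELECT MAX(TBL_ID) FROM TBLS.
public class SequenceValidator {
    // Valid only if the next value to hand out has not already been used.
    static boolean isValid(long nextVal, long maxUsedId) {
        return nextVal > maxUsedId;
    }

    public static void main(String[] args) {
        System.out.println(isValid(101, 100));  // healthy
        System.out.println(isValid(90, 100));   // corrupt: the next insert would collide
    }
}
```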





[jira] [Created] (HIVE-15206) Add a validation functionality to hiveSchemaTool

2016-11-15 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15206:
---

 Summary: Add a validation functionality to hiveSchemaTool
 Key: HIVE-15206
 URL: https://issues.apache.org/jira/browse/HIVE-15206
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


We have seen issues where the metastore gets corrupted and leaves the whole 
HiveServer unable to run.

Add support for detecting such corruption to hiveSchemaTool. Fixing the issues 
automatically could be risky, so remediation may still be deferred to the admin.





[jira] [Created] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema

2016-11-03 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15118:
---

 Summary: Remove unused 'COLUMNS' table from derby schema
 Key: HIVE-15118
 URL: https://issues.apache.org/jira/browse/HIVE-15118
 Project: Hive
  Issue Type: Improvement
  Components: Database/Schema
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu
Priority: Minor


The COLUMNS table is not used any more. The schemas for the other databases 
have already removed it; remove it from the Derby schema as well.





[jira] [Created] (HIVE-15086) Add test to cover data encryption when HMS is configured to authenticate kerberos

2016-10-27 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15086:
---

 Summary: Add test to cover data encryption when HMS is configured 
to authenticate  kerberos 
 Key: HIVE-15086
 URL: https://issues.apache.org/jira/browse/HIVE-15086
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


We are missing test coverage for the case when HMS is configured to 
authenticate with Kerberos. In that case, the communication can be encrypted.





[jira] [Created] (HIVE-15054) Hive insertion query execution fails on Hive on Spark

2016-10-25 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15054:
---

 Summary: Hive insertion query execution fails on Hive on Spark
 Key: HIVE-15054
 URL: https://issues.apache.org/jira/browse/HIVE-15054
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The query {{insert overwrite table tbl1}} sometimes fails with the following 
errors. It seems we are constructing the taskAttemptId from the partitionId 
alone, which is not unique when there are multiple attempts.

{noformat}
java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_task_tmp.-ext-10002/_tmp.002148_0 to: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_tmp.-ext-10002/002148_0
at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202)
at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
{noformat}
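
The collision can be illustrated with a standalone naming sketch (a hypothetical helper, not Hive's code): a temporary file name derived from the partition id alone is identical across attempts, while including the attempt number keeps the names unique.

```java
// Illustrative naming helper; mirrors the "_tmp.002148_0" shape from the
// error above, but is not Hive's implementation.
public class TaskTmpName {
    static String byPartitionOnly(int partitionId) {
        return String.format("_tmp.%06d_0", partitionId);          // attempt suffix hard-coded
    }
    static String byPartitionAndAttempt(int partitionId, int attempt) {
        return String.format("_tmp.%06d_%d", partitionId, attempt); // unique per attempt
    }

    public static void main(String[] args) {
        // Two attempts of partition 2148 collide under the first scheme:
        System.out.println(byPartitionOnly(2148));
        System.out.println(byPartitionAndAttempt(2148, 0));
        System.out.println(byPartitionAndAttempt(2148, 1));
    }
}
```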







[jira] [Created] (HIVE-15025) Secure-Socket-Layer (SSL) support for HMS

2016-10-20 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15025:
---

 Summary: Secure-Socket-Layer (SSL) support for HMS
 Key: HIVE-15025
 URL: https://issues.apache.org/jira/browse/HIVE-15025
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The HMS server should support SSL encryption. When the server is 
Kerberos-enabled, encryption can be enabled; but if Kerberos is not enabled, 
there is no encryption between HS2 and HMS.

Similar to HS2, we should support encryption in both cases.





[jira] [Created] (HIVE-14926) Keep Schema in consistent state where schemaTool fails or succeeds.

2016-10-11 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14926:
---

 Summary: Keep Schema in consistent state where schemaTool fails or 
succeeds.  
 Key: HIVE-14926
 URL: https://issues.apache.org/jira/browse/HIVE-14926
 Project: Hive
  Issue Type: Improvement
  Components: Database/Schema
Reporter: Aihua Xu
Assignee: Aihua Xu


SchemaTool currently uses autocommit when executing the upgrade or init 
scripts. It seems we should use a database transaction, committing or rolling 
back as a unit, to keep the schema consistent.
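
The commit-or-rollback idea can be sketched with a toy in-memory runner (the real change would wrap script execution in java.sql.Connection's setAutoCommit(false) / commit() / rollback()): a failing statement leaves nothing half-applied.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory "connection" demonstrating the transaction pattern; the real
// fix would use java.sql.Connection, not this class.
public class TransactionalScriptRunner {
    final List<String> applied = new ArrayList<>();  // committed statements
    final List<String> pending = new ArrayList<>();  // statements in the open transaction

    void execute(String stmt) {
        if (stmt.contains("BROKEN")) throw new RuntimeException("script failed: " + stmt);
        pending.add(stmt);
    }

    // Run a whole upgrade script as one transaction: all statements or none.
    boolean runScript(List<String> script) {
        pending.clear();
        try {
            for (String stmt : script) execute(stmt);
            applied.addAll(pending);                 // commit
            return true;
        } catch (RuntimeException e) {
            pending.clear();                         // rollback, schema untouched
            return false;
        }
    }

    public static void main(String[] args) {
        TransactionalScriptRunner r = new TransactionalScriptRunner();
        r.runScript(List.of("CREATE TABLE A", "BROKEN STATEMENT"));
        System.out.println(r.applied);               // empty: nothing half-applied
        r.runScript(List.of("CREATE TABLE A", "CREATE TABLE B"));
        System.out.println(r.applied);               // both statements committed together
    }
}
```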





[jira] [Created] (HIVE-14912) Fix the test failures for 2.1.1 caused by HIVE-13409

2016-10-07 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14912:
---

 Summary: Fix the test failures for 2.1.1 caused by HIVE-13409
 Key: HIVE-14912
 URL: https://issues.apache.org/jira/browse/HIVE-14912
 Project: Hive
  Issue Type: Sub-task
  Components: Test
Affects Versions: 2.1.1
Reporter: Aihua Xu
Assignee: Aihua Xu








[jira] [Created] (HIVE-14859) Improve WebUI work following up HIVE-12338/HIVE-12952

2016-09-29 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14859:
---

 Summary: Improve WebUI work following up HIVE-12338/HIVE-12952
 Key: HIVE-14859
 URL: https://issues.apache.org/jira/browse/HIVE-14859
 Project: Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Aihua Xu
Assignee: Aihua Xu


Follow up on HIVE-12338/HIVE-12952 to improve the WebUI pages in the following 
areas.

1. For the HiveServer2 summary page, organize the open queries by session: list 
the sessions, then the open queries under each session.
2. For each query's detail page, break the performance view into meaningful 
substeps. The compilation stage can probably be divided into parser, optimizer, 
etc.; for the runtime stage, it seems hard to get the status from YARN, so it 
is unclear whether we can divide it further.
3. Metrics dump: it would be better to have a visual display in addition to the 
simple dump.





[jira] [Created] (HIVE-14820) RPC server for spark inside HS2 is not getting server address properly

2016-09-22 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14820:
---

 Summary: RPC server for spark inside HS2 is not getting server 
address properly
 Key: HIVE-14820
 URL: https://issues.apache.org/jira/browse/HIVE-14820
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.1
Reporter: Aihua Xu
Assignee: Aihua Xu


When hive.spark.client.rpc.server.address is configured, the property is not 
retrieved properly: in getServerAddress() of RpcConfiguration.java we read it 
with {{String hiveHost = 
config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS);}}, which always 
returns null. It should instead be {{String hiveHost = 
config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS.varname);}}.
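
The bug is reproducible with a plain Map, since Map.get takes Object: looking up a String-keyed map with the enum constant itself compiles but always returns null. A standalone sketch (the ConfVars enum below is a simplified stand-in for HiveConf.ConfVars):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the RpcConfiguration lookup; not the real Hive classes.
public class RpcAddressLookup {
    enum ConfVars {
        SPARK_RPC_SERVER_ADDRESS("hive.spark.client.rpc.server.address");
        final String varname;    // the actual configuration key
        ConfVars(String v) { this.varname = v; }
    }

    // Buggy: Map.get(Object) accepts the enum, but an enum never equals a String key.
    static String brokenLookup(Map<String, String> config) {
        return config.get(ConfVars.SPARK_RPC_SERVER_ADDRESS);
    }

    // Fixed: look up by the enum's varname, the actual String key.
    static String fixedLookup(Map<String, String> config) {
        return config.get(ConfVars.SPARK_RPC_SERVER_ADDRESS.varname);
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("hive.spark.client.rpc.server.address", "hs2-host.example.com");
        System.out.println(brokenLookup(config)); // null
        System.out.println(fixedLookup(config));  // hs2-host.example.com
    }
}
```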







[jira] [Created] (HIVE-14805) Subquery inside a view doesn't set InsideView property correctly

2016-09-21 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14805:
---

 Summary: Subquery inside a view doesn't set InsideView property 
correctly
 Key: HIVE-14805
 URL: https://issues.apache.org/jira/browse/HIVE-14805
 Project: Hive
  Issue Type: Bug
  Components: Views
Affects Versions: 2.0.1
Reporter: Aihua Xu
Assignee: Aihua Xu


Here are the repro steps:

{noformat}
create table t1(col string);
create view v1 as select * from t1;
create view dataview as select v1.col from v1 join (select * from v1) v2 on v1.col=v2.col;
select * from dataview;
{noformat}

If Hive is configured with an authorization hook such as Sentry, the query 
requires access not only to dataview but also to v1, which should not be 
required. The subquery seems not to carry the insideView property over from the 
parent query.










[jira] [Created] (HIVE-14788) Investigate how to access permanent function with restarting HS2 if load balancer is configured

2016-09-19 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14788:
---

 Summary: Investigate how to access permanent function with 
restarting HS2 if load balancer is configured
 Key: HIVE-14788
 URL: https://issues.apache.org/jira/browse/HIVE-14788
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Aihua Xu
Assignee: Aihua Xu


When a load balancer is configured in front of multiple HS2 servers, it seems 
we need to restart each HS2 server for a new permanent function to work. Since 
the "reload function" command issued from the client to refresh the global 
registry is not targeted at a specific HS2 server, some servers may not get 
refreshed, and a ClassNotFoundException may be thrown later.

Investigate whether this is an issue and find a good solution for it.





[jira] [Created] (HIVE-14742) Hive on spark throws NPE exception for union all query

2016-09-13 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14742:
---

 Summary: Hive on spark throws NPE exception for union all query 
 Key: HIVE-14742
 URL: https://issues.apache.org/jira/browse/HIVE-14742
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu


{noformat}
create table foo (fooId string, fooData string) partitioned by (fooPartition 
string) stored as parquet;
insert into foo partition (fooPartition = '1') values ('1', '1'), ('2', '2');
set hive.execution.engine=spark;
select * from ( 
select 
fooId as myId, 
fooData as myData 
from foo where fooPartition = '1' 
union all 
select 
fooId as myId, 
fooData as myData 
from foo where fooPartition = '3' 
) allData;
{noformat}

Error while compiling statement: FAILED: NullPointerException null





[jira] [Created] (HIVE-14341) Altered skewed location is not respected for list bucketing

2016-07-26 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14341:
---

 Summary: Altered skewed location is not respected for list 
bucketing
 Key: HIVE-14341
 URL: https://issues.apache.org/jira/browse/HIVE-14341
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.0.1
Reporter: Aihua Xu
Assignee: Aihua Xu


{noformat}
CREATE TABLE list_bucket_single (key STRING, value STRING)
  SKEWED BY (key) ON (1,5,6) STORED AS DIRECTORIES;

alter table list_bucket_single set skewed location ("1"="/user/hive/warehouse/hdfs_skewed/new1");
{noformat}

However, when you insert a row with key 1, the location falls back to the 
default one.




