[jira] [Created] (HIVE-22901) Variable substitution can lead to OOM on circular references

2020-02-18 Thread Daniel Voros (Jira)
Daniel Voros created HIVE-22901:
---

 Summary: Variable substitution can lead to OOM on circular 
references
 Key: HIVE-22901
 URL: https://issues.apache.org/jira/browse/HIVE-22901
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.2
Reporter: Daniel Voros
Assignee: Daniel Voros


{{SystemVariables#substitute()}} deals with circular references between 
variables by performing the substitution at most 40 times by default. If the 
substituted part is sufficiently large though, the substitution can produce a 
string bigger than the heap size well within those 40 iterations.
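Back-of-the-envelope on why the 40-iteration guard isn't sufficient (the numbers below are hypothetical, modeled on a ~1M character value containing 10 references to itself, so each round multiplies the length by roughly 10):

```java
// Rough growth model for a circular self-reference: every round replaces each
// of the references with the full current value. Numbers are illustrative only.
public class SubstitutionGrowth {
    static long estimatedLength(long initialLength, int referencesPerValue, int rounds) {
        long len = initialLength;
        for (int i = 0; i < rounds; i++) {
            len *= referencesPerValue; // each reference expands to the whole value
        }
        return len;
    }
}
```

With ~1M characters and 10 references, three rounds already mean ~10^9 characters (several GB of char data), far before the iteration limit kicks in.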

Take the following test case that fails with OOM on current master (the third 
round of substitution would need a 10G heap, while the test runs with only 2G):

{code}
@Test
public void testSubstitute() {
  String randomPart = RandomStringUtils.random(100_000);
  String reference = "${hiveconf:myTestVariable}";

  StringBuilder longStringWithReferences = new StringBuilder();
  for (int i = 0; i < 10; i++) {
    longStringWithReferences.append(randomPart).append(reference);
  }

  SystemVariables uut = new SystemVariables();

  HiveConf conf = new HiveConf();
  conf.set("myTestVariable", longStringWithReferences.toString());

  uut.substitute(conf, longStringWithReferences.toString(), 40);
}
{code}

Produces:
{code}
java.lang.OutOfMemoryError: Java heap space

  at java.util.Arrays.copyOf(Arrays.java:3332)
  at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
  at java.lang.StringBuilder.append(StringBuilder.java:136)
  at org.apache.hadoop.hive.conf.SystemVariables.substitute(SystemVariables.java:110)
  at org.apache.hadoop.hive.conf.SystemVariablesTest.testSubstitute(SystemVariablesTest.java:27)
{code}

We should check the size of the substituted query and bail out earlier.
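A minimal sketch of such a bail-out (the method name, the variable pattern, and the size limit are made up for illustration; this is not the actual SystemVariables API):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical size-capped substitution loop: stop as soon as the expanded
// string exceeds a configured limit, instead of only after N iterations.
public class BoundedSubstitution {
    private static final Pattern VAR = Pattern.compile("\\$\\{hiveconf:(\\w+)\\}");

    static String substitute(Map<String, String> conf, String expr, int maxIterations, int maxLength) {
        String current = expr;
        for (int i = 0; i < maxIterations; i++) {
            Matcher m = VAR.matcher(current);
            StringBuffer sb = new StringBuffer();
            boolean replaced = false;
            while (m.find()) {
                String value = conf.get(m.group(1));
                if (value != null) {
                    m.appendReplacement(sb, Matcher.quoteReplacement(value));
                    replaced = true;
                } else {
                    m.appendReplacement(sb, Matcher.quoteReplacement(m.group(0)));
                }
            }
            m.appendTail(sb);
            if (sb.length() > maxLength) { // bail out before the string explodes
                throw new IllegalStateException("Substituted string exceeds " + maxLength + " characters");
            }
            current = sb.toString();
            if (!replaced) {
                break; // fixed point reached, nothing left to substitute
            }
        }
        return current;
    }
}
```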



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22501) Stats reported multiple times during MR execution for UNION queries

2019-11-15 Thread Daniel Voros (Jira)
Daniel Voros created HIVE-22501:
---

 Summary: Stats reported multiple times during MR execution for 
UNION queries
 Key: HIVE-22501
 URL: https://issues.apache.org/jira/browse/HIVE-22501
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Voros
Assignee: Daniel Voros


Take the following example:
{code}
set hive.execution.engine=mr;

create table tb(id string) stored as orc;
insert into tb values('1');
create table tb2 like tb stored as orc;

insert into tb2 select * from tb union all select * from tb;
{code}

The last insert results in 2 records in the table, but the 
{{TOTAL_TABLE_ROWS_WRITTEN}} statistic (and the number of affected rows 
reported on the console) is 4.

We seem to traverse the operator graph multiple times, starting from every TS 
(TableScan) operator, and increment the counters every time we hit the FS 
(FileSink) operator. UNION-ing the table 3 times results in a 
TOTAL_TABLE_ROWS_WRITTEN of 9.
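The multiple counting described above can be illustrated with a toy traversal (hypothetical graph code, not Hive's actual operator classes): walking to a shared sink once per root counts the sink once per root.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Minimal DAG walk: if the same FS-like sink is reachable from N TS-like
// roots and each root starts its own traversal, the sink is counted N times.
public class DoubleCountingDemo {
    static int visitsToSink(Map<String, List<String>> edges, List<String> roots, String sink) {
        int counter = 0;
        for (String root : roots) {               // one traversal per root
            Deque<String> stack = new ArrayDeque<>();
            stack.push(root);
            while (!stack.isEmpty()) {
                String node = stack.pop();
                if (node.equals(sink)) {
                    counter++;                    // "stat" incremented at the sink
                }
                for (String child : edges.getOrDefault(node, List.of())) {
                    stack.push(child);
                }
            }
        }
        return counter;
    }
}
```

With two roots feeding one UNION into one sink, the single row written per branch is reported twice.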





[jira] [Created] (HIVE-21724) Nested ARRAY and STRUCT inside MAP don't work with LazySimpleDeserializeRead

2019-05-13 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-21724:
---

 Summary: Nested ARRAY and STRUCT inside MAP don't work with 
LazySimpleDeserializeRead
 Key: HIVE-21724
 URL: https://issues.apache.org/jira/browse/HIVE-21724
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 3.1.1
Reporter: Daniel Voros
Assignee: Daniel Voros


The logic during vectorized execution that keeps track of how deep we are in 
the nested structure doesn't work for ARRAYs and STRUCTs embedded inside maps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21034) Add option to schematool to drop Hive databases

2018-12-12 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-21034:
---

 Summary: Add option to schematool to drop Hive databases
 Key: HIVE-21034
 URL: https://issues.apache.org/jira/browse/HIVE-21034
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Voros
Assignee: Daniel Voros


An option to remove all Hive-managed data could be a useful addition to 
{{schematool}}.

I propose introducing a new flag {{-dropAllDatabases}} that would *drop all 
databases with CASCADE* to remove all data of managed tables.
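As a sketch of what the flag could do (all names here are hypothetical; note that {{default}} cannot be dropped in Hive, so its tables would need separate handling):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the statements a -dropAllDatabases flag would
// issue after enumerating databases. "default" is skipped since Hive refuses
// to drop it; its tables would need a separate cleanup pass.
public class DropAllDatabasesSketch {
    static List<String> dropStatements(List<String> databases) {
        List<String> stmts = new ArrayList<>();
        for (String db : databases) {
            if (!"default".equals(db)) {
                stmts.add("DROP DATABASE `" + db + "` CASCADE");
            }
        }
        return stmts;
    }
}
```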





[jira] [Created] (HIVE-20586) Beeline is asking for user/pass when invoked without -u

2018-09-18 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20586:
---

 Summary: Beeline is asking for user/pass when invoked without -u
 Key: HIVE-20586
 URL: https://issues.apache.org/jira/browse/HIVE-20586
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.1.0, 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


Since HIVE-18963 it's possible to define a default connection URL in 
beeline-site.xml, so beeline can be used without specifying the HS2 JDBC URL.

When invoked with no arguments, beeline asks for a username/password on the 
command line. When run with {{-u}} and the exact same URL as in 
beeline-site.xml, it does not ask.

I think these two invocations should behave exactly the same, given that the 
URL after {{-u}} is the same as the one in beeline-site.xml:
{code:java}
beeline -u URL
beeline
{code}





[jira] [Created] (HIVE-20231) Backport HIVE-19981 to branch-3

2018-07-24 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20231:
---

 Summary: Backport HIVE-19981 to branch-3
 Key: HIVE-20231
 URL: https://issues.apache.org/jira/browse/HIVE-20231
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-20191) PreCommit patch application doesn't fail if patch is empty

2018-07-17 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20191:
---

 Summary: PreCommit patch application doesn't fail if patch is empty
 Key: HIVE-20191
 URL: https://issues.apache.org/jira/browse/HIVE-20191
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Daniel Voros
Assignee: Daniel Voros


I've created some backport tickets for branch-3 (e.g. HIVE-20181) and made the 
mistake of uploading the patch files with the wrong filename ({{.}} instead of 
{{-}} between the version and the branch).

These get applied on master, where the changes are already present, since 
{{git apply}} with {{-3}} won't fail if the patch has already been applied. 
Tests are then run on master instead of failing.

I think patch application should fail if the patch applies no changes, and the 
branch selection logic should probably fail too if the patch name is malformed.
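The malformed-name check could be a simple pattern match. The regex below is an assumption about the {{HIVE-NNNNN.N.branch-X.patch}} naming convention, not ptest's actual logic:

```java
import java.util.regex.Pattern;

// Assumed naming convention: HIVE-20185.1.branch-3.patch for branch patches,
// HIVE-20185.1.patch for master; anything else is treated as malformed.
public class PatchNameCheck {
    private static final Pattern VALID =
        Pattern.compile("HIVE-\\d+\\.\\d+(\\.branch-[\\w.]+)?\\.patch");

    static boolean isWellFormed(String name) {
        return VALID.matcher(name).matches();
    }
}
```

A name with {{.}} instead of {{-}} between "branch" and the version would be rejected instead of silently falling through to master.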





[jira] [Created] (HIVE-20185) Backport HIVE-20111 to branch-3

2018-07-16 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20185:
---

 Summary: Backport HIVE-20111 to branch-3
 Key: HIVE-20185
 URL: https://issues.apache.org/jira/browse/HIVE-20185
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros
 Attachments: HIVE-20185.1.branch-3.patch







[jira] [Created] (HIVE-20184) Backport HIVE-20085 to branch-3

2018-07-16 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20184:
---

 Summary: Backport HIVE-20085 to branch-3
 Key: HIVE-20184
 URL: https://issues.apache.org/jira/browse/HIVE-20184
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-20182) Backport HIVE-20067 to branch-3

2018-07-16 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20182:
---

 Summary: Backport HIVE-20067 to branch-3
 Key: HIVE-20182
 URL: https://issues.apache.org/jira/browse/HIVE-20182
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-20181) Backport HIVE-20045 to branch-3

2018-07-16 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20181:
---

 Summary: Backport HIVE-20045 to branch-3
 Key: HIVE-20181
 URL: https://issues.apache.org/jira/browse/HIVE-20181
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-20180) Backport HIVE-19759 to branch-3

2018-07-16 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20180:
---

 Summary: Backport HIVE-19759 to branch-3
 Key: HIVE-20180
 URL: https://issues.apache.org/jira/browse/HIVE-20180
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-20066) hive.load.data.owner is compared to full principal

2018-07-03 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20066:
---

 Summary: hive.load.data.owner is compared to full principal
 Key: HIVE-20066
 URL: https://issues.apache.org/jira/browse/HIVE-20066
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Voros
Assignee: Daniel Voros


HIVE-19928 compares the user running HS2 to the configured owner 
({{hive.load.data.owner}}) to check whether we're able to move the file with 
LOAD DATA or need to copy it.

This check compares the full username (which may contain the full Kerberos 
principal) to {{hive.load.data.owner}}. We should compare against the short 
username ({{UGI.getShortUserName()}}) instead; that's what's used in a similar 
context 
[here|https://github.com/apache/hive/blob/f519db7eafacb4b4d2d9fe2a9e10e908d8077224/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L398].
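To illustrate the difference, here's a simplified stand-in for {{UGI.getShortUserName()}} (the real implementation also applies hadoop.security.auth_to_local rewrite rules, so this is only a sketch):

```java
// Simplified illustration: strip the host and realm from a Kerberos principal
// like "hive/host.example.com@EXAMPLE.COM" so it can be compared to a plain
// username such as the value of hive.load.data.owner.
public class ShortUserName {
    static String shortName(String principal) {
        int end = principal.length();
        int slash = principal.indexOf('/');
        int at = principal.indexOf('@');
        if (slash >= 0) end = Math.min(end, slash);
        if (at >= 0) end = Math.min(end, at);
        return principal.substring(0, end);
    }
}
```

Comparing the raw principal to a short configured owner name would never match; comparing the short forms does.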

cc [~djaiswal]





[jira] [Created] (HIVE-20022) Upgrade hadoop.version to 3.1.1

2018-06-28 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-20022:
---

 Summary: Upgrade hadoop.version to 3.1.1
 Key: HIVE-20022
 URL: https://issues.apache.org/jira/browse/HIVE-20022
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


HIVE-19304 relies on YARN-7142 and YARN-8122, which will only be released in 
Hadoop 3.1.1. We should upgrade when possible.

cc [~gsaha]





[jira] [Created] (HIVE-19979) Backport HIVE-19304 to branch-3

2018-06-25 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-19979:
---

 Summary: Backport HIVE-19304 to branch-3
 Key: HIVE-19979
 URL: https://issues.apache.org/jira/browse/HIVE-19979
 Project: Hive
  Issue Type: Task
Reporter: Daniel Voros
Assignee: Daniel Voros


Needs HIVE-19978 (backport of HIVE-18037) to land first.





[jira] [Created] (HIVE-19978) Backport HIVE-18037 to branch-3

2018-06-25 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-19978:
---

 Summary: Backport HIVE-18037 to branch-3
 Key: HIVE-19978
 URL: https://issues.apache.org/jira/browse/HIVE-19978
 Project: Hive
  Issue Type: Task
Reporter: Daniel Voros
Assignee: Daniel Voros








[jira] [Created] (HIVE-19728) beeline with USE_BEELINE_FOR_HIVE_CLI fails when trying to set hive.aux.jars.path

2018-05-29 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-19728:
---

 Summary: beeline with USE_BEELINE_FOR_HIVE_CLI fails when trying 
to set hive.aux.jars.path
 Key: HIVE-19728
 URL: https://issues.apache.org/jira/browse/HIVE-19728
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


Since HIVE-19385 it's possible to redirect {{bin/hive}} to beeline. This is 
not working as expected though, because {{bin/hive}} sets 
{{hive.aux.jars.path}}, which leads to the following error:

{code}
$ USE_BEELINE_FOR_HIVE_CLI=true hive

...
Error: Could not open client transport for any of the Server URI's in ZooKeeper: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify hive.aux.jars.path at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
Beeline version 3.0.0 by Apache Hive
beeline> 
{code}

We already avoid setting {{hive.aux.jars.path}} when running the {{beeline}} 
service, but the USE_BEELINE_FOR_HIVE_CLI override happens after that.

I'd suggest checking the value of USE_BEELINE_FOR_HIVE_CLI right after the 
service to run (cli/beeline/...) has been selected, and overriding 
cli->beeline there.





[jira] [Created] (HIVE-18858) System properties in job configuration not resolved when submitting MR job

2018-03-05 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-18858:
---

 Summary: System properties in job configuration not resolved when 
submitting MR job
 Key: HIVE-18858
 URL: https://issues.apache.org/jira/browse/HIVE-18858
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
 Environment: Hadoop 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


Since [this Hadoop 
commit|https://github.com/apache/hadoop/commit/5eb7dbe9b31a45f57f2e1623aa1c9ce84a56c4d1], 
first released in 3.0.0, Configuration has a restricted mode that disables the 
resolution of system properties (which normally happens when retrieving a 
configuration option).

This leads to test failures when switching to Hadoop 3.0.0 (instead of 
3.0.0-beta1), since we rely on the [substitution of 
test.tmp.dir|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/data/conf/hive-site.xml#L37] 
during the [maven 
build|https://github.com/apache/hive/blob/05d4719eefc56676a3e0e8f706e1c5e5e1f6b345/pom.xml#L83]. 
See the test results on HIVE-18327.

When we're passing job configurations to Hadoop, I believe there's no way to 
disable the restricted mode, since we go through some Hadoop MR calls first; 
see this stack:

{code}
"HiveServer2-Background-Pool: Thread-105@9500" prio=5 tid=0x69 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
  at org.apache.hadoop.conf.Configuration.addResourceObject(Configuration.java:970)
  - locked <0x2fe6> (a org.apache.hadoop.mapred.JobConf)
  at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:895)
  at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:476)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:162)
  at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:788)
  at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:254)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
  at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
  at java.security.AccessController.doPrivileged(AccessController.java:-1)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
  at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
  at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
  at java.security.AccessController.doPrivileged(AccessController.java:-1)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
  at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:415)
  at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2314)
  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1985)
  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1687)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1438)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1432)
  at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
  at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:90)
  at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
  at java.security.AccessController.doPrivileged(AccessController.java:-1)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
  at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:353)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
{code}

I suggest resolving all variables before passing the configuration to Hadoop 
in ExecDriver.
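A plain-Java sketch of the proposed up-front resolution (the helper below is hypothetical; Hadoop's Configuration normally performs this expansion itself on {{get()}}, which is exactly what restricted mode disables):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Expand ${some.system.property} in every value once, up front, so the result
// no longer depends on whether the receiving Configuration is restricted.
public class ResolveBeforeSubmit {
    private static final Pattern SYS_PROP = Pattern.compile("\\$\\{([\\w.]+)\\}");

    static Map<String, String> resolveAll(Map<String, String> conf) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            Matcher m = SYS_PROP.matcher(e.getValue());
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                String sys = System.getProperty(m.group(1));
                // leave unknown placeholders untouched
                m.appendReplacement(sb, Matcher.quoteReplacement(sys != null ? sys : m.group(0)));
            }
            m.appendTail(sb);
            resolved.put(e.getKey(), sb.toString());
        }
        return resolved;
    }
}
```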





[jira] [Created] (HIVE-18784) TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport mode instead of binary

2018-02-23 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-18784:
---

 Summary: TestJdbcWithMiniKdcSQLAuthBinary runs with HTTP transport 
mode instead of binary
 Key: HIVE-18784
 URL: https://issues.apache.org/jira/browse/HIVE-18784
 Project: Hive
  Issue Type: Test
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


TestJdbcWithMiniKdcSQLAuthHttp should use the HTTP transport and 
TestJdbcWithMiniKdcSQLAuthBinary the binary one, but currently both use HTTP.





[jira] [Created] (HIVE-18646) Update errata.txt for HIVE-18617

2018-02-07 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-18646:
---

 Summary: Update errata.txt for HIVE-18617
 Key: HIVE-18646
 URL: https://issues.apache.org/jira/browse/HIVE-18646
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Daniel Voros
Assignee: Daniel Voros


HIVE-18617 was committed as HIVE-18671.





[jira] [Created] (HIVE-18091) Failing tests of itests/qtest-spark and itests/hive-unit on branch-1

2017-11-17 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-18091:
---

 Summary: Failing tests of itests/qtest-spark and itests/hive-unit 
on branch-1
 Key: HIVE-18091
 URL: https://issues.apache.org/jira/browse/HIVE-18091
 Project: Hive
  Issue Type: Bug
  Components: Test, Testing Infrastructure
Reporter: Daniel Voros
Assignee: Daniel Voros


Seen this while looking at ptest results for HIVE-17947, but it's probably an 
older issue.

Tests under itests/qtest-spark and itests/hive-unit fail when trying to 
execute the download-spark plugin with:

{code}
[INFO]
[INFO] 
[INFO] Building Hive Integration - Unit Tests 1.3.0-SNAPSHOT
[INFO] 
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-it-unit ---
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-it-unit ---
[INFO] Executing tasks

main:
 [exec] + /bin/pwd
 [exec] + BASE_DIR=./target
 [exec] + HIVE_ROOT=./target/../../../
 [exec] + DOWNLOAD_DIR=./../thirdparty
 [exec] + mkdir -p ./../thirdparty
 [exec] /home/hiveptest/35.192.99.254-hiveptest-0/apache-github-branch-1-source/itests/hive-unit
 [exec] + download http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz spark
 [exec] + url=http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + finalName=spark
 [exec] ++ basename http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + tarName=spark-1.5.0-bin-hadoop2-without-hive.tgz
 [exec] + rm -rf ./target/spark
 [exec] + [[ ! -f ./../thirdparty/spark-1.5.0-bin-hadoop2-without-hive.tgz ]]
 [exec] + tar -zxf ./../thirdparty/spark-1.5.0-bin-hadoop2-without-hive.tgz -C ./target
 [exec] + mv ./target/spark-1.5.0-bin-hadoop2-without-hive ./target/spark
 [exec] + cp -f ./target/../../..//data/conf/spark/log4j.properties ./target/spark/conf/
 [exec] + sed '/package /d' /data/hiveptest/working/apache-github-branch-1-source/itests/../contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleAdd.java
 [exec] sed: can't read /data/hiveptest/working/apache-github-branch-1-source/itests/../contrib/src/java/org/apache/hadoop/hive/contrib/udf/example/UDFExampleAdd.java: No such file or directory
 [exec] + javac -cp /data/hiveptest/working/maven/org/apache/hive/hive-exec/1.3.0-SNAPSHOT/hive-exec-1.3.0-SNAPSHOT.jar /tmp/UDFExampleAdd.java -d /tmp
 [exec] + jar -cf /tmp/udfexampleadd-1.0.jar -C /tmp UDFExampleAdd.class
 [exec] /tmp/UDFExampleAdd.class : no such file or directory
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 8.376s
[INFO] Finished at: Mon Nov 06 22:29:39 UTC 2017
[INFO] Final Memory: 18M/241M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (download-spark) on project hive-it-unit: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part .. @ 4:141 in /home/hiveptest/35.192.99.254-hiveptest-0/apache-github-branch-1-source/itests/hive-unit/target/antrun/build-main.xml
{code}

{{mvn antrun:run@download-spark}} passes when run locally, so I guess it might 
be an issue with the way we're executing ptests.

Full list of classes with the same error:

{code}
TestAcidOnTez
TestAdminUser
TestAuthorizationPreEventListener
TestAuthzApiEmbedAuthorizerInEmbed
TestAuthzApiEmbedAuthorizerInRemote
TestBeeLineWithArgs
TestCLIAuthzSessionContext
TestClearDanglingScratchDir
TestClientSideAuthorizationProvider
TestCompactor
TestCreateUdfEntities
TestCustomAuthentication
TestDBTokenStore
TestDDLWithRemoteMetastoreSecondNamenode
TestDynamicSerDe
TestEmbeddedHiveMetaStore
TestEmbeddedThriftBinaryCLIService
TestFilterHooks
TestFolderPermissions
TestHS2AuthzContext
TestHS2AuthzSessionContext
TestHS2ClearDanglingScratchDir
TestHS2ImpersonationWithRemoteMS
TestHiveAuthorizerCheckInvocation
TestHiveAuthorizerShowFilters
TestHiveHistory
TestHiveMetaStoreTxns
TestHiveMetaStoreWithEnvironmentContext
TestHiveMetaTool
TestHiveServer2
TestHiveServer2SessionTimeout
TestHiveSessionImpl
TestHs2Hooks
TestJdbcDriver2
TestJdbcMetadataApiAuth
TestJdbcWithLocalClusterSpark
TestJdbcWithMiniHS2
TestJdbcWithMiniMr
TestJdbcWithSQLAuthUDFBlacklist
TestJdbcWithSQLAuthorization
TestLocationQueries
TestMTQueries
TestMarkPartition
TestMarkPartitionRemote
TestMetaStoreAuthorization
TestMetaStoreConnectionUrlHook
TestMetaStoreEndFunctionListener
TestMetaStoreEventListener
TestMetaStoreEventListenerOnlyOnCom

[jira] [Created] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1

2017-10-31 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-17947:
---

 Summary: Concurrent inserts might fail for ACID table since 
HIVE-17526 on branch-1
 Key: HIVE-17947
 URL: https://issues.apache.org/jira/browse/HIVE-17947
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.3.0
Reporter: Daniel Voros
Assignee: Daniel Voros
Priority: Blocker


HIVE-17526 (branch-1 only) disabled conversion to ACID if there are *_copy_N 
files under the table, but the filesystem checks introduced there run for 
every insert, since the MoveTask at the end of the insert will eventually call 
alterTable.

The filename check also recurses into staging directories created by other 
inserts. If those are removed while the files are being listed, it leads to 
the following exception and a failing insert:

{code}
java.io.FileNotFoundException: File hdfs://mycluster/apps/hive/warehouse/dvoros.db/concurrent_insert/.hive-staging_hive_2017-10-30_13-23-35_056_2844419018556002410-2/-ext-10001 does not exist.
  at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1081) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1018) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.fs.FileSystem$6.handleFileStat(FileSystem.java:1864) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.fs.FileSystem$6.hasNext(FileSystem.java:1841) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
  at org.apache.hadoop.hive.metastore.TransactionalValidationListener.containsCopyNFiles(TransactionalValidationListener.java:226) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.TransactionalValidationListener.handleAlterTableTransactionalProp(TransactionalValidationListener.java:104) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.TransactionalValidationListener.handle(TransactionalValidationListener.java:63) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.TransactionalValidationListener.onEvent(TransactionalValidationListener.java:55) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:2478) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4145) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4117) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at sun.reflect.GeneratedMethodAccessor107.invoke(Unknown Source) ~[?:?]
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
  at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) [?:?]
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:299) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
  at sun.reflect.GeneratedMethodAccessor87.invoke(Unknown Source) ~[?:?]
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
  at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
  at org.apache.hadoop.hi
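One way to harden the recursive check, sketched with java.nio instead of the HDFS API (the staging-prefix skip and the exception handling are assumptions for illustration, not Hive's actual code):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Recursively collect file names, skipping transient staging directories and
// tolerating directories that disappear while the listing is in progress.
public class ResilientListing {
    static List<String> listFiles(Path root) throws IOException {
        List<String> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(root)) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (name.startsWith(".hive-staging")) {
                    continue; // temp dirs of concurrent inserts: don't recurse
                }
                if (Files.isDirectory(p)) {
                    try {
                        files.addAll(listFiles(p));
                    } catch (NoSuchFileException gone) {
                        // directory was removed concurrently; ignore it
                    }
                } else {
                    files.add(name);
                }
            }
        }
        return files;
    }
}
```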

[jira] [Created] (HIVE-17526) Disable conversion to ACID if table has _copy_N files on branch-1

2017-09-13 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-17526:
---

 Summary: Disable conversion to ACID if table has _copy_N files on 
branch-1
 Key: HIVE-17526
 URL: https://issues.apache.org/jira/browse/HIVE-17526
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Voros
Assignee: Daniel Voros
 Fix For: 1.3.0


As discussed in HIVE-16177, non-ACID to ACID conversion can lead to data loss 
if the table has *_copy_N files.

The patch for HIVE-16177 is quite massive and would basically need to be 
reimplemented for branch-1, since the related code paths have diverged a lot. 
Instead, we could disable the conversion to ACID if there are *_copy_N files.
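The detection itself could be a filename pattern match; the pattern below is an assumption based on how Hive names duplicate files (e.g. {{000000_0_copy_1}}), not the actual branch-1 implementation:

```java
import java.util.regex.Pattern;

// Duplicate files created by repeated INSERTs look like 000000_0_copy_1;
// a table containing any such file would refuse the ACID conversion.
public class CopyNFileCheck {
    private static final Pattern COPY_N = Pattern.compile(".*_copy_\\d+");

    static boolean isCopyNFile(String fileName) {
        return COPY_N.matcher(fileName).matches();
    }
}
```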



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-15833) Add unit tests for org.json usage on branch-1

2017-02-07 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-15833:
---

 Summary: Add unit tests for org.json usage on branch-1
 Key: HIVE-15833
 URL: https://issues.apache.org/jira/browse/HIVE-15833
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Voros
Assignee: Daniel Voros


Before switching implementation, we should add some tests that capture the 
current behavior.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-15834) Add unit tests for org.json usage on master

2017-02-07 Thread Daniel Voros (JIRA)
Daniel Voros created HIVE-15834:
---

 Summary: Add unit tests for org.json usage on master
 Key: HIVE-15834
 URL: https://issues.apache.org/jira/browse/HIVE-15834
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Voros
Assignee: Daniel Voros


Before switching implementation, we should add some tests that capture the 
current behavior.


