[jira] [Created] (HIVE-14137) Hive on Spark throws FileAlreadyExistsException for jobs with multiple empty tables

2016-06-29 Thread Sahil Takiar (JIRA)
Sahil Takiar created HIVE-14137:
---

 Summary: Hive on Spark throws FileAlreadyExistsException for jobs 
with multiple empty tables
 Key: HIVE-14137
 URL: https://issues.apache.org/jira/browse/HIVE-14137
 Project: Hive
  Issue Type: Bug
Reporter: Sahil Takiar


The following queries:

{code}
-- Setup
drop table if exists empty1;
create table empty1 (col1 bigint) stored as parquet tblproperties 
('parquet.compress'='snappy');

drop table if exists empty2;
create table empty2 (col1 bigint, col2 bigint) stored as parquet tblproperties 
('parquet.compress'='snappy');

drop table if exists empty3;
create table empty3 (col1 bigint) stored as parquet tblproperties 
('parquet.compress'='snappy');

-- All empty HDFS directories.
-- Fails with [08S01]: Error while processing statement: FAILED: Execution
-- Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
select empty1.col1
from empty1
inner join empty2
on empty2.col1 = empty1.col1
inner join empty3
on empty3.col1 = empty2.col2;

-- Two empty HDFS directories.
-- Create an empty file in HDFS.
insert into empty1 select * from empty1 where false;

-- Same query fails with [08S01]: Error while processing statement: FAILED:
-- Execution Error, return code 3 from
-- org.apache.hadoop.hive.ql.exec.spark.SparkTask.
select empty1.col1
from empty1
inner join empty2
on empty2.col1 = empty1.col1
inner join empty3
on empty3.col1 = empty2.col2;

-- One empty HDFS directory.
-- Create an empty file in HDFS.
insert into empty2 select * from empty2 where false;

-- Same query succeeds.
select empty1.col1
from empty1
inner join empty2
on empty2.col1 = empty1.col1
inner join empty3
on empty3.col1 = empty2.col2;
{code}

Will result in the following exception:

{code}
org.apache.hadoop.fs.FileAlreadyExistsException: /tmp/hive/hive/1f3837aa-9407-4780-92b1-42a66d205139/hive_2016-06-24_15-45-23_206_79177714958655528-2/-mr-10004/0/emptyFile for client 172.26.14.151 already exists
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2784)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2676)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2561)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:593)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:111)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:393)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1902)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1738)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1663)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:405)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:401)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:401)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:344)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:920)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:901)
at parqu

[jira] [Created] (HIVE-14136) LLAP ZK SecretManager should resolve _HOST in principal

2016-06-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14136:
---

 Summary: LLAP ZK SecretManager should resolve _HOST in principal
 Key: HIVE-14136
 URL: https://issues.apache.org/jira/browse/HIVE-14136
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-14135) beeline output not formatted correctly for large column widths

2016-06-29 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-14135:
--

 Summary: beeline output not formatted correctly for large column 
widths
 Key: HIVE-14135
 URL: https://issues.apache.org/jira/browse/HIVE-14135
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 2.2.0
Reporter: Vihang Karajgaonkar
Assignee: Vihang Karajgaonkar


If one column is very wide, beeline uses that maximum column width when 
normalizing all the column widths. To reproduce the issue, run 
set -v; 

One of the configuration variables is the classpath, which can be extremely 
wide (41k characters in my environment).
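
To illustrate the effect described above, here is a minimal Java sketch. It is 
not beeline's actual code: the class ColumnWidths, both methods, and the cap of 
100 characters are hypothetical and chosen only for illustration. It contrasts 
normalizing every column to the widest one with keeping each column's own 
width up to a cap.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not beeline's implementation.
final class ColumnWidths {
    // Normalizes every column to the widest column, so a single huge value
    // (for example a 41k-character classpath) inflates all columns.
    static int[] normalizeToMax(int[] widths) {
        int max = 0;
        for (int w : widths) {
            max = Math.max(max, w);
        }
        int[] out = new int[widths.length];
        Arrays.fill(out, max);
        return out;
    }

    // One possible alternative (an assumption, not the committed fix):
    // keep each column's own width, limited by a cap.
    static int[] capEach(int[] widths, int cap) {
        int[] out = new int[widths.length];
        for (int i = 0; i < widths.length; i++) {
            out[i] = Math.min(widths[i], cap);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] widths = {10, 41000, 8};
        System.out.println(Arrays.toString(normalizeToMax(widths)));
        System.out.println(Arrays.toString(capEach(widths, 100)));
    }
}
```

With three columns of widths 10, 41000 and 8, the first method makes every 
column 41000 characters wide, which is the behaviour reported here; whether a 
cap is the right remedy is left open.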





Re: Review Request 49410: HIVE-13945 Decimal value is displayed as rounded when selecting where clause with that decimal value - no out files

2016-06-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49410/#review140074
---




ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java (line 576)


spurious



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java (line 295)


spurious



serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/DecimalTypeInfo.java (line 34)


spurious


- Sergey Shelukhin


On June 29, 2016, 11:56 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49410/
> ---
> 
> (Updated June 29, 2016, 11:56 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see JIRA
> 
> 
> Diffs
> -
> 
>   contrib/src/test/queries/clientpositive/udf_example_format.q 38069dc 
>   orc/src/java/org/apache/orc/impl/ConvertTreeReaderFactory.java 9e5f5cc 
>   orc/src/java/org/apache/orc/impl/RecordReaderImpl.java 36a802e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java e154d13 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9de1833
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 045f0ab
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 227a051
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 1fa94b9
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 6952ffb
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTBuilder.java 2e1384a
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 5b32f56
>   ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 67d8b86 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g a1909a7 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java a9e503d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 20d9649 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 239cc61
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java fc6540e 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java 3d85abd 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 24533d6
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java 795013a
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java 351b593
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 89e69be
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java e52e431
>   ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java b89c14e
>   ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 6fa3b3f
>   ql/src/test/queries/clientnegative/compare_double_bigint.q 8ee4b27 
>   ql/src/test/queries/clientpositive/decimal_divide.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/implicit_decimal.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/input_lazyserde.q 74d7a2a 
>   ql/src/test/queries/clientpositive/udf_java_method.q c0598cd 
>   ql/src/test/queries/clientpositive/udf_reflect.q 97fa817 
>   ql/src/test/queries/clientpositive/vector_struct_in.q 0e3a4ca
>   serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6415bf8
>   serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/DecimalTypeInfo.java d00f77d
>   storage-api/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 1c6be91
> 
> Diff: https://reviews.apache.org/r/49410/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Review Request 49410: HIVE-13945 Decimal value is displayed as rounded when selecting where clause with that decimal value - no out files

2016-06-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49410/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

see JIRA


Diffs
-

  contrib/src/test/queries/clientpositive/udf_example_format.q 38069dc 
  orc/src/java/org/apache/orc/impl/ConvertTreeReaderFactory.java 9e5f5cc 
  orc/src/java/org/apache/orc/impl/RecordReaderImpl.java 36a802e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java e154d13 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9de1833
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 045f0ab
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 227a051
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 1fa94b9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 6952ffb
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTBuilder.java 2e1384a
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 5b32f56 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 67d8b86 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g a1909a7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java a9e503d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 20d9649 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 239cc61 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java fc6540e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java 3d85abd 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 24533d6
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java 795013a
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java 351b593
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 89e69be
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java e52e431
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java b89c14e
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 6fa3b3f
  ql/src/test/queries/clientnegative/compare_double_bigint.q 8ee4b27 
  ql/src/test/queries/clientpositive/decimal_divide.q PRE-CREATION 
  ql/src/test/queries/clientpositive/implicit_decimal.q PRE-CREATION 
  ql/src/test/queries/clientpositive/input_lazyserde.q 74d7a2a 
  ql/src/test/queries/clientpositive/udf_java_method.q c0598cd 
  ql/src/test/queries/clientpositive/udf_reflect.q 97fa817 
  ql/src/test/queries/clientpositive/vector_struct_in.q 0e3a4ca
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6415bf8
  serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/DecimalTypeInfo.java d00f77d
  storage-api/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java 1c6be91

Diff: https://reviews.apache.org/r/49410/diff/


Testing
---


Thanks,

Sergey Shelukhin



Review Request 49408: HIVE-13945 Decimal value is displayed as rounded when selecting where clause with that decimal value.

2016-06-29 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49408/
---

Review request for hive and Ashutosh Chauhan.


Repository: hive-git


Description
---

see jira


Diffs
-

  contrib/src/test/queries/clientpositive/udf_example_format.q 38069dc 
  contrib/src/test/results/clientpositive/udf_example_format.q.out 34b10c4 
  orc/src/java/org/apache/orc/impl/ConvertTreeReaderFactory.java 9e5f5cc 
  orc/src/java/org/apache/orc/impl/RecordReaderImpl.java 36a802e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java e154d13 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9de1833
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDecimalToLong.java 045f0ab
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 227a051
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java 1fa94b9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java 6952ffb
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTBuilder.java 2e1384a
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 5b32f56 
  ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 67d8b86 
  ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g a1909a7 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java a9e503d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 20d9649 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 239cc61 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java fc6540e 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java 3d85abd 
  ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java 24533d6
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java 795013a
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLeadLag.java 351b593
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java 89e69be
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFPrintf.java e52e431
  ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java b89c14e
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFOPDivide.java 6fa3b3f
  ql/src/test/queries/clientnegative/compare_double_bigint.q 8ee4b27 
  ql/src/test/queries/clientpositive/decimal_divide.q PRE-CREATION 
  ql/src/test/queries/clientpositive/implicit_decimal.q PRE-CREATION 
  ql/src/test/queries/clientpositive/input_lazyserde.q 74d7a2a 
  ql/src/test/queries/clientpositive/udf_java_method.q c0598cd 
  ql/src/test/queries/clientpositive/udf_reflect.q 97fa817 
  ql/src/test/queries/clientpositive/vector_struct_in.q 0e3a4ca 
  ql/src/test/results/clientnegative/no_matching_udf.q.out cd3e72d 
  ql/src/test/results/clientnegative/udf_add_months_error_2.q.out 897f4f2 
  ql/src/test/results/clientnegative/udf_format_number_wrong4.q.out 6eca7f8 
  ql/src/test/results/clientnegative/udf_format_number_wrong5.q.out 6d7a3c1 
  ql/src/test/results/clientnegative/udf_trunc_error2.q.out 7d089fd 
  ql/src/test/results/clientnegative/wrong_column_type.q.out be48c4e 
  ql/src/test/results/clientpositive/annotate_stats_select.q.out a03040a 
  ql/src/test/results/clientpositive/cast1.q.out d87c04c 
  ql/src/test/results/clientpositive/decimal_6.q.out 2c2c97a 
  ql/src/test/results/clientpositive/decimal_divide.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/decimal_precision.q.out fa26248 
  ql/src/test/results/clientpositive/decimal_udf.q.out dad8663 
  ql/src/test/results/clientpositive/implicit_decimal.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_partitioned.q.out f556369 
  ql/src/test/results/clientpositive/infer_const_type.q.out 4ff8c87 
  ql/src/test/results/clientpositive/input49.q.out 2d51528 
  ql/src/test/results/clientpositive/input_lazyserde.q.out 3cf3bd2 
  ql/src/test/results/clientpositive/lineage3.q.out 12ae13e 
  ql/src/test/results/clientpositive/list_bucket_dml_13.q.out 93ebef0 
  ql/src/test/results/clientpositive/literal_double.q.out 5d46d2d 
  ql/src/test/results/clientpositive/llap/orc_ppd_basic.q.out 39438e9 
  ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 82f5b12
  ql/src/test/results/clientpositive/metadata_only_queries.q.out 9bbc9b9 
  ql/src/test/results/clientpositive/orc_predicate_pushdown.q.out 38321e9 
  ql/src/test/results/clientpositive/parquet_predicate_pushdown.q.out a9d03fc 
  ql/src/test/results/clientpositive/perf/query13.q.out c9167e7 
  ql/src/test/results/clientpositive/perf/query32.q.out b8c1468 
  ql/src/test/results/clientpositive/perf/query48.q.out 6473ad8 
  ql/src/test/results/clientpositive/perf/query58.q.out b688f5a 
  ql/src/test/results/clientpositive/perf/query65.q.out 6a6777b 
  ql/src/test/results/client

[jira] [Created] (HIVE-14134) Support automatic type conversion (or fail) when using struct_in with different types

2016-06-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-14134:
---

 Summary: Support automatic type conversion (or fail) when using 
struct_in with different types
 Key: HIVE-14134
 URL: https://issues.apache.org/jira/browse/HIVE-14134
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin


E.g. if the struct fields are int and double, this should either work or fail to compile: 

{noformat}
select * from test_4 where struct(`my_bigint`, `my_double`)
IN (struct(1L, "a", 1.5BD), ...)
{noformat}

Right now, in vectorized execution it all depends on the serialization format, 
so it doesn't work; in non-vectorized execution it also doesn't work, for some 
other reason. See HIVE-13945.





[jira] [Created] (HIVE-14133) Don't fail config validation for removed configs

2016-06-29 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-14133:
---

 Summary: Don't fail config validation for removed configs
 Key: HIVE-14133
 URL: https://issues.apache.org/jira/browse/HIVE-14133
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 2.1.0, 2.0.0
Reporter: Ashutosh Chauhan


Users may have set a config in their scripts. If we remove that config in a 
later version, the config validation code will throw an exception for scripts 
containing it. This unnecessary incompatibility can be avoided.
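
One way to picture the proposed behaviour is the following Java sketch. It is 
only an assumption about how such a check could look, not Hive's validation 
code: the class ConfigValidator, the Result enum, and the hive.example.* config 
names are all hypothetical. The idea is to track removed names separately so 
that they are ignored (ideally with a logged warning) instead of rejected.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the class and the config names are not Hive's.
final class ConfigValidator {
    enum Result { OK, REMOVED_IGNORED, UNKNOWN }

    private final Set<String> known;
    private final Set<String> removed;

    ConfigValidator(Set<String> known, Set<String> removed) {
        this.known = known;
        this.removed = removed;
    }

    // Removed names are reported separately so the caller can log a warning
    // and continue; only names that never existed are treated as errors.
    Result validate(String name) {
        if (known.contains(name)) {
            return Result.OK;
        }
        if (removed.contains(name)) {
            return Result.REMOVED_IGNORED;
        }
        return Result.UNKNOWN;
    }

    public static void main(String[] args) {
        ConfigValidator v = new ConfigValidator(
            new HashSet<>(Arrays.asList("hive.example.current")),
            new HashSet<>(Arrays.asList("hive.example.removed")));
        for (String name : Arrays.asList(
                "hive.example.current", "hive.example.removed", "hive.example.typo")) {
            System.out.println(name + " -> " + v.validate(name));
        }
    }
}
```

Keeping a separate list of removed names preserves the typo protection that 
validation gives for names that never existed, while old scripts keep running.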





[jira] [Created] (HIVE-14132) Don't fail config validation for removed configs

2016-06-29 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-14132:
---

 Summary: Don't fail config validation for removed configs
 Key: HIVE-14132
 URL: https://issues.apache.org/jira/browse/HIVE-14132
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 2.1.0, 2.0.0
Reporter: Ashutosh Chauhan


Users may have set a config in their scripts. If we remove that config in a 
later version, the config validation code will throw an exception for scripts 
containing it. This unnecessary incompatibility can be avoided.





[jira] [Created] (HIVE-14130) Performance

2016-06-29 Thread Zhu Li (JIRA)
Zhu Li created HIVE-14130:
-

 Summary: Performance 
 Key: HIVE-14130
 URL: https://issues.apache.org/jira/browse/HIVE-14130
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Zhu Li
Assignee: Zhu Li


1. In HCatalog, the lazy deserialization code in HCatRecordReader.java uses a 
method named getPosition(fieldName) to get the index of a field in a row. Each 
invocation also calls toLowerCase() on the String variable fieldName. The cost 
is trivial when the data size is small, but when it is huge, repeated 
invocations of toLowerCase() for the same set of fieldNames waste time. Storing 
the indices of the column names in the HCatRecordReader class, or storing 
lower-case fieldNames in outputSchema, will improve efficiency. 

2. HCatRecordReader.java creates a new instance of DefaultHCatRecord for every 
incoming row of data, which wastes time. Adding a private DefaultHCatRecord 
field to this class and reusing it for new rows will reduce some overhead.

3. The method serializePrimitiveField in HCatRecordSerDe.java invokes 
HCatContext.INSTANCE.getConf() repeatedly. According to JProfiler results, this 
also causes some overhead. Adding a static boolean field in 
HCatRecordSerDe.java that stores HCatContext.INSTANCE.getConf().isPresent(), 
and a static Configuration variable that stores the result of 
HCatContext.INSTANCE.getConf(), also reduces overhead.

In my test on a cluster, these modifications save about 80 seconds when 
HCatalog is used to load a table of 1 billion rows * 40 columns with various 
data types. 
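
Points 1 and 2 above can be sketched in plain Java as follows. This is a 
simplified illustration under stated assumptions, not a patch: RowProjector 
and its methods are hypothetical names, rows are modelled as Object[] rather 
than DefaultHCatRecord, and no HCatalog classes are used. Field names are 
lower-cased once when the projector is built, and a single output row is 
reused for every input row.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the per-row work of a record reader.
final class RowProjector {
    private final int[] sourcePositions; // resolved once, not per row
    private final Object[] reusedRow;    // one buffer reused for every row

    RowProjector(List<String> inputFields, List<String> outputFields) {
        // Lower-case each field name exactly once, up front.
        Map<String, Integer> byName = new HashMap<>();
        for (int i = 0; i < inputFields.size(); i++) {
            byName.put(inputFields.get(i).toLowerCase(Locale.ROOT), i);
        }
        sourcePositions = new int[outputFields.size()];
        for (int i = 0; i < outputFields.size(); i++) {
            Integer pos = byName.get(outputFields.get(i).toLowerCase(Locale.ROOT));
            if (pos == null) {
                throw new IllegalArgumentException("Unknown field: " + outputFields.get(i));
            }
            sourcePositions[i] = pos;
        }
        reusedRow = new Object[outputFields.size()];
    }

    // Per-row path: array indexing only, no string work and no allocation.
    // The returned array is overwritten by the next call.
    Object[] project(Object[] inputRow) {
        for (int i = 0; i < sourcePositions.length; i++) {
            reusedRow[i] = inputRow[sourcePositions[i]];
        }
        return reusedRow;
    }

    public static void main(String[] args) {
        RowProjector p = new RowProjector(
            Arrays.asList("ID", "Name", "Age"), Arrays.asList("age", "id"));
        System.out.println(Arrays.toString(p.project(new Object[]{1, "a", 30})));
    }
}
```

Reusing the output row is only safe if downstream code consumes or copies each 
row before the next one is read; that would need to be checked for the real 
reader.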





[jira] [Created] (HIVE-14131) Performance

2016-06-29 Thread Zhu Li (JIRA)
Zhu Li created HIVE-14131:
-

 Summary: Performance 
 Key: HIVE-14131
 URL: https://issues.apache.org/jira/browse/HIVE-14131
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Zhu Li
Assignee: Zhu Li


1. In HCatalog, the lazy deserialization code in HCatRecordReader.java uses a 
method named getPosition(fieldName) to get the index of a field in a row. Each 
invocation also calls toLowerCase() on the String variable fieldName. The cost 
is trivial when the data size is small, but when it is huge, repeated 
invocations of toLowerCase() for the same set of fieldNames waste time. Storing 
the indices of the column names in the HCatRecordReader class, or storing 
lower-case fieldNames in outputSchema, will improve efficiency. 

2. HCatRecordReader.java creates a new instance of DefaultHCatRecord for every 
incoming row of data, which wastes time. Adding a private DefaultHCatRecord 
field to this class and reusing it for new rows will reduce some overhead.

3. The method serializePrimitiveField in HCatRecordSerDe.java invokes 
HCatContext.INSTANCE.getConf() repeatedly. According to JProfiler results, this 
also causes some overhead. Adding a static boolean field in 
HCatRecordSerDe.java that stores HCatContext.INSTANCE.getConf().isPresent(), 
and a static Configuration variable that stores the result of 
HCatContext.INSTANCE.getConf(), also reduces overhead.

In my test on a cluster, these modifications save about 80 seconds when 
HCatalog is used to load a table of 1 billion rows * 40 columns with various 
data types. 





[jira] [Created] (HIVE-14129) Execute move tasks in parallel

2016-06-29 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-14129:
---

 Summary: Execute move tasks in parallel
 Key: HIVE-14129
 URL: https://issues.apache.org/jira/browse/HIVE-14129
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Ashutosh Chauhan








[jira] [Created] (HIVE-14128) Parallelize jobClose phases

2016-06-29 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-14128:
---

 Summary: Parallelize jobClose phases
 Key: HIVE-14128
 URL: https://issues.apache.org/jira/browse/HIVE-14128
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 2.1.0, 2.0.0, 1.2.0
Reporter: Ashutosh Chauhan








[jira] [Created] (HIVE-14127) Hive on spark fails to find the plugged serde jar during runtime

2016-06-29 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-14127:
---

 Summary: Hive on spark fails to find the plugged serde jar during 
runtime
 Key: HIVE-14127
 URL: https://issues.apache.org/jira/browse/HIVE-14127
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu








[jira] [Created] (HIVE-14126) With ranger enabled, partitioned columns is returned first when you execute 'select *'

2016-06-29 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-14126:
--

 Summary: With ranger enabled, partitioned columns is returned 
first when you execute 'select *'
 Key: HIVE-14126
 URL: https://issues.apache.org/jira/browse/HIVE-14126
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong








Question for reviews on review board

2016-06-29 Thread Svetozar Ivanov
Hi all, I was wondering if there is some reasonable explanation why my 
review request is not picked up by anybody? Did I miss some required 
field(s)? My review request is https://reviews.apache.org/r/43811/, any 
help would be appreciated, thanks!


Best Regards,
Svetozar Ivanov


[jira] [Created] (HIVE-14124) Spark app name should be in line with MapReduce app name when using Hive On Spark

2016-06-29 Thread Thomas Scott (JIRA)
Thomas Scott created HIVE-14124:
---

 Summary: Spark app name should be in line with MapReduce app name 
when using Hive On Spark
 Key: HIVE-14124
 URL: https://issues.apache.org/jira/browse/HIVE-14124
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Affects Versions: 1.1.0
Reporter: Thomas Scott
Priority: Minor


When using the spark execution engine, the jobs submitted to YARN are submitted 
with the name "Hive On Spark", whereas with the mr execution engine the name 
contains the query executed. This can be overridden via spark.app.name, but it 
should automatically be filled with the query executed, in line with the mr 
engine.

Example:

set hive.execution.engine=spark;
Select count(*) from sometable; 
 
-> Launched YARN Job description: Hive On Spark

set hive.execution.engine=mr;
Select count(*) from sometable; 
 
-> Launched YARN Job description: Select count(*) from sometable
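
The requested naming rule could look roughly like the Java sketch below. It is 
an assumption for illustration, not Hive's code: the class SparkAppNames, the 
appName method, and the maxLength parameter are hypothetical, and a plain Map 
stands in for the session configuration. An explicitly set spark.app.name 
still wins; otherwise the name is derived from the query text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; Hive does not necessarily build the name this way.
final class SparkAppNames {
    static String appName(Map<String, String> conf, String query, int maxLength) {
        // An explicitly configured spark.app.name takes precedence.
        String explicit = conf.get("spark.app.name");
        if (explicit != null && !explicit.isEmpty()) {
            return explicit;
        }
        // Collapse whitespace and drop one trailing semicolon.
        String q = query.trim().replaceAll("\\s+", " ");
        if (q.endsWith(";")) {
            q = q.substring(0, q.length() - 1).trim();
        }
        // Truncate long queries so the job name stays readable.
        return q.length() <= maxLength ? q : q.substring(0, maxLength);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(appName(conf, "Select count(*) from sometable; ", 100));
    }
}
```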







[jira] [Created] (HIVE-14125) Spark app name should be in line with MapReduce app name when using Hive On Spark

2016-06-29 Thread Thomas Scott (JIRA)
Thomas Scott created HIVE-14125:
---

 Summary: Spark app name should be in line with MapReduce app name 
when using Hive On Spark
 Key: HIVE-14125
 URL: https://issues.apache.org/jira/browse/HIVE-14125
 Project: Hive
  Issue Type: Improvement
  Components: Spark
Affects Versions: 1.1.0
Reporter: Thomas Scott
Priority: Minor


When using the spark execution engine, the jobs submitted to YARN are submitted 
with the name "Hive On Spark", whereas with the mr execution engine the name 
contains the query executed. This can be overridden via spark.app.name, but it 
should automatically be filled with the query executed, in line with the mr 
engine.

Example:

set hive.execution.engine=spark;
Select count(*) from sometable; 
 
-> Launched YARN Job description: Hive On Spark

set hive.execution.engine=mr;
Select count(*) from sometable; 
 
-> Launched YARN Job description: Select count(*) from sometable







Re: Review Request 49264: HIVE-14037: java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path in mapreduce

2016-06-29 Thread Aihua Xu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49264/
---

(Updated June 29, 2016, 1:23 p.m.)


Review request for hive.


Changes
---

Address comments. Don't add jar if it doesn't exist and log the error message.


Repository: hive-git


Description
---

HIVE-14037: java.lang.ClassNotFoundException for the jar in 
hive.reloadable.aux.jars.path in mapreduce


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1d1306f 
  common/src/java/org/apache/hive/common/util/HiveStringUtils.java bba14e2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 528d663 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 8a6499b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a42c2e9 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 96c826b 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java bb6a4e1 
  ql/src/test/queries/clientpositive/reloadJar.q PRE-CREATION 
  ql/src/test/results/clientpositive/reloadJar.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/49264/diff/


Testing
---


Thanks,

Aihua Xu



[jira] [Created] (HIVE-14123) Add beeline configuration option to show database in the prompt

2016-06-29 Thread Peter Vary (JIRA)
Peter Vary created HIVE-14123:
-

 Summary: Add beeline configuration option to show database in the 
prompt
 Key: HIVE-14123
 URL: https://issues.apache.org/jira/browse/HIVE-14123
 Project: Hive
  Issue Type: Improvement
  Components: Beeline, CLI
Affects Versions: 2.2.0
Reporter: Peter Vary
Assignee: Peter Vary
Priority: Minor


There are several jira issues complaining that Beeline does not respect 
hive.cli.print.current.db.

This is partially true: since HIVE-10511, Beeline in embedded mode uses 
hive.cli.print.current.db to change the prompt.

In remote mode, I think this function should use a beeline command line option 
instead, as with the showHeader option, emphasizing that this is a client-side 
option.





[jira] [Created] (HIVE-14122) Missing update to AbstractMapOperator::numRows

2016-06-29 Thread Gopal V (JIRA)
Gopal V created HIVE-14122:
--

 Summary: Missing update to AbstractMapOperator::numRows
 Key: HIVE-14122
 URL: https://issues.apache.org/jira/browse/HIVE-14122
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 2.1.0, 2.2.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Critical


The INPUT_RECORDS counter is out of sync with the actual number of rows read in 
vectorized and non-vectorized modes.

This means Tez record summaries are off by a large margin or are 0 for those 
vertices.


