[jira] [Created] (HIVE-22715) SkewJoin throws NullPointException.

2020-01-09 Thread LuGuangMing (Jira)
LuGuangMing created HIVE-22715:
--

 Summary: SkewJoin throws NullPointException.
 Key: HIVE-22715
 URL: https://issues.apache.org/jira/browse/HIVE-22715
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
Reporter: LuGuangMing
 Attachments: sample.sql

When hive.optimize.skewjoin=true and hive.auto.convert.join=true, a 
NullPointerException is thrown.
A SQL file that reproduces the issue is attached.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22714) TestScheduledQueryService is flaky

2020-01-09 Thread Jason Dere (Jira)
Jason Dere created HIVE-22714:
-

 Summary: TestScheduledQueryService is flaky
 Key: HIVE-22714
 URL: https://issues.apache.org/jira/browse/HIVE-22714
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Jason Dere


{noformat}
[ERROR] Failures: 
[ERROR]   TestScheduledQueryService.testScheduledQueryExecution:152 
Expected: <5>
 but: was <0>
[INFO] 
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
{noformat}

Looks like sometimes we are not waiting long enough for the INSERT query to 
complete and the SELECT runs before it finishes:
{noformat}
$ egrep "insert|select" target/surefire-reports/org.apache.hadoop.hive.ql.schq.TestScheduledQueryService-output.txt | grep HOOK
PREHOOK: query: insert into tu values(1),(2),(3),(4),(5)
2020-01-09T14:49:09,497  INFO [SchQ 0] SessionState: PREHOOK: query: insert into tu values(1),(2),(3),(4),(5)
PREHOOK: query: select 1 from tu
2020-01-09T14:49:11,452  INFO [main] SessionState: PREHOOK: query: select 1 from tu
POSTHOOK: query: select 1 from tu
2020-01-09T14:49:11,452  INFO [main] SessionState: POSTHOOK: query: select 1 from tu
POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5)
2020-01-09T14:49:12,062  INFO [SchQ 0] SessionState: POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5)
{noformat}
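
One common way to make a test like this less timing-sensitive is to poll for the expected state with a deadline instead of waiting a fixed interval. The following is only a minimal sketch of that idea, not the actual test code: the class `PollingWait`, the method `waitFor`, and the counter standing in for "the INSERT has finished" are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hive or of TestScheduledQueryService.
final class PollingWait {

  /**
   * Polls the condition until it holds or the timeout expires.
   * Returns true if the condition became true in time, false otherwise
   * (including when the waiting thread is interrupted).
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for "the INSERT has committed": becomes true on the third poll.
    AtomicInteger polls = new AtomicInteger();
    boolean done = waitFor(() -> polls.incrementAndGet() >= 3, 5_000, 10);
    System.out.println("condition met: " + done + " after " + polls.get() + " polls");
  }
}
```

In a real test the condition would have to check something observable, such as the row count of `tu`, and the SELECT assertion would run only after `waitFor` returns true.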





[jira] [Created] (HIVE-22713) Constant propagation shouldn't be done for Join-Fil(*)-RS structure

2020-01-09 Thread Ramesh Kumar Thangarajan (Jira)
Ramesh Kumar Thangarajan created HIVE-22713:
---

 Summary: Constant propagation shouldn't be done for Join-Fil(*)-RS 
structure
 Key: HIVE-22713
 URL: https://issues.apache.org/jira/browse/HIVE-22713
 Project: Hive
  Issue Type: Bug
Reporter: Ramesh Kumar Thangarajan


Constant propagation shouldn't be done for the Join-Fil(*)-RS structure either. 
Since we output columns from the join when the structure is Join-Fil(*)-RS, the 
expressions shouldn't be modified.





INSERTing rows into a table with a BINARY-typed partition key

2020-01-09 Thread Rick Hillegas
I would appreciate your advice about how to INSERT data into a table 
with a BINARY-typed partition key.


It seems that hexit strings can be INSERTed into columns which aren't 
partition keys. The hexit strings look fine when you query the table. 
However, hexit strings come back as gibberish after being used as 
partition key values. The following Hive QL script shows this behavior:


USE rick;

DROP TABLE foo;

CREATE TABLE foo (a BINARY) PARTITIONED BY (b BINARY);

INSERT INTO foo PARTITION (b='FEED') VALUES ('FEED');

SELECT * FROM foo;


The result of the terminal SELECT is:

| foo.a | foo.b       |
|-------|-------------|
| FEED  | [B@1b881a6f |


Thanks,
-Rick
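
For reference, the value `[B@1b881a6f` has the shape of Java's default `Object.toString()` output for a `byte[]`: the type tag `[B`, an `@`, and the identity hash code in hex. Whether Hive produces the partition value this way is an assumption. The sketch below only shows how such a string arises in plain Java, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Hive code.
final class BinaryPartitionValue {

  // Decodes the bytes as text, which matches what the non-partition column shows.
  static String asText(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Default Object.toString() of an array: "[B", '@', then the identity hash in hex.
  static String asDefaultToString(byte[] bytes) {
    return bytes.toString();
  }

  public static void main(String[] args) {
    byte[] value = "FEED".getBytes(StandardCharsets.UTF_8);
    System.out.println(asText(value));            // FEED
    System.out.println(asDefaultToString(value)); // "[B@" followed by hex digits that vary per run
  }
}
```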



[jira] [Created] (HIVE-22712) ReExec Driver execute submit the query in default queue irrespective of user defined queue

2020-01-09 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22712:
-

 Summary: ReExec Driver execute submit the query in default queue 
irrespective of user defined queue
 Key: HIVE-22712
 URL: https://issues.apache.org/jira/browse/HIVE-22712
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 3.1.2
 Environment: Hive-3
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


We intentionally unset the queue name in 
TezSessionState#startSessionAndContainers. 

As a result, reexec creates a new session in the default queue, which causes a 
problem. It is cumbersome to add reexec.overlay.tez.queue.name at the session level.

I could not find a better way of setting the queue name (I am open to 
suggestions here), since it can create a conflict between the global queue name 
and the user-defined queue. That is why I am setting it during initialization of 
ReExecutionOverlayPlugin.





[jira] [Created] (HIVE-22711) yearValue of UDF mask() should not start from 1900

2020-01-09 Thread Quanlong Huang (Jira)
Quanlong Huang created HIVE-22711:
-

 Summary: yearValue of UDF mask() should not start from 1900
 Key: HIVE-22711
 URL: https://issues.apache.org/jira/browse/HIVE-22711
 Project: Hive
  Issue Type: Bug
Reporter: Quanlong Huang


Here's the description of the UDF mask():
{code:java}
masks the given value  
Examples:  
   mask(ccn)   
   mask(ccn, 'X', 'x', '0')
   mask(ccn, 'x', 'x', 'x')
 Arguments:
   mask(value, upperChar, lowerChar, digitChar, otherChar, numberChar, 
dayValue, monthValue, yearValue) 
 value  - value to mask. Supported types: TINYINT, SMALLINT, INT, 
BIGINT, STRING, VARCHAR, CHAR, DATE 
 upperChar  - character to replace upper-case characters with. Specify -1 
to retain original character. Default value: 'X' 
 lowerChar  - character to replace lower-case characters with. Specify -1 
to retain original character. Default value: 'x' 
 digitChar  - character to replace digit characters with. Specify -1 to 
retain original character. Default value: 'n' 
 otherChar  - character to replace all other characters with. Specify -1 to 
retain original character. Default value: -1 
 numberChar - character to replace digits in a number with. Valid values: 
0-9. Default value: '1' 
 dayValue   - value to replace day field in a date with.  Specify -1 to 
retain original value. Valid values: 1-31. Default value: 1 
 monthValue - value to replace month field in a date with. Specify -1 to 
retain original value. Valid values: 0-11. Default value: 0 
 yearValue  - value to replace year field in a date with. Specify -1 to 
retain original value. Default value: 0 {code}
Although it says 'yearValue' is the value to replace the year field in a DATE with, 
it is actually counted from 1900. E.g. yearValue = 0 means masking the year 
field to 1900, yearValue=2000 means masking the year field to 3900, and 
yearValue=-2 means masking the year field to 1898.

Here are some query examples:
{code:sql}
beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, 0);
1900-02-03
beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, 2000);
3900-02-03
beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, -2);
1898-02-03
beeline> select mask(cast('2019-02-03' as date), -1, -1, -1, -1, -1, -1, -1, -100);
1800-02-03
{code}
The drawback of this behavior is that we can't mask the year field to 1899, 
since -1 already means retaining the original value.

It'd be more intuitive to simply mask the year field to yearValue, and to only 
accept yearValue from 0 to . Still use -1 to retain the original value. 
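
The observed results match the convention of Java's deprecated `Date(int year, int month, int day)` constructors, where the year argument means "year minus 1900" and the month is 0-based. Whether mask() builds its result this way is an assumption. The sketch below only reproduces the offset in plain Java, with a hypothetical helper `maskYear` that fixes the date to February 3.

```java
import java.sql.Date;

// Hypothetical demo class, not the mask() implementation.
final class YearOffsetDemo {

  // Builds February 3 of the given "year value" using the deprecated
  // (year, month, day) constructor: year is "year minus 1900", month is 0-based.
  @SuppressWarnings("deprecation")
  static String maskYear(int yearValue) {
    return new Date(yearValue, 1, 3).toString();
  }

  public static void main(String[] args) {
    // Prints the same dates as the queries above:
    // 0 -> 1900-02-03, 2000 -> 3900-02-03, -2 -> 1898-02-03, -100 -> 1800-02-03
    for (int yearValue : new int[] {0, 2000, -2, -100}) {
      System.out.println(yearValue + " -> " + maskYear(yearValue));
    }
  }
}
```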





Re: Review Request 71761: HIVE-22489

2020-01-09 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71761/
---

(Updated Jan. 9, 2020, 8:03 a.m.)


Review request for hive, Jesús Camacho Rodríguez and Zoltan Haindrich.


Bugs: HIVE-22489
https://issues.apache.org/jira/browse/HIVE-22489


Repository: hive-git


Description
---

Reduce Sink operator orders nulls first
===
1. Set the default null sort order from the Hive config when creating the Reduce 
Sink Desc.
2. Hash join uses 
`org.apache.hadoop.hive.serde2.binarysortable.fast.BinarySortableSerializeWrite`
 or `BinarySortableDeserializeRead` for serializing keys. For bigtable keys, 
ascending order with nulls first was always hardcoded. This patch changes 
this behaviour to use `Operator.getConf().TableDesc.getProperties()` (in 
this case `MapJoinOperator`) to set up ordering in `BinarySortableSerializeWrite`.
3. Use the null ordering set in ReduceRecordSource at the Reduce phase when comparing 
keys in `CommonMergeJoinOperator`. (This is the null ordering of the child 
Reduce Sink operators.)
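
As a rough illustration of why the null ordering has to be configurable and consistent between producer and consumer, the sketch below sorts the same keys with nulls first and with nulls last using the standard `java.util.Comparator` helpers. It is not the patch's code and does not use the BinarySortable classes. `NullOrderDemo` and `sortKeys` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not Hive code.
final class NullOrderDemo {

  // Sorts keys ascending, placing nulls according to the requested null order.
  static List<Integer> sortKeys(List<Integer> keys, boolean nullsFirst) {
    Comparator<Integer> natural = Comparator.naturalOrder();
    Comparator<Integer> order = nullsFirst
        ? Comparator.nullsFirst(natural)
        : Comparator.nullsLast(natural);
    List<Integer> sorted = new ArrayList<>(keys);
    sorted.sort(order);
    return sorted;
  }

  public static void main(String[] args) {
    List<Integer> keys = Arrays.asList(3, null, 1, 2);
    System.out.println(sortKeys(keys, true));   // [null, 1, 2, 3]
    System.out.println(sortKeys(keys, false));  // [1, 2, 3, null]
  }
}
```

A merge-style join that assumes one ordering while its inputs were sorted with the other would compare keys inconsistently, which is why the reader side needs to use the same null ordering as the Reduce Sink that produced the data.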


Diffs (updated)
-

  accumulo-handler/src/test/results/positive/accumulo_queries.q.out 7c552621f2
  contrib/src/test/results/clientpositive/udaf_example_group_concat.q.out 6846720d95
  hbase-handler/src/test/results/positive/hbase_queries.q.out a32ef81a7b
  itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out e997fa65cf
  kudu-handler/src/test/results/positive/kudu_complex_queries.q.out 73fc3e514f
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonMergeJoinOperator.java 3974627a24
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordSource.java 72446afeda
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java 2380d936f2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerBigOnlyMultiKeyOperator.java f587517b08
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinInnerMultiKeyOperator.java cdee3fd957
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinLeftSemiMultiKeyOperator.java e5d9fdae19
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinOuterMultiKeyOperator.java 29c531bd51
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMap.java a4cda921a5
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMultiSet.java 43f093d906
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashSet.java 8dce5b82d3
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java a35401d9b2
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringCommon.java 1b108a8c14
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMap.java 446feb2526
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMultiSet.java c28ef9be2b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashSet.java 17bd5fda93
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4ab8902a3f
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedCreateHashTable.java 21c355cb42
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongCommon.java de1ee15c3b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMap.java 42573f0898
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashMultiSet.java 829a03737d
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedLongHashSet.java 18e1435019
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringCommon.java da0e8365b1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMap.java 6c4d8a81d1
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashMultiSet.java a6b754c7eb
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashSet.java fdcd83dde7
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java 5c409e4573
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/CountDistinctRewriteProc.java a50ad78e8f
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java 0f95d7788c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkMapJoinProc.java 89b55001f0