[jira] [Created] (HIVE-22327) Repl: Ignore read-only transactions in notification log

2019-10-10 Thread Gopal Vijayaraghavan (Jira)
Gopal Vijayaraghavan created HIVE-22327:
---

 Summary: Repl: Ignore read-only transactions in notification log
 Key: HIVE-22327
 URL: https://issues.apache.org/jira/browse/HIVE-22327
 Project: Hive
  Issue Type: Improvement
  Components: repl
Reporter: Gopal Vijayaraghavan


Read txns need not be replicated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22326) StreamingV2: Fail streaming ingests if columns with default constraints are not provided

2019-10-10 Thread Gopal Vijayaraghavan (Jira)
Gopal Vijayaraghavan created HIVE-22326:
---

 Summary: StreamingV2: Fail streaming ingests if columns with 
default constraints are not provided
 Key: HIVE-22326
 URL: https://issues.apache.org/jira/browse/HIVE-22326
 Project: Hive
  Issue Type: Bug
Reporter: Gopal Vijayaraghavan


If a column has a default constraint, StreamingV2 does not run the 
corresponding UDF (and in some cases cannot run one, e.g. SURROGATE_KEY).

Fail visibly in that scenario, rather than allowing the DEFAULT to be 
ignored.
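The proposed fail-fast behavior could look roughly like this (a hypothetical sketch: the class and method names are invented for illustration and are not the StreamingV2 API):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the proposed check: reject an ingest up front when a
// column carrying a DEFAULT constraint is not among the provided columns,
// instead of silently ignoring the DEFAULT.
class DefaultConstraintCheck {
    static void validate(Set<String> providedColumns, Set<String> columnsWithDefaults) {
        for (String col : columnsWithDefaults) {
            if (!providedColumns.contains(col)) {
                throw new IllegalArgumentException(
                        "Column '" + col + "' has a DEFAULT constraint but is not provided");
            }
        }
    }

    public static void main(String[] args) {
        Set<String> provided = new HashSet<>(Arrays.asList("id", "name"));
        Set<String> defaults = new HashSet<>(Arrays.asList("name", "created_ts"));
        try {
            validate(provided, defaults);
        } catch (IllegalArgumentException e) {
            // created_ts is missing, so the ingest fails visibly here
            System.out.println(e.getMessage());
        }
    }
}
```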



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22325) variable expansion doesn't work in beeline-site.xml

2019-10-10 Thread Allan Espinosa (Jira)
Allan Espinosa created HIVE-22325:
-

 Summary: variable expansion doesn't work in beeline-site.xml
 Key: HIVE-22325
 URL: https://issues.apache.org/jira/browse/HIVE-22325
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.1.2
Reporter: Allan Espinosa
Assignee: Allan Espinosa


I have a default JDBC connection string and I want to build on top of it to 
have customized connections, like setting custom queue names.

{code}
$ cat .beeline/beeline-site.xml
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <property>
    <name>beeline.hs2.jdbc.url.base</name>
    <value>jdbc:hive2://localhost/</value>
  </property>
  <property>
    <name>beeline.hs2.jdbc.url.myqueue</name>
    <value>${beeline.hs2.jdbc.url.base}?tez.queue.name=myqueue</value>
  </property>
</configuration>
$ beeline -c myqueue
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Error in parsing jdbc url: ${beeline.hs2.jdbc.url.base}?tez.queue.name=myqueue 
from beeline-site.xml
Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
beeline>
{code}

Relevant code is found in 
https://github.com/apache/hive/blob/master/beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParser.java#L94

Entry#getValue() skips the variable expansion. Using 
Configuration#get(key) instead would make this work.
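The difference can be illustrated with a toy sketch (not Hive or Hadoop code; the `expand` helper and class name are invented): `Entry#getValue()` returns the raw stored string, while `Configuration#get(key)` resolves `${...}` references before returning.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of Hadoop-style variable expansion. Entry#getValue() corresponds
// to the raw map lookup; Configuration#get(key) corresponds to expand().
class VarExpansionSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Recursively substitutes ${...} references using values from the map.
    static String expand(Map<String, String> props, String key) {
        String raw = props.get(key);
        if (raw == null) {
            return null;
        }
        Matcher m = VAR.matcher(raw);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String sub = expand(props, m.group(1));
            // Leave unresolved references as-is.
            m.appendReplacement(sb, Matcher.quoteReplacement(sub == null ? m.group() : sub));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("beeline.hs2.jdbc.url.base", "jdbc:hive2://localhost/");
        props.put("beeline.hs2.jdbc.url.myqueue",
                "${beeline.hs2.jdbc.url.base}?tez.queue.name=myqueue");
        // Raw value, as Entry#getValue() returns it:
        System.out.println(props.get("beeline.hs2.jdbc.url.myqueue"));
        // Expanded value, as Configuration#get(key) would return it:
        System.out.println(expand(props, "beeline.hs2.jdbc.url.myqueue"));
    }
}
```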



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22324) Checkin test output changes due to Calcite 1.21 upgrade

2019-10-10 Thread Steve Carlin (Jira)
Steve Carlin created HIVE-22324:
---

 Summary: Checkin test output changes due to Calcite 1.21 upgrade
 Key: HIVE-22324
 URL: https://issues.apache.org/jira/browse/HIVE-22324
 Project: Hive
  Issue Type: Sub-task
Reporter: Steve Carlin


On the upgrade to Calcite 1.21, CALCITE-2991 caused a change in some of the 
planner output. The initial Hive check-in for the upgrade overrode the 
RelMdMaxRowCount class to simulate the 1.19 behavior.

This task is to remove the HiveRelMdMaxRowCount class, use the new 1.21 code, 
and change the q.out files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 10, 2019, 4:09 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow the user to 
indicate that a txn is read-only and then disallow any statements that can make 
modifications in that txn. This should be a different jira.
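The idea, together with the fail-fast assertion discussed later in this thread, could be sketched as follows (purely illustrative; TxnSketch and its methods are invented names, not the patch's actual classes):

```java
// Illustrative sketch: mark a txn read-only at open time based on whether the
// parsed statement only reads, and fail fast if a read-only txn ever asks for
// a write id, so misclassified queries surface immediately.
class TxnSketch {
    enum TxnType { READ_ONLY, DEFAULT }

    private final TxnType type;
    private long nextWriteId = 1;

    TxnSketch(boolean statementOnlyReads) {
        this.type = statementOnlyReads ? TxnType.READ_ONLY : TxnType.DEFAULT;
    }

    TxnType getType() {
        return type;
    }

    long allocateWriteId() {
        if (type == TxnType.READ_ONLY) {
            // Fail fast: a read-only txn must never need a write id.
            throw new IllegalStateException("write id requested in a read-only txn");
        }
        return nextWriteId++;
    }

    public static void main(String[] args) {
        TxnSketch select = new TxnSketch(true);   // e.g. a plain SELECT
        TxnSketch insert = new TxnSketch(false);  // e.g. an INSERT or CTAS
        System.out.println(select.getType());         // READ_ONLY
        System.out.println(insert.allocateWriteId()); // 1
    }
}
```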


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/2/

Changes: https://reviews.apache.org/r/71589/diff/1-2/


Testing
---

Unit + manual test


File Attachments


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch


Thanks,

Denys Kuzmenko



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/
---

(Updated Oct. 10, 2019, 4:09 p.m.)


Review request for hive, Laszlo Pinter and Peter Vary.


Bugs: HIVE-21114
https://issues.apache.org/jira/browse/HIVE-21114


Repository: hive-git


Description
---

With HIVE-21036 we have a way to indicate that a txn is read only.
We should (at least in auto-commit mode) determine if the single stmt is a read 
and mark the txn accordingly.
Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks in 
write_set etc.

TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
TXN_OPEN) so it can read the txn type in the same SQL stmt.

HiveOperation only has QUERY, which includes Insert and Select, so this 
requires figuring out how to determine if a query is a SELECT. By the time 
Driver.openTransaction(); is called, we have already parsed the query so there 
should be a way to know if the statement only reads.

For multi-stmt txns (once these are supported) we should allow the user to 
indicate that a txn is read-only and then disallow any statements that can make 
modifications in that txn. This should be a different jira.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
cc86afedbf 
  ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 


Diff: https://reviews.apache.org/r/71589/diff/1/


Testing
---

Unit + manual test


File Attachments (updated)


HIVE-21114.1.patch
  
https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch


Thanks,

Denys Kuzmenko



[jira] [Created] (HIVE-22323) Fix Desc Table bugs

2019-10-10 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-22323:
-

 Summary: Fix Desc Table bugs
 Key: HIVE-22323
 URL: https://issues.apache.org/jira/browse/HIVE-22323
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


The DESC TABLE operation has the following bugs:
 # Whole-table descriptions have two headers.
 # The table column description has an incorrect long header, while the table 
is transposed, having the headers in the first column.
 # JSON-formatted data also contains the headers.
 # JSON-formatted data doesn't contain the column statistics.
 # There is no TestBeeLineDriver test for DESC TABLE, so the actual output is 
not tested, only some intermediary output.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22322) Remove Netty3 dependency of llap-server

2019-10-10 Thread Ivan Suller (Jira)
Ivan Suller created HIVE-22322:
--

 Summary: Remove Netty3 dependency of llap-server
 Key: HIVE-22322
 URL: https://issues.apache.org/jira/browse/HIVE-22322
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Reporter: Ivan Suller
Assignee: Ivan Suller


Llap-server depends on both Netty3 and Netty4. As Netty3 has known security 
issues fixed only in Netty4, we should eliminate any dependency on Netty3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board


> On Oct. 10, 2019, 7:46 a.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would create an assertion on generating writeId for read-only 
> > transactions (this would be useful anyway), and use the ptest to run on 
> > all of the test cases to see if this assertion breaks anything.
> > 
> > What do you think?
> 
> Denys Kuzmenko wrote:
> Hi Peter, yes, I was thinking about something similar; however, most 
> probably it would be a one-time check (won't be committed). 
> Somewhere in the Driver.compile method, after the SemanticAnalyzer, add an 
> assert that a transaction marked as ReadOnly doesn't have associated write 
> ids. This way we could make sure we do not misclassify any of the existing 
> queries. Does this make sense?
> 
> Peter Vary wrote:
> I would vote for the check to be committed for several reasons:
> - We might cause strange/flaky errors if we assume that a transaction is 
> RO, but in reality it writes something. Easier to catch this if we fail fast.
> - When introducing new commands, it would be easy to forget to update 
> this check, but if the assertion is there we will catch them at compile 
> time - again, fail fast.
> 
> Just my 2 cents

Agreed! Thank you, Peter!


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



[jira] [Created] (HIVE-22321) Setting default nulls last does not take effect when order direction is specified

2019-10-10 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-22321:
-

 Summary: Setting default nulls last does not take effect when 
order direction is specified
 Key: HIVE-22321
 URL: https://issues.apache.org/jira/browse/HIVE-22321
 Project: Hive
  Issue Type: Bug
  Components: Parser
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa


{code}
SET hive.default.nulls.last=true;
SELECT * FROM t_test ORDER BY col1 ASC;
{code}
{code}
POSTHOOK: query: SELECT * FROM t_test ORDER BY col1 ASC
POSTHOOK: type: QUERY
POSTHOOK: Input: default@t_test
#### A masked pattern was here ####
NULL
NULL
NULL
NULL
3
5
5
{code}

https://github.com/apache/hive/blob/cb83da943c8919e2ab3751244de5c2879c8fda1d/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L2510
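The expected ordering can be sketched with a nulls-last comparator (an illustration of the intended semantics, not Hive's sort implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// With hive.default.nulls.last=true, an explicit ASC should still place NULLs
// after the non-null values; the output above shows Hive emitting them first.
class NullsLastSketch {
    public static void main(String[] args) {
        List<Integer> col1 = new ArrayList<>(Arrays.asList(null, 3, null, 5, null, 5, null));
        col1.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        System.out.println(col1); // [3, 5, 5, null, null, null, null]
    }
}
```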



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22320) Cluster and fs type settings can be replaced with a single minicluster setting in CliConfigs

2019-10-10 Thread Jira
László Bodor created HIVE-22320:
---

 Summary: Cluster and fs type settings can be replaced with a 
single minicluster setting in CliConfigs
 Key: HIVE-22320
 URL: https://issues.apache.org/jira/browse/HIVE-22320
 Project: Hive
  Issue Type: Bug
Reporter: László Bodor






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > Thanks for chasing this down!
> > Really appreciate it!

Thanks a lot for the review!


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 157 (patched)
> > 
> >
> > Is this the best way to check this?
> > Does it always start with char? Or CHAR? Is anything else possible?

It always starts with "char", but you are right that it is not the best way to 
check it. I changed it to at least use the name of the CHAR serde constant.


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 181 (patched)
> > 
> >
> > I do not like this.
> > Either we only aim for space, or we aim for whitespace characters, but 
> > the check and the replace are different.

You are right, thanks for pointing this out. Since the regex will always 
replace the whitespace at the end of the string, the check whether the string 
ends with a space is not even necessary. If it doesn't end with a space, the 
regex replace will do nothing.
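This can be demonstrated with a small sketch (assuming a trailing-whitespace regex like the one under discussion; the exact pattern in the patch may differ):

```java
// A trailing-whitespace regex replace is a no-op on strings that have no
// trailing whitespace, so a separate endsWith(" ") guard is redundant.
class CharTrimSketch {
    static String stripTrailing(String s) {
        return s.replaceAll("\\s+$", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + stripTrailing("abc   ") + "]"); // [abc]
        System.out.println("[" + stripTrailing("abc") + "]");    // [abc]
    }
}
```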


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/#review218175
---


On Oct. 10, 2019, 11:39 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71606/
> ---
> 
> (Updated Oct. 10, 2019, 11:39 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The previous approach didn't solve all use cases. In this new approach the 
> Hive type is sent to the Parquet PPD part, and the value pushed to the 
> predicate is trimmed in case of the CHAR Hive type.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
>  5b051dd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> fc9188f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 033e26a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  ca5e085 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  7c7c657 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 4c40908 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71606/diff/2/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/
---

(Updated Oct. 10, 2019, 11:39 a.m.)


Review request for hive and Peter Vary.


Changes
---

Fix the issues from the review.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The previous approach didn't solve all use cases. In this new approach the Hive 
type is sent to the Parquet PPD part, and the value pushed to the predicate is 
trimmed in case of the CHAR Hive type.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
 5b051dd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
fc9188f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
033e26a 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 ca5e085 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
 0210a0a 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
 7c7c657 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
4c40908 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/71606/diff/2/

Changes: https://reviews.apache.org/r/71606/diff/1-2/


Testing
---

Added new q test for testing the PPD for char and varchar types. Also extended 
the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.


The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both 
testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so moved the 
test cases from the TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



[jira] [Created] (HIVE-22319) Repl load fails to create partition if the dump is from old version

2019-10-10 Thread mahesh kumar behera (Jira)
mahesh kumar behera created HIVE-22319:
--

 Summary: Repl load fails to create partition if the dump is from 
old version
 Key: HIVE-22319
 URL: https://issues.apache.org/jira/browse/HIVE-22319
 Project: Hive
  Issue Type: Bug
Reporter: mahesh kumar behera
Assignee: mahesh kumar behera


The engine field of column stats in the partition descriptor needs to be 
initialized. Handling also needs to be added for column stat events.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Peter Vary via Review Board


> On okt. 10, 2019, 7:46 de, Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would create an assertion on generating writeId for read-only 
> > transactions (this would be useful anyway), and use the ptest to run on 
> > all of the test cases to see if this assertion breaks anything.
> > 
> > What do you think?
> 
> Denys Kuzmenko wrote:
> Hi Peter, yes, I was thinking about something similar; however, most 
> probably it would be a one-time check (won't be committed). 
> Somewhere in the Driver.compile method, after the SemanticAnalyzer, add an 
> assert that a transaction marked as ReadOnly doesn't have associated write 
> ids. This way we could make sure we do not misclassify any of the existing 
> queries. Does this make sense?

I would vote for the check to be committed for several reasons:
- We might cause strange/flaky errors if we assume that a transaction is RO, but 
in reality it writes something. Easier to catch this if we fail fast.
- When introducing new commands, it would be easy to forget to update this 
check, but if the assertion is there we will catch them at compile time - 
again, fail fast.

Just my 2 cents


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On okt. 8, 2019, 2:27 du, Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated okt. 8, 2019, 2:27 du)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Denys Kuzmenko via Review Board


> On Oct. 10, 2019, 7:46 a.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would create an assertion on generating writeId for read-only 
> > transactions (this would be useful anyway), and use the ptest to run on 
> > all of the test cases to see if this assertion breaks anything.
> > 
> > What do you think?

Hi Peter, yes, I was thinking about something similar; however, most probably 
it would be a one-time check (won't be committed). 
Somewhere in the Driver.compile method, after the SemanticAnalyzer, add an 
assert that a transaction marked as ReadOnly doesn't have associated write ids. 
This way we could make sure we do not misclassify any of the existing queries. 
Does this make sense?


- Denys


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/#review218175
---



Thanks for chasing this down!
Really appreciate it!


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
Lines 157 (patched)


Is this the best way to check this?
Does it always start with char? Or CHAR? Is anything else possible?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
Lines 181 (patched)


I do not like this.
Either we only aim for space, or we aim for whitespace characters, but the 
check and the replace are different.


- Peter Vary


On okt. 10, 2019, 8:44 de, Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71606/
> ---
> 
> (Updated okt. 10, 2019, 8:44 de)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The previous approach didn't solve all use cases. In this new approach the 
> Hive type is sent to the Parquet PPD part, and the value pushed to the 
> predicate is trimmed in case of the CHAR Hive type.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
>  5b051dd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> fc9188f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 033e26a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  ca5e085 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  7c7c657 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 4c40908 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71606/diff/1/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/
---

Review request for hive and Peter Vary.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The previous approach didn't solve all use cases. In this new approach the Hive 
type is sent to the Parquet PPD part, and the value pushed to the predicate is 
trimmed in case of the CHAR Hive type.


Diffs
-

  
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java 5b051dd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java fc9188f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 033e26a 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java ca5e085 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java 0210a0a 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java 7c7c657 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 4c40908 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/71606/diff/1/


Testing
---

Added a new q test for the PPD of char and varchar types. Also extended the 
unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.

The TestParquetRecordReaderWrapper and TestParquetFilterPredicate classes were 
both testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so the test 
cases were moved from TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---



One more question about finding a way to identify write queries.
Otherwise, as discussed, I do not see a better way to check the type of the 
transaction :(


ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
Lines 47 (patched)


What about CREATE TABLE AS SELECT * FROM ...?
We still might miss some cases.

I would add an assertion on generating a writeId for read-only 
transactions (this would be useful anyway), and use ptest to run all of the 
test cases to see if this assertion breaks anything.

What do you think?
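The proposed assertion could look roughly like this (a minimal sketch with invented class and method names; the real check would live in the transaction manager's writeId allocation path):

```java
// Sketch of the suggested guard: a transaction opened as read-only must never
// have a writeId allocated for it, so allocation fails fast instead of
// silently opening a write path for a read.
public class ReadOnlyTxnGuard {

    enum TxnType { DEFAULT, READ_ONLY }

    static final class Txn {
        final long id;
        final TxnType type;
        Long writeId;  // null until allocated

        Txn(long id, TxnType type) { this.id = id; this.type = type; }

        long allocateWriteId(long next) {
            // The assertion discussed above: reads must never request a writeId.
            if (type == TxnType.READ_ONLY) {
                throw new IllegalStateException(
                    "writeId requested for read-only txn " + id);
            }
            if (writeId == null) writeId = next;
            return writeId;
        }
    }

    public static void main(String[] args) {
        Txn w = new Txn(1, TxnType.DEFAULT);
        System.out.println(w.allocateWriteId(42));  // 42
        try {
            new Txn(2, TxnType.READ_ONLY).allocateWriteId(43);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Running the full test suite with such a guard in place would surface any statement type (CTAS, multi-insert, etc.) that was misclassified as read-only.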


- Peter Vary


On Oct. 8, 2019, 2:27 PM, Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 PM)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow the user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
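The classification problem described in the thread, deciding whether a statement only reads, can be caricatured with a deliberately crude prefix check (illustrative names only; the actual patch inspects the parse tree, per TestParseUtils, precisely because forms like CTAS contain a SELECT but still write):

```java
import java.util.Locale;

// Deliberately crude sketch: classify a statement as read-only by its leading
// keyword. A real implementation must work on the parsed AST, because forms
// like "CREATE TABLE t AS SELECT ..." (CTAS) or multi-insert
// "FROM src INSERT OVERWRITE ..." write data even though they contain a SELECT.
public class ReadOnlyClassifier {

    static boolean looksReadOnly(String sql) {
        String s = sql.trim().toUpperCase(Locale.ROOT);
        // SELECT and EXPLAIN are reads; anything starting with CREATE, INSERT,
        // UPDATE, DELETE, MERGE, or FROM is treated as a potential write.
        return s.startsWith("SELECT") || s.startsWith("EXPLAIN");
    }

    public static void main(String[] args) {
        System.out.println(looksReadOnly("SELECT * FROM t"));                    // true
        System.out.println(looksReadOnly("CREATE TABLE t2 AS SELECT * FROM t")); // false
        System.out.println(looksReadOnly("MERGE INTO a USING b ON a.x = b.x WHEN MATCHED THEN DELETE")); // false
    }
}
```

The sketch happens to reject CTAS because it starts with CREATE, but it would misjudge constructs like views or subquery-only EXPLAIN variants, which is why the review settled on parse-tree inspection plus an assertion in the writeId path.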



[jira] [Created] (HIVE-22318) Java.io.exception:Two readers for

2019-10-10 Thread max_c (Jira)
max_c created HIVE-22318:


 Summary: Java.io.exception:Two readers for
 Key: HIVE-22318
 URL: https://issues.apache.org/jira/browse/HIVE-22318
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 3.1.0
Reporter: max_c
 Attachments: hiveserver2 for exception.log

I created an ACID table in ORC format:

{noformat}
CREATE TABLE `some.TableA` (
   
   )
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
TBLPROPERTIES (
  'bucketing_version'='2',
  'orc.compress'='SNAPPY',
  'transactional'='true',
  'transactional_properties'='default'){noformat}
After executing this MERGE INTO operation:
{noformat}
MERGE INTO some.TableA AS a USING (SELECT vend_no FROM some.TableB UNION ALL 
SELECT vend_no FROM some.TableC) AS b ON a.vend_no=b.vend_no WHEN MATCHED THEN 
DELETE
{noformat}
the problem happened (the exception also occurs when selecting from TableA):
{noformat}
java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, 
bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new 
[key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 
25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC 
Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_1,
 9223372036854775807)], old [key={originalWriteId: 4, bucket: 536870912(1.0.0), 
row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, 
reader=Hive ORC 
Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_015_026/bucket_0{noformat}
Using orc-tools I scanned all the files (bucket_0, bucket_1, bucket_2) 
under the delete_delta directory and found that they all contain the same rows. 
I think this is what produces the duplicate key (RecordIdentifier) when 
bucket_1 is scanned after bucket_0, but I don't know why all the rows are the 
same in these bucket files.
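The failure mode can be modeled in a few lines (an intentionally simplified sketch, not Hive's actual merger code): the ACID merge keeps at most one reader per (originalWriteId, bucket, rowId) key, so if two delete-delta bucket files contain identical rows, the second file registers a key the first already owns and the merge aborts with exactly this message:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the "Two readers for <key>" failure. When bucket_0 and bucket_1
// under a delete_delta hold the same rows, the second reader presents an
// already-seen key and the merge raises the reported exception.
public class TwoReadersDemo {

    record Key(long writeId, int bucket, long rowId) {}

    static final class Merger {
        private final Map<Key, String> readers = new HashMap<>();

        void register(Key key, String reader) {
            String old = readers.put(key, reader);
            if (old != null) {
                throw new IllegalStateException(
                    "Two readers for " + key + ": new " + reader + ", old " + old);
            }
        }
    }

    public static void main(String[] args) {
        Merger m = new Merger();
        m.register(new Key(4, 536870912, 2434), "bucket_0");
        try {
            // Same key again from another bucket file -> the reported exception.
            m.register(new Key(4, 536870912, 2434), "bucket_1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This matches the stack trace above, where both readers point at the same (originalWriteId: 4, bucket: 536870912, row: 2434) key from bucket_0 and bucket_1 of the same delete_delta.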

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)