[jira] [Created] (HIVE-28121) Use direct SQL for transactional altering table parameter

2024-03-15 Thread Rui Li (Jira)
Rui Li created HIVE-28121:
-

 Summary: Use direct SQL for transactional altering table parameter
 Key: HIVE-28121
 URL: https://issues.apache.org/jira/browse/HIVE-28121
 Project: Hive
  Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li


Follow-up of HIVE-26882; more details can be found in the discussions there.





[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-14 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827089#comment-17827089
 ] 

Rui Li commented on HIVE-26882:
---

[~pvary] thanks for the pointer. I think it might work. Will run our e2e test 
with it.

> Allow transactional check of Table parameter before altering the Table
> --
>
> Key: HIVE-26882
> URL: https://issues.apache.org/jira/browse/HIVE-26882
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10, 4.0.0-beta-1
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We should add the possibility to transactionally check if a Table parameter 
> is changed before altering the table in the HMS.
> This would provide an alternative, less error-prone and faster way to commit 
> an Iceberg table, as the Iceberg table currently needs to:
> - Create an exclusive lock
> - Get the table metadata to check if the current snapshot is not changed
> - Update the table metadata
> - Release the lock
> After the change these 4 HMS calls could be substituted with a single alter 
> table call.
> Also we could avoid cases where the locks are left hanging by failed processes





[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-13 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17826001#comment-17826001
 ] 

Rui Li commented on HIVE-26882:
---

Hi [~pvary], I tried writing direct SQL with JDO:
{code:Java}
// Attempt to run a direct-SQL UPDATE through JDO's single-string SQL query API
String dml = "update ...";
openTransaction();
Query query = pm.newQuery("javax.jdo.query.SQL", dml);
long numUpdated = (long) query.execute();
...
commitTransaction();
{code}
But I got an error:
{noformat}
javax.jdo.JDOUserException: JDOQL Single-String query should always start with 
SELECT
{noformat}
So it seems JDO only allows direct SELECT statements? I also tried prepending 
a SELECT to the UPDATE, but then I got another error indicating that multiple 
statements in one query string are not supported. Please let me know if I'm 
not using the correct APIs.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-12 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825557#comment-17825557
 ] 

Rui Li commented on HIVE-26882:
---

bq. I'm trying to suggest to use the direct SQL to update the metadata location 
only, and keep the other parts of the code intact. I think this would be enough 
to prevent concurrent updates of the table.
Yes, but that requires the direct SQL and the JDO operations to run in the 
same transaction, right? Otherwise the update will not be atomic. I'm not very 
familiar with JDO. Does {{PersistenceManager::newQuery}} guarantee that the 
query shares the same transaction?
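
To make the question concrete, the pattern I have in mind is roughly the 
following; this is a sketch only, and whether the direct-SQL query really 
joins the manager's current transaction is exactly the assumption to verify:
{code:java}
import javax.jdo.PersistenceManager;
import javax.jdo.Query;
import javax.jdo.Transaction;

// Sketch only: the direct-SQL query and any JDO object updates both go
// through the same PersistenceManager, in the hope that they share one
// transaction (the assumption in question).
static void runInOneTransaction(PersistenceManager pm, String selectSql) {
  Transaction tx = pm.currentTransaction();
  tx.begin();
  try {
    Query query = pm.newQuery("javax.jdo.query.SQL", selectSql);
    query.execute();   // intended to run inside tx
    // ... JDO object modifications here would also run inside tx ...
    tx.commit();
  } finally {
    if (tx.isActive()) {
      tx.rollback();   // abort everything if commit wasn't reached
    }
  }
}
{code}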






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-11 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825297#comment-17825297
 ] 

Rui Li commented on HIVE-26882:
---

bq. What do you see as an issue with that?
The issue is each alter table operation updates more than just the metadata 
location. For example, when we change iceberg table schema, JDO will update 
both the iceberg metadata location, and the HMS storage descriptor. If we use 
direct SQL, then either we follow JDO to generate all the SQL statements, or we 
allow storage descriptor to be out of sync with iceberg metadata.
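
To illustrate (a sketch using the usual HMS schema names, not the exact 
statements JDO generates), a schema-changing commit would need at least 
something like the following, where the <angle-bracket> placeholders are 
illustrative:
{code:sql}
-- Sketch only: an alter table that changes the schema cannot be reduced to a
-- single-row update. Update the metadata-location parameter...
UPDATE TABLE_PARAMS
   SET PARAM_VALUE = '<new metadata location>'
 WHERE TBL_ID = <tbl_id> AND PARAM_KEY = 'metadata_location';

-- ...and also the storage-descriptor side: column metadata lives in
-- COLUMNS_V2, keyed by the CD_ID referenced from the table's SDS row.
UPDATE COLUMNS_V2
   SET TYPE_NAME = '<new type>'
 WHERE CD_ID = <cd_id> AND COLUMN_NAME = '<column>';
{code}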


bq. The API only allows a single checked property, would it be enough to check 
the change of that?
Not sure I understand the question. You can execute multiple update statements 
in the transaction and check the affected rows for each of them. In our PoC, 
we update the current and previous metadata locations and leave all other 
fields out of sync.


bq. Would READ COMMITTED serialization level enough for this solution?
I haven't tried that, but it seems it would work.


bq. Is this a general solution which would work on all of the supported 
databases?
I only verified it for MariaDB, so I'm not sure about other databases. But I 
think it works as long as the number of affected rows can be determined reliably.

I ran a similar test with the MS SQL Server 2017 [docker 
image|https://hub.docker.com/_/microsoft-mssql-server], and, same as Postgres, 
it throws an exception for concurrent writes at REPEATABLE_READ. I didn't find 
a working docker image for Oracle.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-08 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824800#comment-17824800
 ] 

Rui Li commented on HIVE-26882:
---

I found two options for MariaDB:
# Use SERIALIZABLE level.
# Use REPEATABLE_READ and direct SQL. Concurrent writes can be detected by 
manually checking the number of updated rows. Pseudo code is in my [previous 
comment|https://issues.apache.org/jira/browse/HIVE-26882?focusedCommentId=17823671&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17823671].

In our PoC of option 2, we only update the metadata location for an Iceberg 
table, which means other fields in HMS can be out of sync with the Iceberg 
metadata. This is fine for us, because in our production HMS only serves as a 
pointer to the current metadata. But I'm afraid it may not be acceptable for 
the community.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-08 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824669#comment-17824669
 ] 

Rui Li commented on HIVE-26882:
---

In terms of performance, we tested both SERIALIZABLE and REPEATABLE_READ + 
direct SQL, and didn't see an obvious difference in throughput. In both tests, 
120 processes finished in about 1 minute and all of them succeeded.

If we only consider Iceberg tables, the throughput is already much better than 
the HMS lock solution, especially when concurrency is high. In our production, 
60 concurrent writes using the HMS lock can take over an hour to finish; most 
of the time is spent acquiring/waiting for the lock.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-08 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824659#comment-17824659
 ] 

Rui Li commented on HIVE-26882:
---

Hi [~pvary], I tested using the docker image {{mysql:5.6.17}}.

I tried both a plain SELECT and SELECT ... FOR UPDATE; the result is the same 
as in MariaDB.
{code:sql}
txn1> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

txn1> select * from tbl where `key` = 'k' and val = 'v0' for update;
Empty set (19.00 sec)

txn1> update tbl set val = 'v1' where `key` = 'k' and val = 'v0';
Query OK, 0 rows affected (0.00 sec)
Rows matched: 0  Changed: 0  Warnings: 0

txn1> commit;
Query OK, 0 rows affected (0.00 sec)

-

txn2> START TRANSACTION;
Query OK, 0 rows affected (0.00 sec)

txn2> select * from tbl where `key` = 'k' and val = 'v0' for update;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.00 sec)

txn2> update tbl set val = 'v2' where `key` = 'k' and val = 'v0';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

txn2> commit;
Query OK, 0 rows affected (0.01 sec)
{code}






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-07 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824618#comment-17824618
 ] 

Rui Li commented on HIVE-26882:
---

I tried running these SQLs in each transaction, but none of them made the 
concurrent writes fail. I also tried running {{DELETE}} in one transaction and 
{{UPDATE}} in the other, and still no failure.
{code:SQL}
select * from tbl;
update tbl set val = '..' where `key` = 'k';
{code}

{code:SQL}
select * from tbl where `key` = 'k';
update tbl set val = '..' where `key` = 'k';
{code}

{code:SQL}
select * from tbl where `key` = 'k' and val = 'v0';
update tbl set val = '..' where `key` = 'k' and val = 'v0';
{code}

{code:SQL}
select * from tbl;
update tbl set val = '..';
{code}
I also find the ANSI SQL standard not very clear about isolation semantics, so 
vendors may interpret it differently.

So I prefer to use {{SERIALIZABLE}} by default, as it's most likely to work 
with all RDBMS, and data consistency is more important than throughput here. 
We can set {{REPEATABLE_READ}} for Postgres specifically, because the behavior 
is clearly documented in Postgres and verified by test.
Allowing users to override the level is fine, in case they know the underlying 
DB better.
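
A minimal sketch of that selection logic, assuming the chosen level is handed 
to DataNucleus via its {{datanucleus.transactionIsolation}} property (the 
{{dbType}} and {{userOverride}} inputs here are hypothetical):
{code:java}
// Sketch only: default to the safest level, special-case Postgres where
// REPEATABLE READ is documented to reject concurrent updates, and let an
// explicit user setting win.
static String chooseIsolationLevel(String dbType, String userOverride) {
  if (userOverride != null && !userOverride.isEmpty()) {
    return userOverride; // the user may know the underlying DB better
  }
  return "postgres".equalsIgnoreCase(dbType) ? "repeatable-read"
                                             : "serializable";
}

// e.g. props.setProperty("datanucleus.transactionIsolation",
//                        chooseIsolationLevel(dbType, userOverride));
{code}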






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-07 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824352#comment-17824352
 ] 

Rui Li commented on HIVE-26882:
---

I ran our test again with MariaDB at the SERIALIZABLE level, and the results 
are correct this time. So there must have been something wrong with our 
previous test, sorry about that :(

I also enabled the audit log in MariaDB and verified that the alter-table 
process is similar to the test in my last comment; the underlying error for a 
commit conflict is:
{noformat}
Caused by: org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: 
UPDATE `TABLE_PARAMS` SET `PARAM_VALUE` = ? WHERE `TBL_ID`=? AND `PARAM_KEY`=?
at 
org.datanucleus.store.rdbms.scostore.JoinMapStore.internalUpdate(JoinMapStore.java:1020)
at 
org.datanucleus.store.rdbms.scostore.JoinMapStore.put(JoinMapStore.java:304)
... 39 more
Caused by: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: 
Deadlock found when trying to get lock; try restarting transaction
at 
com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:123)
{noformat}

IMO {{REPEATABLE_READ}} cannot guarantee the semantics we need here. [~pvary] 
do you think we can change the level to SERIALIZABLE? Or do you think MariaDB 
doesn't implement {{REPEATABLE_READ}} correctly, in which case we might want 
to use different levels for different DBMS vendors?
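
If we do go with SERIALIZABLE, callers would also need to treat such deadlock 
errors (SQLSTATE 40001, as in the trace above) as retryable and rerun the 
whole transaction. A minimal sketch, with the transaction body left abstract:
{code:java}
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Sketch only: at SERIALIZABLE, MariaDB surfaces commit conflicts as
// deadlocks (ERROR 1213, SQLSTATE 40001). The DB has already rolled the
// transaction back, so it is safe to rerun it from scratch.
static <T> T retryOnSerializationConflict(Callable<T> txn, int maxRetries)
    throws Exception {
  for (int attempt = 0; ; attempt++) {
    try {
      return txn.call();
    } catch (SQLException e) {
      if (!"40001".equals(e.getSQLState()) || attempt >= maxRetries) {
        throw e;
      }
    }
  }
}
{code}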






[jira] [Comment Edited] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-07 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824303#comment-17824303
 ] 

Rui Li edited comment on HIVE-26882 at 3/7/24 8:34 AM:
---

[~pvary] Thanks for your comment. I ran some tests for the scenarios mentioned 
above. Suppose the table is initialized like this:
{code:sql}
select * from tbl;
 key | val
-+-
 k   | v0
(1 row)
{code}
With Postgres:
{code:sql}
txn1> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
txn1> select * from tbl;
 key | val
-+-
 k   | v0
(1 row)

txn1> update tbl set val = 'v1' where key = 'k' and val = 'v0';
UPDATE 1
txn1> select * from tbl;
 key | val
-+-
 k   | v1
(1 row)

txn1> commit;
COMMIT

-

txn2> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
txn2> select * from tbl;
 key | val
-+-
 k   | v0
(1 row)

txn2> update tbl set val = 'v2' where key = 'k' and val = 'v0';
ERROR:  could not serialize access due to concurrent update
txn2> select * from tbl;
ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
txn2> commit;
ROLLBACK
{code}
With MariaDB:
{code:sql}
txn1> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.000 sec)

txn1> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn1> update tbl set val = 'v1' where `key` = 'k' and val = 'v0';
Query OK, 1 row affected (0.001 sec)
Rows matched: 1  Changed: 1  Warnings: 0

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v1   |
+--+--+
1 row in set (0.000 sec)

txn1> commit;
Query OK, 0 rows affected (0.001 sec)

-

txn2> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.000 sec)

txn2> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn2> update tbl set val = 'v2' where `key` = 'k' and val = 'v0';
Query OK, 0 rows affected (20.548 sec)
Rows matched: 0  Changed: 0  Warnings: 0

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn2> commit;
Query OK, 0 rows affected (0.000 sec)
{code}
I understand the MariaDB behavior is not {{SERIALIZABLE}}, because no serial 
execution can produce the same result. But I'm not sure whether it violates 
{{REPEATABLE_READ}}: the two selects return the same results, and phantom 
reads are allowed.

So I also tested MariaDB at {{SERIALIZABLE}} level:
{code:sql}
txn1> SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.000 sec)

txn1> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.001 sec)

txn1> update tbl set val = 'v1' where `key` = 'k' and val = 'v0';
Query OK, 1 row affected (0.001 sec)
Rows matched: 1  Changed: 1  Warnings: 0

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v1   |
+--+--+
1 row in set (0.001 sec)

txn1> commit;
Query OK, 0 rows affected (0.001 sec)

-

txn2> SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.000 sec)

txn2> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.001 sec)

txn2> update tbl set val = 'v2' where `key` = 'k' and val = 'v0';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting 
transaction
{code}
The result is expected because txn2 is aborted. But I don't know why it didn't 
give correct results when we changed the HMS code to use the SERIALIZABLE 
level. I'll double-check that.


[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-07 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824303#comment-17824303
 ] 

Rui Li commented on HIVE-26882:
---

[~pvary] Thanks for your comment. I ran some tests for the scenarios mentioned 
above. Suppose the table is initialized like this:
{code:SQL}
select * from tbl;
 key | val
-+-
 k   | v0
(1 row)
{code}

With Postgres:
{code:SQL}
txn1> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
txn1> select * from tbl;
 key | val
-+-
 k   | v0
(1 row)

txn1> update tbl set val = 'v1' where key = 'k' and val = 'v0';
UPDATE 1
txn1> select * from tbl;
 key | val
-+-
 k   | v1
(1 row)

txn1> commit;
COMMIT

-

txn2> BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN
txn2> select * from tbl;
 key | val
-+-
 k   | v0
(1 row)

txn2> update tbl set val = 'v2' where key = 'k' and val = 'v0';
ERROR:  could not serialize access due to concurrent update
txn2> select * from tbl;
ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
txn2> commit;
ROLLBACK
{code}

With MariaDB:
{code:SQL}
txn1> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.000 sec)

txn1> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn1> update tbl set val = 'v1' where `key` = 'k' and val = 'v0';
Query OK, 1 row affected (0.001 sec)
Rows matched: 1  Changed: 1  Warnings: 0

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v1   |
+--+--+
1 row in set (0.000 sec)

txn1> commit;
Query OK, 0 rows affected (0.001 sec)

-

txn2> SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Query OK, 0 rows affected (0.000 sec)

txn2> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn2> update tbl set val = 'v2' where `key` = 'k' and val = 'v0';
Query OK, 0 rows affected (20.548 sec)
Rows matched: 0  Changed: 0  Warnings: 0

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.000 sec)

txn2> commit;
Query OK, 0 rows affected (0.000 sec)
{code}

I understand the MariaDB behavior is not {{SERIALIZABLE}}, because no serial 
execution can produce the same result. But I'm not sure whether it violates 
{{REPEATABLE_READ}}: the two selects return the same results, and phantom 
reads are allowed.

So I also tested MariaDB at {{SERIALIZABLE}} level:
{code:SQL}
txn1> SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.000 sec)

txn1> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.001 sec)

txn1> update tbl set val = 'v1' where `key` = 'k' and val = 'v0';
Query OK, 1 row affected (0.001 sec)
Rows matched: 1  Changed: 1  Warnings: 0

txn1> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v1   |
+--+--+
1 row in set (0.001 sec)

txn1> commit;
Query OK, 0 rows affected (0.001 sec)

-

txn2> SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
Query OK, 0 rows affected (0.000 sec)

txn2> START TRANSACTION;
Query OK, 0 rows affected (0.000 sec)

txn2> select * from tbl;
+--+--+
| key  | val  |
+--+--+
| k| v0   |
+--+--+
1 row in set (0.001 sec)

txn2> update tbl set val = 'v2' where `key` = 'k' and val = 'v0';
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting 
transaction
{code}

The result is expected because txn2 is aborted. But I don't know why it didn't 
give correct results when we changed the HMS code to use the SERIALIZABLE 
level. I'll double-check that.


[jira] [Comment Edited] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-05 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823671#comment-17823671
 ] 

Rui Li edited comment on HIVE-26882 at 3/5/24 3:50 PM:
---

I tested again with MariaDB, and there are fewer commit conflicts in the HMS 
log than in the test log. This is expected, because Iceberg checks for 
conflicts itself before it calls {{alter_table}}. The number of conflicts 
triggered by HMS matches the number in the HMS log.

I also tested with Postgres and the result is correct. I read the 
[doc|https://www.postgresql.org/docs/14/transaction-iso.html#XACT-REPEATABLE-READ]
 and I think it works because:
{quote}a repeatable read transaction cannot modify or lock rows changed by 
other transactions after the repeatable read transaction began
{quote}
But I suspect this is stricter than the ANSI SQL standard. I checked SQL:2011, 
and it says the following about {{SERIALIZABLE}} level:
{quote}The execution of concurrent SQL-transactions at transaction isolation 
level SERIALIZABLE is guaranteed to be serializable. A serializable execution 
is defined to be an execution of the operations of concurrently executing 
SQL-transactions that produces the same effect as some serial execution of 
those same SQL-transactions. A serial execution is one in which each 
SQL-transaction executes to completion before the next SQL-transaction begins.
{quote}
Suppose we have these two concurrent transactions trying to update the 
property. IIUC both transactions can commit and the result can be either 
{{v1}} or {{v2}}, even at the {{SERIALIZABLE}} level.
{code:sql}
txn1> update tbl set val = 'v1' where key = 'k';

txn2> update tbl set val = 'v2' where key = 'k';
{code}
Maybe another solution is to use direct SQL and check the number of affected 
rows to detect conflicts. We did a PoC of this, and it produces correct 
results with MariaDB. The pseudo-code is like this:
{code:java}
String key = ...;
String expectedVal = ...;
Table oldTable = ...;
Table newTable = ...;
// Assumes auto-commit is disabled, so the check and the update below
// commit (or roll back) as a single transaction.
Connection connection = getConnection(Connection.TRANSACTION_REPEATABLE_READ);
try {
  Statement statement = connection.createStatement();
  // First check the expected value against the in-memory snapshot.
  if (!expectedVal.equals(oldTable.getParameters().get(key))) {
    throw new MetaException("Table has been modified");
  }
  // The WHERE clause re-checks the expected value inside the transaction;
  // anything other than exactly one affected row means a concurrent writer won.
  int affectedRows = statement.executeUpdate(
      "UPDATE TABLE_PARAMS SET PARAM_VALUE = 'new_val' "
          + "WHERE TBL_ID = ... AND PARAM_KEY = 'key' AND PARAM_VALUE = 'expected_val'");
  if (affectedRows != 1) {
    throw new MetaException("Table has been modified");
  }
  connection.commit();
} catch (Throwable t) {
  connection.rollback();
  throw t;
} finally {
  connection.close();
}
{code}
A problem is that each Iceberg commit can modify multiple properties, or even 
other table fields, so it can be difficult to generate all the SQL statements 
manually. Not sure how (or whether it's possible) to do this with JDO.



[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-05 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823678#comment-17823678
 ] 

Rui Li commented on HIVE-26882:
---

And if this feature has only been verified with Postgres, perhaps we should 
document that, or even throw an exception for other DBMS types?






[jira] [Comment Edited] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-05 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823671#comment-17823671
 ] 

Rui Li edited comment on HIVE-26882 at 3/5/24 3:22 PM:
---

I tested again with MariaDB, and there are fewer commit conflicts in the HMS 
log than in the test log. This is expected, because Iceberg checks for 
conflicts itself before it calls {{alter_table}}. The number of conflicts 
triggered by HMS matches the number in the HMS log.

I also tested with Postgres and the result is correct. I read the 
[doc|https://www.postgresql.org/docs/14/transaction-iso.html#XACT-REPEATABLE-READ]
 and I think it works because:
{quote}a repeatable read transaction cannot modify or lock rows changed by 
other transactions after the repeatable read transaction began
{quote}
But I suspect this is stricter than the ANSI SQL standard. I checked SQL:2011, 
and it says the following about {{SERIALIZABLE}} level:
{quote}The execution of concurrent SQL-transactions at transaction isolation 
level SERIALIZABLE is guaranteed to be serializable. A serializable execution 
is defined to be an execution of the operations of concurrently executing 
SQL-transactions that produces the same effect as some serial execution of 
those same SQL-transactions. A serial execution is one in which each 
SQL-transaction executes to completion before the next SQL-transaction begins.
{quote}
Suppose we have these two concurrent transactions trying to update the 
property. IIUC both transactions can commit and the result can be either 
{{v1}} or {{v2}}, even at the {{SERIALIZABLE}} level.
{code:sql}
txn1> update tbl set val = 'v1' where key = 'k';

txn2> update tbl set val = 'v2' where key = 'k';
{code}
Maybe another solution is to use direct SQL and check the number of affected 
rows to detect conflicts. We did a PoC of this, and it also produces the 
correct results. The pseudo-code is like this:
{code:java}
String key = ...;
String expectedVal = ...;
Table oldTable = ...;
Table newTable = ...;
// Assumes auto-commit is disabled, so the check and the update below
// commit (or roll back) as a single transaction.
Connection connection = getConnection(Connection.TRANSACTION_REPEATABLE_READ);
try {
  Statement statement = connection.createStatement();
  // First check the expected value against the in-memory snapshot.
  if (!expectedVal.equals(oldTable.getParameters().get(key))) {
    throw new MetaException("Table has been modified");
  }
  // The WHERE clause re-checks the expected value inside the transaction;
  // anything other than exactly one affected row means a concurrent writer won.
  int affectedRows = statement.executeUpdate(
      "UPDATE TABLE_PARAMS SET PARAM_VALUE = 'new_val' "
          + "WHERE TBL_ID = ... AND PARAM_KEY = 'key' AND PARAM_VALUE = 'expected_val'");
  if (affectedRows != 1) {
    throw new MetaException("Table has been modified");
  }
  connection.commit();
} catch (Throwable t) {
  connection.rollback();
  throw t;
} finally {
  connection.close();
}
{code}
A problem is that each Iceberg commit can modify multiple properties, or even 
other table fields, so it can be difficult to generate all the SQL statements 
manually. Not sure how (or whether it's possible) to do this with JDO.



[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-05 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823671#comment-17823671
 ] 

Rui Li commented on HIVE-26882:
---

I tested again with MariaDB, and there are fewer commit conflicts in the HMS 
log than in the test log. This is expected, because Iceberg checks for 
conflicts itself before it calls {{alter_table}}. The number of conflicts 
triggered by HMS matches the number in the HMS log.

I also tested with Postgres and the result is correct. I read the 
[doc|https://www.postgresql.org/docs/14/transaction-iso.html#XACT-REPEATABLE-READ]
 and I think it works because:
bq. a repeatable read transaction cannot modify or lock rows changed by other 
transactions after the repeatable read transaction began
But I suspect this is stricter than the ANSI SQL standard. I checked SQL:2011, 
and it says the following about {{SERIALIZABLE}} level:
bq. The execution of concurrent SQL-transactions at transaction isolation level 
SERIALIZABLE is guaranteed to be serializable. A serializable execution is 
defined to be an execution of the operations of concurrently executing 
SQL-transactions that produces the same effect as some serial execution of 
those same SQL-transactions. A serial execution is one in which each 
SQL-transaction executes to completion before the next SQL-transaction begins.
Suppose we have these two concurrent transactions trying to update the 
property. IIUC the result can be either {{v1}} or {{v2}}, even at the 
{{SERIALIZABLE}} level.
{code:SQL}
txn1> update tbl set val = 'v1' where key = 'k';

txn2> update tbl set val = 'v2' where key = 'k';
{code}

Maybe another solution is to use direct SQL and check the number of affected 
rows to detect conflicts. We did a PoC of this, and it also produces the 
correct results. The pseudo-code is like this:
{code:Java}
String key = ...;
String expectedVal = ...;
Table oldTable = ...;
Table newTable = ...;
// Assumes auto-commit is disabled, so the check and the update below
// commit (or roll back) as a single transaction.
Connection connection = getConnection(Connection.TRANSACTION_REPEATABLE_READ);
try {
  Statement statement = connection.createStatement();
  // First check the expected value against the in-memory snapshot.
  if (!expectedVal.equals(oldTable.getParameters().get(key))) {
    throw new MetaException("Table has been modified");
  }
  // The WHERE clause re-checks the expected value inside the transaction;
  // anything other than exactly one affected row means a concurrent writer won.
  int affectedRows = statement.executeUpdate(
      "UPDATE TABLE_PARAMS SET PARAM_VALUE = 'new_val' "
          + "WHERE TBL_ID = ... AND PARAM_KEY = 'key' AND PARAM_VALUE = 'expected_val'");
  if (affectedRows != 1) {
    throw new MetaException("Table has been modified");
  }
  connection.commit();
} catch (Throwable t) {
  connection.rollback();
  throw t;
} finally {
  connection.close();
}
{code}
A problem is that each Iceberg commit can modify multiple properties, or even 
other table fields, so it can be difficult to generate all the SQL statements 
manually. Not sure how (or whether it's possible) to do this with JDO.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-01 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822776#comment-17822776
 ] 

Rui Li commented on HIVE-26882:
---

Hey [~pvary], thanks for your insights! I'll check whether the numbers of 
failures match, and I'll also test with Postgres. Will update here once I have 
the results.






[jira] [Commented] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2024-03-01 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822618#comment-17822618
 ] 

Rui Li commented on HIVE-26882:
---

Hi [~pvary], we're trying to improve the performance of concurrent writes to 
Iceberg tables and are evaluating this feature. While it considerably improves 
commit throughput, there seems to be a consistency issue.

Here's how we test: enable the no-lock feature on an Iceberg table, and launch 
120 processes to concurrently update the table. Each process randomly 
generates a unique key/value pair and adds it to the table properties. After 
all the processes finish, we find that the number of successful processes 
doesn't match the number of newly added properties in the table, e.g. 72 
processes succeeded but only 37 properties were added.
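
For reference, each test process essentially does the following (a sketch: 
the catalog setup is omitted and the names are illustrative):
{code:java}
import java.util.UUID;
import org.apache.iceberg.Table;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

// Sketch only: each of the 120 concurrent processes commits one unique
// table property. With consistent commits, the number of successful
// processes should equal the number of properties added.
static void addUniqueProperty(Catalog catalog) {
  Table table = catalog.loadTable(TableIdentifier.of("db", "tbl"));
  String key = "test-" + UUID.randomUUID(); // unique per process
  table.updateProperties()
      .set(key, "v")
      .commit(); // either succeeds or throws a commit exception
}
{code}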

I searched the HMS logs. There are some commit conflicts [detected by the 
HMS|https://github.com/apache/hive/blob/branch-2.3/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java#L163],
 but I don't see any errors from the underlying DBMS (which implies the 
isolation level isn't working, IIUC).

I also tried changing the isolation level to SERIALIZABLE but it doesn't help.

So I wonder if we're missing any configuration or misusing this feature. Any 
input would be helpful.
Our test env:
 * HMS built from the latest 
[branch-2.3|https://github.com/apache/hive/tree/branch-2.3]
 * Apache Iceberg 1.4.3
 * Backend DBMS is 10.1.45-MariaDB






[jira] [Assigned] (HIVE-7292) Hive on Spark

2021-10-19 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-7292:


Assignee: Xuefu Zhang  (was: Hui An)

> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>  Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
> Fix For: 1.1.0
>
> Attachments: Hive-on-Spark.pdf
>
>
> Spark as an open-source data analytics cluster computing framework has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative so that they can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption, as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience, as Tez does.
> This is an umbrella JIRA which will cover many coming subtasks. The design 
> doc will be attached here shortly and will be on the wiki as well. Feedback 
> from the community is greatly appreciated!





[jira] [Commented] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2019-12-26 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003635#comment-17003635
 ] 

Rui Li commented on HIVE-17133:
---

While HADOOP-14683 has been fixed, it seems we still have to fix this on our 
side. [~sershe] [~xuefuz] Would you mind having a look at the patch?
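
For illustration, one binary-compatible way to sort without binding to either 
{{compareTo}} signature (a sketch, not necessarily what the attached patch 
does) is to compare a stable key such as the path string:
{code:java}
import java.util.Comparator;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;

// Sketch only: avoid calling FileStatus.compareTo directly, since its
// signature changed between Hadoop 2.7 and 2.8 and the call site binds to
// one of them at compile time. Comparing path strings sidesteps that.
static void sortByPath(List<FileStatus> files) {
  files.sort(Comparator.comparing(f -> f.getPath().toString()));
}
{code}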

> NoSuchMethodError in Hadoop FileStatus.compareTo
> 
>
> Key: HIVE-17133
> URL: https://issues.apache.org/jira/browse/HIVE-17133
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17133.1.patch
>
>
> The stack trace is:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>   at java.util.TimSort.sort(TimSort.java:234)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929)
> {noformat}
> I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop 
> 2.7.2 is:
> https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336
> In Hadoop 2.8.0 it becomes:
> https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332
> I think that breaks binary compatibility.





[jira] [Updated] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2019-12-26 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17133:
--
Status: Patch Available  (was: Open)






[jira] [Assigned] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2019-12-26 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-17133:
-

Assignee: Rui Li






[jira] [Updated] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2019-12-26 Thread Rui Li (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17133:
--
Attachment: HIVE-17133.1.patch

> NoSuchMethodError in Hadoop FileStatus.compareTo
> 
>
> Key: HIVE-17133
> URL: https://issues.apache.org/jira/browse/HIVE-17133
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17133.1.patch
>
>
> The stack trace is:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>   at java.util.TimSort.sort(TimSort.java:234)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929)
> {noformat}
> I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop 
> 2.7.2 is:
> https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336
> In Hadoop 2.8.0 it becomes:
> https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332
> I think that breaks binary compatibility.
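As a hedged illustration of the shape such a fix can take (my sketch, not
necessarily what HIVE-17133.1.patch does), sorting through an explicit
comparator avoids emitting either compareTo descriptor, so the same bytecode
links against both Hadoop lines:
{code:Java}
// Hypothetical workaround sketch; the attached patch may differ. An explicit
// comparator keeps the compiled call site identical on Hadoop 2.7.x and
// 2.8.x because neither compareTo overload is referenced.
import org.apache.hadoop.fs.FileStatus;
import java.util.Comparator;
import java.util.List;

public final class PortableFileStatusSort {
  private PortableFileStatusSort() {}

  public static void sortByPath(List<FileStatus> statuses) {
    // FileStatus orders by path in both versions, so comparing path strings
    // approximates the original ordering without the fragile call.
    statuses.sort(Comparator.comparing(fs -> fs.getPath().toString()));
  }
}
{code}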



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22158) HMS Translation layer - Disallow non-ACID MANAGED tables.

2019-11-12 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973065#comment-16973065
 ] 

Rui Li commented on HIVE-22158:
---

Thanks [~ngangam]. Setting {{metastore.warehouse.external.dir}} solves my issue.
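For anyone hitting the same thing, a minimal sketch of the setting (the path
below is just an example; the property can equally be set in hive-site.xml):
{code:Java}
// Minimal sketch, assuming a plain Hadoop Configuration bootstraps the
// metastore; "/warehouse/external" is an example path, not a required one.
import org.apache.hadoop.conf.Configuration;

public class ExternalWarehouseConf {
  public static Configuration withExternalWarehouseDir(Configuration conf) {
    // Translated (now EXTERNAL) tables get created under this directory,
    // outside the managed warehouse root, so inserts pass the location check.
    conf.set("metastore.warehouse.external.dir", "/warehouse/external");
    return conf;
  }
}
{code}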

> HMS Translation layer - Disallow non-ACID MANAGED tables.
> -
>
> Key: HIVE-22158
> URL: https://issues.apache.org/jira/browse/HIVE-22158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22158.1.patch, HIVE-22158.1.patch, 
> HIVE-22158.2.patch
>
>
> In the recent commits, we have allowed non-ACID MANAGED tables to be created 
> by clients that have some form of ACID WRITE capabilities. 
> I think it would make sense to disallow this entirely. MANAGED tables should 
> be ACID tables only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22158) HMS Translation layer - Disallow non-ACID MANAGED tables.

2019-11-11 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971481#comment-16971481
 ] 

Rui Li commented on HIVE-22158:
---

Hi [~ngangam], I recently hit an issue with this change. When I create 
non-ACID tables, they get silently converted to external tables. And if I try 
to insert data into these tables, I get an exception saying {{"An external 
table's location should not be located within managed warehouse root 
directory"}}. So it doesn't seem very user friendly to me. Any thoughts?

> HMS Translation layer - Disallow non-ACID MANAGED tables.
> -
>
> Key: HIVE-22158
> URL: https://issues.apache.org/jira/browse/HIVE-22158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22158.1.patch, HIVE-22158.1.patch, 
> HIVE-22158.2.patch
>
>
> In the recent commits, we have allowed non-ACID MANAGED tables to be created 
> by clients that have some form of ACID WRITE capabilities. 
> I think it would make sense to disallow this entirely. MANAGED tables should 
> be ACID tables only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22053) Function name is not normalized when creating function

2019-07-31 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897084#comment-16897084
 ] 

Rui Li commented on HIVE-22053:
---

Thanks [~pvary] :)

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.
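A minimal sketch of the kind of normalization the write path needs (my
illustration; the attached patch may implement it elsewhere in the metastore):
{code:Java}
// Hypothetical sketch; the attached patch may differ. Hive identifiers are
// case-insensitive, so the function name should be canonicalized on create,
// matching the lower-casing the lookup path already performs.
public final class FunctionNameNormalizer {
  private FunctionNameNormalizer() {}

  public static String normalize(String functionName) {
    if (functionName == null) {
      return null;
    }
    return functionName.trim().toLowerCase();
  }
}

// e.g. normalize("MyUpperCaseUDF") returns "myuppercaseudf", which the
// metastore can later find regardless of how the caller spells it.
{code}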



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22053) Function name is not normalized when creating function

2019-07-31 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-22053:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

I guess a PR is not mandatory. Pushed to master. Thanks Xuefu for the review!

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22053) Function name is not normalized when creating function

2019-07-27 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894617#comment-16894617
 ] 

Rui Li commented on HIVE-22053:
---

Thanks [~xuefuz] for the review.
[~pvary], it'd be good if you could also have a look. BTW, do I need to submit 
a PR? I'm not sure about the current code contribution process.

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (HIVE-22053) Function name is not normalized when creating function

2019-07-26 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893703#comment-16893703
 ] 

Rui Li commented on HIVE-22053:
---

cc [~xuefuz]

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22053) Function name is not normalized when creating function

2019-07-26 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-22053:
--
Status: Patch Available  (was: Open)

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-22053) Function name is not normalized when creating function

2019-07-26 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-22053:
--
Attachment: HIVE-22053.1.patch

> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-22053.1.patch
>
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (HIVE-22053) Function name is not normalized when creating function

2019-07-26 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-22053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-22053:
-


> Function name is not normalized when creating function
> --
>
> Key: HIVE-22053
> URL: https://issues.apache.org/jira/browse/HIVE-22053
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> If a function is created with a name containing upper case characters, we get 
> NoSuchObjectException when trying to get that function.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (HIVE-21111) ConditionalTask cannot be cast to MapRedTask

2019-03-24 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-21111:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> ConditionalTask cannot be cast to MapRedTask
> 
>
> Key: HIVE-21111
> URL: https://issues.apache.org/jira/browse/HIVE-21111
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.1, 3.1.1, 2.3.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21111.1.patch
>
>
> We met an error like this in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>  
> There is a bug in function 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, 
> tsk.isMapRedTask() can still be true, but it cannot be cast to MapRedTask.
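A self-contained toy model of why the check is wrong (stand-in classes, not the
real Hive types):
{code:Java}
// Toy model with stand-in classes (assumption: not the actual Hive types).
// A ConditionalTask can answer isMapRedTask() == true without being castable
// to MapRedTask, so the type must be checked with instanceof before casting.
public class CastDemo {
  static class Task { boolean isMapRedTask() { return false; } }
  static class MapRedTask extends Task {
    @Override boolean isMapRedTask() { return true; }
  }
  static class ConditionalTask extends Task {
    // Delegates to its sub-tasks in real Hive, so it can report true here.
    @Override boolean isMapRedTask() { return true; }
  }

  public static void main(String[] args) {
    Task tsk = new ConditionalTask();
    if (tsk.isMapRedTask()) {
      System.out.println("isMapRedTask() is true, but the cast would fail");
      // MapRedTask mr = (MapRedTask) tsk;  // throws ClassCastException
    }
    if (tsk instanceof MapRedTask) {
      // Never reached for a ConditionalTask -- the safe guard.
      MapRedTask mr = (MapRedTask) tsk;
      mr.isMapRedTask();
    }
  }
}
{code}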



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21111) ConditionalTask cannot be cast to MapRedTask

2019-03-24 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800361#comment-16800361
 ] 

Rui Li commented on HIVE-21111:
---

[~qunyan] glad to know it helps. Closing this one.

> ConditionalTask cannot be cast to MapRedTask
> 
>
> Key: HIVE-21111
> URL: https://issues.apache.org/jira/browse/HIVE-21111
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.1, 3.1.1, 2.3.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Attachments: HIVE-21111.1.patch
>
>
> We met an error like this in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>  
> There is a bug in function 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, 
> tsk.isMapRedTask() can still be true, but it cannot be cast to MapRedTask.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21111) ConditionalTask cannot be cast to MapRedTask

2019-03-22 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798876#comment-16798876
 ] 

Rui Li commented on HIVE-21111:
---

I'm not sure whether it's valid to have a conditional task in another 
conditional task's list in the first place. There was an issue when map join 
and skew join are both enabled, namely HIVE-14557. Could you try that patch 
and see whether it helps? If not, please provide a full stack trace of the 
exception.

> ConditionalTask cannot be cast to MapRedTask
> 
>
> Key: HIVE-21111
> URL: https://issues.apache.org/jira/browse/HIVE-21111
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.1, 3.1.1, 2.3.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Attachments: HIVE-21111.1.patch
>
>
> We met an error like this in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>  
> There is a bug in function 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, 
> tsk.isMapRedTask() can still be true, but it cannot be cast to MapRedTask.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21111) ConditionalTask cannot be cast to MapRedTask

2019-03-14 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793257#comment-16793257
 ] 

Rui Li commented on HIVE-21111:
---

[~qunyan], could you provide a case that can reproduce this issue?

> ConditionalTask cannot be cast to MapRedTask
> 
>
> Key: HIVE-21111
> URL: https://issues.apache.org/jira/browse/HIVE-21111
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 2.1.1, 3.1.1, 2.3.4
>Reporter: zhuwei
>Assignee: zhuwei
>Priority: Major
> Attachments: HIVE-21111.1.patch
>
>
> We met an error like this in our production environment:
> java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.ConditionalTask 
> cannot be cast to org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> at 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:173)
>  
> There is a bug in function 
> org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch:
> if (tsk.isMapRedTask()) {
>   Task<? extends Serializable> newTask = this.processCurrentTask((MapRedTask) tsk,
>       ((ConditionalTask) currTask), physicalContext.getContext());
>   walkerCtx.addToDispatchList(newTask);
> }
> In the above code, when tsk is an instance of ConditionalTask, 
> tsk.isMapRedTask() can still be true, but it cannot be cast to MapRedTask.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2018-12-18 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724030#comment-16724030
 ] 

Rui Li commented on HIVE-17020:
---

Hi [~vgarg], is the failure just due to an output diff?

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>        ...
>         |
>       RS[1]
>         |
>       SEL[2]
>       /    \
>  SEL[3]    SEL[4]
>    |          |
>  RS[5]      FS[6]
>    |
>   ...
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.
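A hedged sketch of the invariant the dedup needs to verify before collapsing
the chain (hypothetical helper with a stand-in operator interface, not the
actual optimizer code):
{code:Java}
// Hypothetical helper (assumption: Op is a stand-in for Hive's Operator).
// Before removing everything between the child RS and its upstream RS, check
// that every intermediate operator has exactly one child; a side branch such
// as SEL[2] -> SEL[4] -> FS[6] above makes the removal lose that branch.
import java.util.List;

public final class DedupSafetyCheck {
  interface Op {
    List<Op> getChildOperators();
    List<Op> getParentOperators();
  }

  private DedupSafetyCheck() {}

  static boolean safeToCollapse(Op childRS, Op upstreamRS) {
    Op cur = childRS.getParentOperators().get(0);
    while (cur != upstreamRS) {
      if (cur.getChildOperators().size() != 1) {
        return false;   // removing cur would orphan one of its branches
      }
      cur = cur.getParentOperators().get(0);
    }
    return true;
  }
}
{code}
In the tree above, walking up from RS[5] reaches SEL[2], which has two
children, so the collapse is correctly rejected.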



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2018-12-11 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16718383#comment-16718383
 ] 

Rui Li commented on HIVE-17020:
---

+1

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>        ...
>         |
>       RS[1]
>         |
>       SEL[2]
>       /    \
>  SEL[3]    SEL[4]
>    |          |
>  RS[5]      FS[6]
>    |
>   ...
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2018-12-11 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16716911#comment-16716911
 ] 

Rui Li commented on HIVE-17020:
---

{code}
   quotedid_smb.q,\
+  reducesink_dedup.q\
   resourceplan.q,\
{code}
Missing a comma here?

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>        ...
>         |
>       RS[1]
>         |
>       SEL[2]
>       /    \
>  SEL[3]    SEL[4]
>    |          |
>  RS[5]      FS[6]
>    |
>   ...
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-16 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-14557:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks [~ychena] for the review.

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, 
> HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}
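The NPE at {{MapJoinOperator.closeOp}} points at cleanup code dereferencing
state that the skew-join-generated task never initializes. A hedged sketch of
a defensive guard (type and field names are my assumptions from the stack
trace; the committed fix may instead correct the generated plan):
{code:Java}
// Hypothetical sketch with stand-in types (not the actual Hive code; the
// committed patch may differ). The point: the skew-join follow-up task can
// reach closeOp without ever loading the hash tables, so cleanup must
// tolerate null state.
public class MapJoinCloseSketch {
  interface TableContainer { void clear(); }   // stand-in for the Hive container

  private TableContainer[] mapJoinTables;      // may stay null on the skew-join path

  void closeOp(boolean abort) {
    if (mapJoinTables != null) {
      for (TableContainer table : mapJoinTables) {
        if (table != null) {
          table.clear();
        }
      }
    }
    // ... rest of the close logic runs regardless
  }
}
{code}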



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-14 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-14557:
--
Attachment: HIVE-14557.3.patch

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, 
> HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-14 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687395#comment-16687395
 ] 

Rui Li commented on HIVE-14557:
---

Sure, attaching the same patch.

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, 
> HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-14 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686476#comment-16686476
 ] 

Rui Li commented on HIVE-14557:
---

The metastore tests passed on my laptop. [~ychena], any idea what might be the 
cause of the failures?

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-09 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681401#comment-16681401
 ] 

Rui Li commented on HIVE-14557:
---

Sorry about the late response. Attaching an updated patch.

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-11-09 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-14557:
--
Attachment: HIVE-14557.3.patch

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.3.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-08-06 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569935#comment-16569935
 ] 

Rui Li commented on HIVE-14557:
---

[~aihuaxu], could you please verify whether the patch can solve your issue?

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-08-06 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569817#comment-16569817
 ] 

Rui Li commented on HIVE-14557:
---

Uploading a patch based on Nemon's solution.

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-08-06 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-14557:
--
Attachment: HIVE-14557.2.patch

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.2.patch, HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-14557) Nullpointer When both SkewJoin and Mapjoin Enabled

2018-08-06 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-14557:
-

Assignee: Rui Li

> Nullpointer When both SkewJoin  and Mapjoin Enabled
> ---
>
> Key: HIVE-14557
> URL: https://issues.apache.org/jira/browse/HIVE-14557
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 1.1.0, 2.1.0
>Reporter: Nemon Lou
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-14557.patch
>
>
> The following sql failed with return code 2 on mr.
> {noformat}
> create table a(id int,id1 int);
> create table b(id int,id1 int);
> create table c(id int,id1 int);
> set hive.optimize.skewjoin=true;
> select a.id,b.id,c.id1 from a,b,c where a.id=b.id and a.id1=c.id1;
> {noformat}
> Error log as follows:
> {noformat}
> 2016-08-17 21:13:42,081 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
> Id =0
>   
> Id =21
>   
> Id =28
>   
> Id =16
>   
>   <\Children>
>   Id = 28 null<\Parent>
> <\FS>
>   <\Children>
>   Id = 21 nullId = 33 
> Id =33
>   null
>   <\Children>
>   <\Parent>
> <\HASHTABLEDUMMY><\Parent>
> <\MAPJOIN>
>   <\Children>
>   Id = 0 null<\Parent>
> <\TS>
>   <\Children>
>   <\Parent>
> <\MAP>
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing operator TS[21]
> 2016-08-17 21:13:42,084 INFO [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Initializing dummy operator
> 2016-08-17 21:13:42,086 INFO [main] 
> org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0, 
> RECORDS_IN:0, 
> 2016-08-17 21:13:42,087 ERROR [main] 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper: Hit error while closing 
> operators - failing tree
> 2016-08-17 21:13:42,088 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.lang.RuntimeException: Hive Runtime Error 
> while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:696)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:189)
>   ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-26 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559162#comment-16559162
 ] 

Rui Li commented on HIVE-20032:
---

+1

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch, 
> HIVE-20032.9.patch, HIVE-20032.91.patch, HIVE-20032.92.patch
>
>
> Follow-up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-26 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558085#comment-16558085
 ] 

Rui Li commented on HIVE-20032:
---

[~stakiar], thanks for the update. I can run my queries with the latest patch.
For the failed tests, I think you can manually add the registrator jar, or 
disable the feature since it's already tested in qtest.
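For the record, "manually add the registrator jar" amounts to something like
this when assembling the SparkConf ({{spark.jars}} and
{{spark.kryo.registrator}} are standard Spark keys; the jar path is only an
example):
{code:Java}
// Minimal sketch, assuming the tests build a SparkConf for the remote driver;
// the jar path below is an example location, not a real artifact path.
import org.apache.spark.SparkConf;

public class RegistratorSetup {
  public static SparkConf withHiveRegistrator(SparkConf conf) {
    return conf
        // ship the jar containing HiveKryoRegistrator to driver and executors
        .set("spark.jars", "/path/to/hive-kryo-registrator-4.0.0-SNAPSHOT.jar")
        // register HiveKey & co. with Kryo up front
        .set("spark.kryo.registrator", "org.apache.hive.spark.HiveKryoRegistrator");
  }
}
{code}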

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch, 
> HIVE-20032.9.patch, HIVE-20032.91.patch
>
>
> Follow-up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-25 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555466#comment-16555466
 ] 

Rui Li commented on HIVE-20032:
---

bq. I originally thought that --jars would add the specified jars to the 
executor and driver class path, but apparently that's not the case.
Is this a Spark issue? Because according to the 
[docs|https://github.com/apache/spark/blob/v2.3.0/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L525],
 the jars should be added to the classpath. It seems ApplicationMaster uses a 
[custom class 
loader|https://github.com/apache/spark/blob/v2.3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L152]
 for the user class, which should load the jars added by {{--jars}}.
A possible cause is that the jars are usually [not added to the system class 
loader|https://github.com/apache/spark/blob/v2.3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1273].
 Sometimes that can give you a ClassNotFoundException even when the jars are 
there -- you just need to use the correct class loader.
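To illustrate the last point with a generic sketch (my example, not Spark or
Hive code):
{code:Java}
// Generic sketch: a class shipped via --jars can be invisible to the
// system/application loader yet visible to the thread context class loader,
// which Spark points at the loader that actually holds the user jars.
public class LoaderDemo {
  public static void main(String[] args) throws Exception {
    String name = "org.apache.hadoop.hive.ql.io.HiveKey";
    try {
      // Resolves against the caller's defining loader; under YARN this can
      // fail even though the jar was shipped with --jars.
      Class.forName(name);
    } catch (ClassNotFoundException e) {
      // Retrying with the context class loader finds the class.
      Class.forName(name, true, Thread.currentThread().getContextClassLoader());
    }
  }
}
{code}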

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch, HIVE-20032.9.patch
>
>
> Follow-up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-25 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1612#comment-1612
 ] 

Rui Li commented on HIVE-20032:
---

I ran a simple query in yarn-cluster mode with patch v8 and hit an issue:
{noformat}
2018-07-25T17:58:05,859 ERROR [6f7f3077-05bf-45cc-bf32-4c65132ccf48 main] 
status.SparkJobMonitor: Spark job[-1] failed
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/io/HiveKey
at 
org.apache.hive.spark.HiveKryoRegistrator.registerClasses(HiveKryoRegistrator.java:37)
 ~[hive-kryo-registrator-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$6.apply(KryoSerializer.scala:136)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.serializer.KryoSerializer$$anonfun$newKryo$6.apply(KryoSerializer.scala:136)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
 ~[scala-library-2.11.8.jar:?]
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186) 
~[scala-library-2.11.8.jar:?]
at 
org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:136) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:324)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.serializer.KryoSerializerInstance.(KryoSerializer.scala:309)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.serializer.KryoSerializer.newInstance(KryoSerializer.scala:218)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.broadcast.TorrentBroadcast$.blockifyObject(TorrentBroadcast.scala:288)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:127)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.broadcast.TorrentBroadcast.(TorrentBroadcast.scala:88) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1481) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.rdd.HadoopRDD.(HadoopRDD.scala:117) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:997)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.SparkContext$$anonfun$hadoopRDD$1.apply(SparkContext.scala:988)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.SparkContext.withScope(SparkContext.scala:692) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:988) 
~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:416)
 ~[spark-core_2.11-2.3.0.jar:2.3.0]
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:239)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:176)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:127)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:361)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:400)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:365)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_151]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_151]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_151]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.lang.ClassNotFoundException: 
org.apache.hadoop.hive.ql.io.HiveKey
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) 
~[?:1.8.0_151]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[?:1.8.0_151]
at ...
{noformat}

[jira] [Commented] (HIVE-20032) Don't serialize hashCode for repartitionAndSortWithinPartitions

2018-07-24 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554183#comment-16554183
 ] 

Rui Li commented on HIVE-20032:
---

Hi [~stakiar], I left some comments on RB. Meanwhile, could you explain why we 
need to put the registrator jar in the driver's extra class path? Won't 
{{--jars}} add the jar to both the driver's and the executors' class paths? And 
IIUC, the driver extra class path only takes jars on the local FS, so will that 
be a problem for cluster mode?
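For reference, the two mechanisms look like this through the launcher API (a 
sketch; paths are illustrative):
{code:java}
SparkLauncher launcher = new SparkLauncher()
    // --jars: shipped with the app; per the docs, added to the
    // driver and executor class paths
    .addJar("/path/to/hive-kryo-registrator.jar")
    // driver extra class path: prepended to the driver JVM's class path;
    // expects a path on the driver's local FS
    .setConf(SparkLauncher.DRIVER_EXTRA_CLASSPATH,
        "/local/path/hive-kryo-registrator.jar");
{code}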

> Don't serialize hashCode for repartitionAndSortWithinPartitions
> ---
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch, HIVE-20032.5.patch, 
> HIVE-20032.6.patch, HIVE-20032.7.patch, HIVE-20032.8.patch
>
>
> Follow up on HIVE-15104, if we don't enable RDD cacheing or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20056) SparkPartitionPruner shouldn't be triggered by Spark tasks

2018-07-22 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552297#comment-16552297
 ] 

Rui Li commented on HIVE-20056:
---

+1

> SparkPartitionPruner shouldn't be triggered by Spark tasks
> --
>
> Key: HIVE-20056
> URL: https://issues.apache.org/jira/browse/HIVE-20056
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20056.1.patch, HIVE-20056.2.patch
>
>
> It looks like {{SparkDynamicPartitionPruner}} is being called by every Spark 
> task because it gets created whenever {{getRecordReader}} is called on the 
> associated {{InputFormat}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20056) SparkPartitionPruner shouldn't be triggered by Spark tasks

2018-07-19 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550173#comment-16550173
 ] 

Rui Li commented on HIVE-20056:
---

Thanks [~stakiar], nice catch. Patch looks good to me.
Nit:
{code}
+if (work instanceof MapWork) {
+  if (HiveConf.isSparkDPPAny(jobConf)) {
{code}
=> {{if (work instanceof MapWork && HiveConf.isSparkDPPAny(jobConf))}}

{code}
   if (spec == null) {
-throw new AssertionException("No partition spec found in dynamic 
pruning");
+throw new IllegalStateException("No partition spec found in dynamic 
pruning");
   }
{code}
Just use Preconditions?
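i.e., with Guava, something like:
{code}
Preconditions.checkState(spec != null, "No partition spec found in dynamic pruning");
{code}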

> SparkPartitionPruner shouldn't be triggered by Spark tasks
> --
>
> Key: HIVE-20056
> URL: https://issues.apache.org/jira/browse/HIVE-20056
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20056.1.patch
>
>
> It looks like {{SparkDynamicPartitionPruner}} is being called by every Spark 
> task because it gets created whenever {{getRecordReader}} is called on the 
> associated {{InputFormat}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD caching is disabled

2018-07-18 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1654#comment-1654
 ] 

Rui Li commented on HIVE-20032:
---

Hi [~stakiar], kryo was relocated not just because Spark uses a different 
version. My concern is that if we remove the relocation, it may break users' 
applications that depend on Hive.

> Don't serialize hashCode when groupByShuffle and RDD caching is disabled
> -
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch, HIVE-20032.4.patch
>
>
> Follow up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20032) Don't serialize hashCode when groupByShuffle and RDD caching is disabled

2018-07-11 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541076#comment-16541076
 ] 

Rui Li commented on HIVE-20032:
---

Hi [~stakiar], thanks for working on this. There are other cases where the RDD 
is cached, e.g. parallel order by, so you need to serialize the hash code in all 
of these cases (multi-insert may be another one).
Having separate SerDes for caching and shuffling would be good, but I guess that 
needs help from the Spark side. BTW, have you run benchmarks to measure the 
improvement from this change?

> Don't serialize hashCode when groupByShuffle and RDD caching is disabled
> -
>
> Key: HIVE-20032
> URL: https://issues.apache.org/jira/browse/HIVE-20032
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-20032.1.patch, HIVE-20032.2.patch, 
> HIVE-20032.3.patch
>
>
> Follow up on HIVE-15104: if we don't enable RDD caching or groupByShuffles, 
> then we don't need to serialize the hashCode when shuffling data in HoS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20007) Hive should carry out timestamp computations in UTC

2018-06-28 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527010#comment-16527010
 ] 

Rui Li commented on HIVE-20007:
---

I see. Thanks [~jcamachorodriguez].

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-20007
> URL: https://issues.apache.org/jira/browse/HIVE-20007
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
>  Labels: timestamp
> Fix For: 4.0.0
>
> Attachments: HIVE-20007.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20007) Hive should carry out timestamp computations in UTC

2018-06-28 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526249#comment-16526249
 ] 

Rui Li commented on HIVE-20007:
---

[~ashutoshc] suggested in HIVE-14412 that we should replace 
{{java.sql.Timestamp}} with LocalDateTime as the in-memory representation for 
the Timestamp type. I wonder whether we still want to do that?
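For context, the DST gap from the issue description can be reproduced with 
{{java.sql.Timestamp}} directly, while LocalDateTime has no zone and keeps the 
value as-is (a minimal demo, assuming the JVM default zone is 
America/Los_Angeles):
{code:java}
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.util.TimeZone;

public class DstGapDemo {
  public static void main(String[] args) {
    // 2015-03-08 02:10 does not exist in America/Los_Angeles (clocks jump 02:00 -> 03:00)
    TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"));
    System.out.println(Timestamp.valueOf("2015-03-08 02:10:00.101"));
    // prints 2015-03-08 03:10:00.101 -- the skipped local time is rolled forward
    System.out.println(LocalDateTime.parse("2015-03-08T02:10:00.101"));
    // prints 2015-03-08T02:10:00.101 -- no zone, so nothing is skipped
  }
}
{code}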

> Hive should carry out timestamp computations in UTC
> ---
>
> Key: HIVE-20007
> URL: https://issues.apache.org/jira/browse/HIVE-20007
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Ryan Blue
>Assignee: Jesus Camacho Rodriguez
>Priority: Blocker
>  Labels: timestamp
> Attachments: HIVE-20007.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the 
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use 
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like 
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles 
> that alternates between PST and PDT, there are times that cannot be 
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a 
> java.sql.Timestamp avoids this bug, while still returning correct values for 
> {{getYear}} etc. Using UTC as the convenience representation (timestamp 
> without time zone has no real zone) would make timestamp calculations more 
> consistent and avoid similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the 
> result is with respect to ["the default timezone and default 
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
>  That function would need to be updated to use the 
> {{System.getProperty("user.timezone")}} zone.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-06-24 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16521790#comment-16521790
 ] 

Rui Li commented on HIVE-19671:
---

We can check all RS operators and look for non-deterministic UDFs in the 
partition keys -- {{FunctionRegistry::isDeterministic}} can be used; see the 
sketch below.
I noticed Hive itself may also use non-deterministic partitioning, e.g. to 
handle skewed GBY we first shuffle randomly to do partial aggregation. Do you 
think it makes sense to print a warning for that too?
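A rough sketch of such a check (hypothetical helper, not from any patch; a real 
version would also recurse into child expressions):
{code:java}
private static void warnOnNonDeterministicPartitioning(Collection<ReduceSinkOperator> reduceSinks) {
  for (ReduceSinkOperator rs : reduceSinks) {           // collected from the operator plan
    for (ExprNodeDesc key : rs.getConf().getPartitionCols()) {
      if (key instanceof ExprNodeGenericFuncDesc
          && !FunctionRegistry.isDeterministic(((ExprNodeGenericFuncDesc) key).getGenericUDF())) {
        // LOG: the enclosing class's logger
        LOG.warn("Shuffle partition key is non-deterministic: " + key.getExprString());
      }
    }
  }
}
{code}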

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-06-21 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519357#comment-16519357
 ] 

Rui Li commented on HIVE-19671:
---

[~xuefuz], I agree it's not trivial to solve this on the Hive side. Maybe we can 
at least print a warning if the query has non-deterministic partitioning?
Another potential solution is to retry all downstream tasks whenever an upstream 
task fails, which needs help from the execution engine.

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-06-20 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517846#comment-16517846
 ] 

Rui Li commented on HIVE-19671:
---

[~xuefuz], thanks for your input. I think rand(seed) may not work if the 
mapper's input is not in deterministic order. As an example, suppose a mapper 
needs to process keys {{1, 2, 3, 4, 5}}. The partitioning in the 1st attempt is 
as below:
{noformat}
key   rand(seed)
 1  ->   1
 2  ->   2
 3  ->   3
 4  ->   4
 5  ->   5
{noformat}
So there'll be 5 reducers fetching data from this mapper. Suppose the first 4 
reducers have finished, and when the 5th reducer starts, the node hosting the 
mapper's output is lost, so the mapper is rerun. The 2nd attempt has the 
following partitioning:
{noformat}
key   rand(seed)
 1  ->   1
 3  ->   2
 5  ->   3
 2  ->   4
 4  ->   5
{noformat}
Then the 5th reducer is rerun and fetches key 4, which means key 4 is 
duplicated and key 5 is lost.

To avoid the issue, we need to make sure the record reader guarantees an order 
when reading data from HDFS, and that we don't use a shuffle that doesn't order 
the keys, e.g. Spark's groupByKey. What do you think?

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19895) The unique ID in SparkPartitionPruningSinkOperator is no longer needed

2018-06-14 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-19895:
--
Description: With HIVE-19237 in place, seems we don't need to maintain a 
unique ID in 

> The unique ID in SparkPartitionPruningSinkOperator is no longer needed
> --
>
> Key: HIVE-19895
> URL: https://issues.apache.org/jira/browse/HIVE-19895
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Priority: Major
>
> With HIVE-19237 in place, seems we don't need to maintain a unique ID in 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19895) The unique ID in SparkPartitionPruningSinkOperator is no longer needed

2018-06-14 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-19895:
--
Description: With HIVE-19237 in place, seems we don't need to maintain a 
unique ID in SparkPartitionPruningSinkOperator.  (was: With HIVE-19237 in 
place, seems we don't need to maintain a unique ID in )

> The unique ID in SparkPartitionPruningSinkOperator is no longer needed
> --
>
> Key: HIVE-19895
> URL: https://issues.apache.org/jira/browse/HIVE-19895
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Priority: Major
>
> With HIVE-19237 in place, seems we don't need to maintain a unique ID in 
> SparkPartitionPruningSinkOperator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-06-06 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502983#comment-16502983
 ] 

Rui Li commented on HIVE-18533:
---

[~stakiar], please take a look at the Yetus report. I think some warnings are 
valid.

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18533.94.patch, 
> HIVE-18533.95.patch, HIVE-18533.96.patch, HIVE-18533.97.patch, 
> HIVE-18533.98.patch, HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)

2018-06-05 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501662#comment-16501662
 ] 

Rui Li commented on HIVE-16391:
---

[~jerryshao], I'm assigning this to you so you have permission to upload.

> Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
> -
>
> Key: HIVE-16391
> URL: https://issues.apache.org/jira/browse/HIVE-16391
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Reporter: Reynold Xin
>Priority: Major
>  Labels: pull-request-available
>
> Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the 
> only change in the fork is to work around the issue that Hive publishes only 
> two sets of jars: one set with no dependency declared, and another with all 
> the dependencies included in the published uber jar. That is to say, Hive 
> doesn't publish a set of jars with the proper dependencies declared.
> There is general consensus on both sides that we should remove the forked 
> Hive.
> The change in the forked version is recorded here 
> https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2
> Note that the fork in the past included other fixes but those have all become 
> unnecessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-16391) Publish proper Hive 1.2 jars (without including all dependencies in uber jar)

2018-06-05 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-16391:
-

Assignee: Saisai Shao

> Publish proper Hive 1.2 jars (without including all dependencies in uber jar)
> -
>
> Key: HIVE-16391
> URL: https://issues.apache.org/jira/browse/HIVE-16391
> Project: Hive
>  Issue Type: Task
>  Components: Build Infrastructure
>Reporter: Reynold Xin
>Assignee: Saisai Shao
>Priority: Major
>  Labels: pull-request-available
>
> Apache Spark currently depends on a forked version of Apache Hive. AFAIK, the 
> only change in the fork is to work around the issue that Hive publishes only 
> two sets of jars: one set with no dependency declared, and another with all 
> the dependencies included in the published uber jar. That is to say, Hive 
> doesn't publish a set of jars with the proper dependencies declared.
> There is general consensus on both sides that we should remove the forked 
> Hive.
> The change in the forked version is recorded here 
> https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2
> Note that the fork in the past included other fixes but those have all become 
> unnecessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19602) Refactor inplace progress code in Hive-on-spark progress monitor to use ProgressMonitor instance

2018-06-02 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498996#comment-16498996
 ] 

Rui Li commented on HIVE-19602:
---

Thanks for pinging me. +1 pending tests

> Refactor inplace progress code in Hive-on-spark progress monitor to use 
> ProgressMonitor instance
> 
>
> Key: HIVE-19602
> URL: https://issues.apache.org/jira/browse/HIVE-19602
> Project: Hive
>  Issue Type: Bug
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-19602.3.patch
>
>
> We can refactor the HOS inplace progress monitor code 
> (SparkJobMonitor#printStatusInPlace) to use InplaceUpdate#render.
> We can create an instance of ProgressMonitor and use it to show the progress.
> This would be similar to :
> [https://github.com/apache/hive/blob/0b6bea89f74b607299ad944b37e4b62c711aaa69/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/monitoring/RenderStrategy.java#L181]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19237) Only use an operatorId once in a plan

2018-05-30 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495191#comment-16495191
 ] 

Rui Li commented on HIVE-19237:
---

Thanks [~kgyrtkirk], I think it works. For this particular issue, it seems you 
can simply make the marker a string instead of a set.

What I had in mind was to pass the branching operator to cloneOperatorTree, so 
that we'll get the cloned branching operator back. Then we need to find the 
cloned TS operators to set the table metadata. The TS operators can be found 
using logicalEquals: since TS operators must be roots, they can be used 
interchangeably as long as they are logically equal.

Please choose whichever way you think is simpler.

> Only use an operatorId once in a plan
> -
>
> Key: HIVE-19237
> URL: https://issues.apache.org/jira/browse/HIVE-19237
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19237.01.patch, HIVE-19237.02.patch, 
> HIVE-19237.03.patch, HIVE-19237.04.patch, HIVE-19237.05.patch, 
> HIVE-19237.05.patch, HIVE-19237.06.patch
>
>
> Column stats autogather plan part is added from a plan compiled by the driver 
> itself; however that driver starts to use operatorIds from 1; so it's 
> possible that 2 SEL_1 operators end up in the same plan...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19237) Only use an operatorId once in a plan

2018-05-30 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494782#comment-16494782
 ] 

Rui Li commented on HIVE-19237:
---

[~kgyrtkirk], I agree storing the origin ID is not ideal, and I'd be happy to 
stop relying on operator IDs if there's another reliable way to associate cloned 
operators.
Let's go the logicalEquals way (patch v5) to unblock your work here. I'll open a 
follow-up for the Spark code. Sorry about the back and forth.

> Only use an operatorId once in a plan
> -
>
> Key: HIVE-19237
> URL: https://issues.apache.org/jira/browse/HIVE-19237
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19237.01.patch, HIVE-19237.02.patch, 
> HIVE-19237.03.patch, HIVE-19237.04.patch, HIVE-19237.05.patch, 
> HIVE-19237.05.patch, HIVE-19237.06.patch
>
>
> Column stats autogather plan part is added from a plan compiled by the driver 
> itself; however that driver starts to use operatorIds from 1; so it's 
> possible that 2 SEL_1 operators end up in the same plan...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-29 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494623#comment-16494623
 ] 

Rui Li commented on HIVE-18533:
---

[~stakiar] sorry I overlooked your previous comments. +1 pending tests
bq. For your third point, Are there any code changes required for this?
No, we only need to update the wiki. As for the spark-client.jar, I think it's 
bundled in hive-exec.jar, so packaging.pom doesn't include it.

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18533.94.patch, 
> HIVE-18533.95.patch, HIVE-18533.96.patch, HIVE-18533.97.patch, 
> HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19237) Only use an operatorId once in a plan

2018-05-29 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494616#comment-16494616
 ] 

Rui Li commented on HIVE-19237:
---

Hey [~kgyrtkirk], how about we do it this way:
# The old id should only be set to the original id if it's a cloned operator.
# Explicitly set the old id in: 
{{SerializationUtilities::cloneBaseWork,cloneOperatorTree,clonePlan}}, and 
{{Operator::cloneOp,clone}}.

What do you think?

> Only use an operatorId once in a plan
> -
>
> Key: HIVE-19237
> URL: https://issues.apache.org/jira/browse/HIVE-19237
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19237.01.patch, HIVE-19237.02.patch, 
> HIVE-19237.03.patch, HIVE-19237.04.patch, HIVE-19237.05.patch, 
> HIVE-19237.05.patch, HIVE-19237.06.patch
>
>
> Column stats autogather plan part is added from a plan compiled by the driver 
> itself; however that driver starts to use operatorIds from 1; so it's 
> possible that 2 SEL_1 operators end up in the same plan...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19237) Only use an operatorId once in a plan

2018-05-29 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493479#comment-16493479
 ] 

Rui Li commented on HIVE-19237:
---

Hi [~kgyrtkirk], my concern is that there can be multiple logically identical 
operators. Does it make sense to remember the original operatorId in the cloned 
operator, i.e. add an extra field like "clonedFrom"?

> Only use an operatorId once in a plan
> -
>
> Key: HIVE-19237
> URL: https://issues.apache.org/jira/browse/HIVE-19237
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19237.01.patch, HIVE-19237.02.patch, 
> HIVE-19237.03.patch, HIVE-19237.04.patch, HIVE-19237.05.patch, 
> HIVE-19237.05.patch
>
>
> Column stats autogather plan part is added from a plan compiled by the driver 
> itself; however that driver starts to use operatorIds from 1; so it's 
> possible that 2 SEL_1 operators end up in the same plan...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19237) Only use an operatorId once in a plan

2018-05-29 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493443#comment-16493443
 ] 

Rui Li commented on HIVE-19237:
---

The Spark tests failed because some code relies on the operatorId to associate a 
cloned operator with the original one. Is there any other way to achieve that 
without using operatorId?

> Only use an operatorId once in a plan
> -
>
> Key: HIVE-19237
> URL: https://issues.apache.org/jira/browse/HIVE-19237
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-19237.01.patch, HIVE-19237.02.patch, 
> HIVE-19237.03.patch, HIVE-19237.04.patch, HIVE-19237.05.patch, 
> HIVE-19237.05.patch
>
>
> Column stats autogather plan part is added from a plan compiled by the driver 
> itself; however that driver starts to use operatorIds from 1; so it's 
> possible that 2 SEL_1 operators end up in the same plan...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-05-28 Thread Rui Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493065#comment-16493065
 ] 

Rui Li commented on HIVE-19671:
---

Verified the issue only happens when there are task retries. I can think of two 
possible solutions (a config sketch for the second follows below):
 # Use rand(seed) instead of rand(). rand(seed) is supposed to generate a 
deterministic sequence, so a retried task produces the same partitioning as the 
original attempt. The prerequisites are that the sequence of method calls is the 
same and that the task input is in deterministic order.
 # Disable task retries if the shuffle partition key is non-deterministic, using 
configs like {{mapreduce.map.maxattempts}}, 
{{tez.am.task.max.failed.attempts}}, {{spark.task.maxFailures}}.

[~gopalv], [~xuefuz], [~ashutoshc], do you have any suggestions? Thanks.
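For the second option, the settings would look something like this (values are 
illustrative; setting the limits to 1 trades failover for consistency):
{code}
set mapreduce.map.maxattempts=1;         -- MR
set tez.am.task.max.failed.attempts=1;   -- Tez
set spark.task.maxFailures=1;            -- Spark
{code}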

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-05-28 Thread Rui Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li reassigned HIVE-19671:
-

Assignee: Rui Li

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-05-23 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487148#comment-16487148
 ] 

Rui Li commented on HIVE-19671:
---

I haven't verified it, but my guess is the issue happens with task failover. 
Suppose the mappers of the {{distribute by}} stage finish successfully. Reducers 
then start but fail to fetch shuffle data because some nodes hosting the mapper 
output are lost, so those mappers are retried. Since the partition keys are 
randomly generated, the retried tasks can produce different partitions than the 
previous attempt, which leads to the inconsistency.

> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand());
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19671) Distribute by rand() can lead to data inconsistency

2018-05-23 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-19671:
--
Description: 
Noticed the following queries can give different results:
{code}
select count(*) from tbl;
select count(*) from (select * from tbl distribute by rand()) a;
{code}

  was:
Noticed the following queries can give different results:
{code}
select count(*) from tbl;
select count(*) from (select * from tbl distribute by rand());
{code}


> Distribute by rand() can lead to data inconsistency
> ---
>
> Key: HIVE-19671
> URL: https://issues.apache.org/jira/browse/HIVE-19671
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Priority: Major
>
> Noticed the following queries can give different results:
> {code}
> select count(*) from tbl;
> select count(*) from (select * from tbl distribute by rand()) a;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474138#comment-16474138
 ] 

Rui Li commented on HIVE-18533:
---

Seems the new qtest has unstable output. Please look into it.
 Some more comments:
 # The new configuration takes values like {{SPARK_SUBMIT_CLIENT}} and 
{{SPARK_LAUNCHER_CLIENT}}. How about just using "submit" and "launcher" instead, 
which are more user-friendly and more in line with conventional Hive configs?
 # {{hive.spark.client.type}} needs to be added to 
{{HIVE_SPARK_RSC_CONF_LIST}}, so that changing it will create a new session.
 # More Spark jars need to be added to Hive in order to use the new client. I 
found 3 extra jars: {{scala-reflect}}, {{spark-launcher}} and {{spark-yarn}}. 
We can update the 
[wiki|https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started#HiveonSpark:GettingStarted-ConfiguringHive]
 once the patch is in.

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18533.94.patch, 
> HIVE-18533.95.patch, HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-12 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473112#comment-16473112
 ] 

Rui Li commented on HIVE-18533:
---

Thanks for the update [~stakiar]. There's an issue with 
TestSparkLauncherSparkClient: I don't think we can assert the Future is done 
right after we issue the state change -- there can be a race condition, and that 
doesn't violate the Future contracts. We can remove those assertions, or assert 
that Future::get returns within some reasonable amount of time.
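E.g. something like:
{code:java}
// bounded wait: fails the test with a TimeoutException if the Future never completes
future.get(30, TimeUnit.SECONDS);
{code}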

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18533.94.patch, 
> HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466816#comment-16466816
 ] 

Rui Li commented on HIVE-18533:
---

Hi [~stakiar], my concern for SparkLauncherFuture is mainly about the cancel 
logic. SparkLauncherFuture::cancel calls SparkAppHandle::stop, which I think is 
an async method, so it doesn't immediately unblock threads waiting on 
SparkLauncherFuture::get, and subsequent calls to isCancelled and isDone may not 
return true. Besides, the JavaDoc says SparkAppHandle::stop is only a 
best-effort request for the app to stop, so it doesn't even guarantee a state 
change.
Another issue is that SparkLauncherFuture::isCancelled considers all failed 
states as cancelled, so it may return true even if cancel was never called.

I know this might not be an issue with the way AbstractSparkClient works at the 
moment. But if we want to make changes to AbstractSparkClient in the future, 
it's better if the two subclasses behave consistently and both honor the Future 
contracts.

If we use a FutureTask, we can interrupt the thread when we cancel the Future. 
The thread can handle the interrupt and call SparkAppHandle::stop (probably 
cancelling the RPC as well) -- similar to what we do in 
SparkSubmitSparkClient.
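A minimal sketch of that approach (assumes {{handle}} is the SparkAppHandle and 
{{latch}} is a CountDownLatch released when the app reaches a final state):
{code:java}
FutureTask<Void> driverFuture = new FutureTask<>(() -> {
  try {
    latch.await();
  } catch (InterruptedException e) {
    handle.stop();   // best-effort stop; cancel the pending RPC here as well
    throw e;
  }
  return null;
});
new Thread(driverFuture, "spark-app-monitor").start();
// driverFuture.cancel(true) interrupts the waiting thread, so get() unblocks
// immediately and isCancelled()/isDone() honor the Future contract.
{code}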

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19439) MapWork shouldn't be reused when Spark task fails during initialization

2018-05-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466775#comment-16466775
 ] 

Rui Li commented on HIVE-19439:
---

BTW, the hash table is loaded when we init the dummy operators 
[here|https://github.com/apache/hive/blob/rel/release-2.2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java#L113].

> MapWork shouldn't be reused when Spark task fails during initialization
> ---
>
> Key: HIVE-19439
> URL: https://issues.apache.org/jira/browse/HIVE-19439
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Priority: Major
>
> Issue identified in HIVE-19388. When a Spark task fails during initializing 
> the map operator, the task is retried with the same MapWork retrieved from 
> cache. This can be problematic because the MapWork may be partially 
> initialized, e.g. some operators are already in INIT state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19439) MapWork shouldn't be reused when Spark task fails during initialization

2018-05-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466772#comment-16466772
 ] 

Rui Li commented on HIVE-19439:
---

Hi [~vihangk1], the task is retried by Spark, and it calls 
SparkMapRecordHandler::init to initialize the map operator. This is where we 
retrieve the MapWork [from 
cache|https://github.com/apache/hive/blob/rel/release-2.2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkMapRecordHandler.java#L75].
I'm not sure whether we have a way to reset the operators to the UNINIT state. 
If not, I guess we have to clear the cache when initialization fails.

> MapWork shouldn't be reused when Spark task fails during initialization
> ---
>
> Key: HIVE-19439
> URL: https://issues.apache.org/jira/browse/HIVE-19439
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Priority: Major
>
> Issue identified in HIVE-19388. When a Spark task fails during initializing 
> the map operator, the task is retried with the same MapWork retrieved from 
> cache. This can be problematic because the MapWork may be partially 
> initialized, e.g. some operators are already in INIT state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19388) ClassCastException during VectorMapJoinCommonOperator initialization

2018-05-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465911#comment-16465911
 ] 

Rui Li commented on HIVE-19388:
---

[~vihangk1], thanks for fixing this. The change looks good. +1
As for your observation about {{spark_vectorized_dynamic_partition_pruning.q}}, 
it seems that's indeed another bug. The task fails during MapWork 
initialization. When we retry the task, we retrieve the MapWork from the cache. 
At this point, some operators' state is {{State.INIT}}, even though the previous 
initialization actually failed. So initialization is skipped and the task 
somehow finishes successfully. I think one way to fix it is to clear the work 
cache when initialization fails (see the sketch below). I've created HIVE-19439 
to track that.
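An illustrative sketch of that fix (placement and the cache-clearing call are 
assumptions, not from a patch):
{code:java}
try {
  mapOp.initialize(jconf, null);
} catch (Throwable t) {
  // evict the partially initialized MapWork so a retried task
  // deserializes a fresh copy instead of reusing this one
  Utilities.clearWorkMap(jconf);
  throw new RuntimeException("Map operator initialization failed", t);
}
{code}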

> ClassCastException during VectorMapJoinCommonOperator initialization
> 
>
> Key: HIVE-19388
> URL: https://issues.apache.org/jira/browse/HIVE-19388
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2, 3.1.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19388.01.patch, HIVE-19388.02.patch
>
>
> I see the following exception when a mapjoin operator is being initialized 
> on Hive-on-Spark and vectorization is turned on.
> This happens when the hashTable is empty. The 
> {{MapJoinTableContainerSerDe#getDefaultEmptyContainer}} method returns a 
> HashMapWrapper, while the VectorMapJoinOperator expects a 
> {{MapJoinBytesTableContainer}} when {{hive.mapjoin.optimized.hashtable}} is 
> set to true.
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper cannot be cast to 
> org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.optimized.VectorMapJoinOptimizedHashTable.(VectorMapJoinOptimizedHashTable.java:92)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.optimized.VectorMapJoinOptimizedHashMap.(VectorMapJoinOptimizedHashMap.java:127)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.optimized.VectorMapJoinOptimizedStringHashMap.(VectorMapJoinOptimizedStringHashMap.java:60)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.optimized.VectorMapJoinOptimizedCreateHashTable.createHashTable(VectorMapJoinOptimizedCreateHashTable.java:80)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.setUpHashTable(VectorMapJoinCommonOperator.java:485)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.completeInitializationOp(VectorMapJoinCommonOperator.java:461)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:471)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:401) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:574) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:526) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:387) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  at 
> org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:109)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
>  ... 16 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465624#comment-16465624
 ] 

Rui Li commented on HIVE-18533:
---

Hi [~stakiar], for SparkLauncherSparkClient, how about we use a thread to wait 
on the countdown latch and return a FutureTask, like we did in 
SparkSubmitSparkClient? I think it makes the two clients more consistent, and 
it's easier than implementing a custom Future. For example, when 
{{Future::cancel}} is called, threads waiting on {{Future::get}} should 
immediately be unblocked, and {{Future::isCancelled}} should return true. We 
don't have to worry about breaking these contracts if we use FutureTask. What 
do you think?

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, 
> HIVE-18533.9.patch, HIVE-18533.91.patch, HIVE-18831.93.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-03 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462612#comment-16462612
 ] 

Rui Li commented on HIVE-18533:
---

Hi [~stakiar], could you explain why we now need to monitor the driver with a 
Future instead of a thread?

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch, HIVE-18533.9.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs

2018-05-02 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16461318#comment-16461318
 ] 

Rui Li commented on HIVE-18533:
---

[~stakiar], thanks for working on this and sorry for the delay. I had a quick 
look over the patch and will go into more detail tomorrow.
[~xuefuz], you might also want to take a look.

> Add option to use InProcessLauncher to submit spark jobs
> 
>
> Key: HIVE-18533
> URL: https://issues.apache.org/jira/browse/HIVE-18533
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, 
> HIVE-18533.3.patch, HIVE-18533.4.patch, HIVE-18533.5.patch, 
> HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch
>
>
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a 
> HoS session + debuggability (no need to launch a separate process to run a Spark 
> app).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19316) StatsTask fails due to ClassCastException

2018-04-28 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457537#comment-16457537
 ] 

Rui Li commented on HIVE-19316:
---

My understanding is that we can't send a LongColumnStatsDataInspector over 
Thrift. Maybe we should create a LongColumnStatsDataInspector from the 
LongColumnStatsData that we received in the request, and then perform the merge 
-- a rough sketch is below.
[~jcamachorodriguez], I think you added this feature in HIVE-17286. Could you 
share your thoughts on this? Thanks.
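A rough sketch of the idea (variable names are hypothetical; the wire-level 
struct is copied into an inspector instead of being cast):
{code:java}
LongColumnStatsData wire = statsObj.getStatsData().getLongStats();
LongColumnStatsDataInspector inspector = new LongColumnStatsDataInspector();
inspector.setLowValue(wire.getLowValue());
inspector.setHighValue(wire.getHighValue());
inspector.setNumNulls(wire.getNumNulls());
inspector.setNumDVs(wire.getNumDVs());
// ...then run LongColumnStatsMerger.merge on the inspector copies.
{code}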

> StatsTask fails due to ClassCastException
> -
>
> Key: HIVE-19316
> URL: https://issues.apache.org/jira/browse/HIVE-19316
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Rui Li
>Priority: Major
>
> The stack trace:
> {noformat}
> 2018-04-26T20:17:37,674 ERROR [pool-7-thread-11] 
> metastore.RetryingHMSHandler: java.lang.ClassCastException: 
> org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to 
> org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector
> at 
> org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30)
> at 
> org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1052)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> at com.sun.proxy.$Proxy26.set_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16795)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16779)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19316) StatsTask fails due to ClassCastException

2018-04-26 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454296#comment-16454296
 ] 

Rui Li commented on HIVE-19316:
---

The issue only happens with a remote HMS. We expect a 
LongColumnStatsDataInspector in LongColumnStatsMerger, but only 
LongColumnStatsData is defined in {{hive_metastore.thrift}}. Not sure if that 
could be the cause.

> StatsTask fails due to ClassCastException
> -
>
> Key: HIVE-19316
> URL: https://issues.apache.org/jira/browse/HIVE-19316
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Reporter: Rui Li
>Priority: Major
>
> The stack trace:
> {noformat}
> 2018-04-26T20:17:37,674 ERROR [pool-7-thread-11] 
> metastore.RetryingHMSHandler: java.lang.ClassCastException: 
> org.apache.hadoop.hive.metastore.api.LongColumnStatsData cannot be cast to 
> org.apache.hadoop.hive.metastore.columnstats.cache.LongColumnStatsDataInspector
> at 
> org.apache.hadoop.hive.metastore.columnstats.merge.LongColumnStatsMerger.merge(LongColumnStatsMerger.java:30)
> at 
> org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.mergeColStats(MetaStoreUtils.java:1052)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7202)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> at com.sun.proxy.$Proxy26.set_aggr_stats_for(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16795)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:16779)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19266) Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding kryo

2018-04-26 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453553#comment-16453553
 ] 

Rui Li commented on HIVE-19266:
---

Hey [~jason4z], thanks for reporting this. The issue should have been fixed 
by HIVE-16292, and the fix version is 2.3.0. Which Hive version are you using? 
The code snippet in your blog post seems to be from an older version.

> Use UDFs in Hive-On-Spark complains Unable to find class Exception regarding 
> kryo
> -
>
> Key: HIVE-19266
> URL: https://issues.apache.org/jira/browse/HIVE-19266
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.3.2
>Reporter: Di Zhu
>Priority: Major
>
> For a SQL with UDF as below in Hive:
> {code:java}
> set hive.execution.engine=spark;
> add jar viewfs:///path_to_the_jar/aaa.jar;
> create temporary function func_name AS 'com.abc.ClassName';
> select func_name(col_a) from table_name limit 100;{code}
> it complains the following error in spark-cluster mode (in spark-client mode 
> it's working fine).
> {code:java}
> ERROR : Job failed with java.lang.ClassNotFoundException: com.abc.ClassName
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find 
> class: com.abc.ClassName
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colList (org.apache.hadoop.hive.ql.plan.SelectDesc)
> conf (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> left (org.apache.commons.lang3.tuple.ImmutablePair)
> edgeProperties (org.apache.hadoop.hive.ql.plan.SparkWork)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
> at 
> org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:181)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:118)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
> at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:214)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
> at 
> org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
> at 
> org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
> at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:176)
> at ...
> {code}

[jira] [Updated] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2018-04-25 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17193:
--
   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for the review, [~stakiar].

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-17193.1.patch, HIVE-17193.2.patch, 
> HIVE-17193.3.patch, HIVE-17193.4.patch, HIVE-17193.5.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18862) qfiles: prepare .q files for using datasets

2018-04-24 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451595#comment-16451595
 ] 

Rui Li commented on HIVE-18862:
---

It would be great to have this in branch-3, as it makes it easier to 
cherry-pick commits from master into branch-3.

> qfiles: prepare .q files for using datasets
> ---
>
> Key: HIVE-18862
> URL: https://issues.apache.org/jira/browse/HIVE-18862
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18862.01.patch, HIVE-18862.02.patch, 
> HIVE-18862.03.patch, HIVE-18862.04.patch, HIVE-18862.05.patch, 
> HIVE-18862.06.patch, HIVE-18862.07.patch, HIVE-18862.08.patch, 
> HIVE-18862.09.patch
>
>
> # Parse .q files for source table usage
>  # Add needed dataset annotations
>  # Remove create table statements from "q_test_init.sql" like files
>  # Handle oncoming issues related to dataset introduction
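
For context, a hedged example of what a converted .q file might look like, 
assuming the {{--! qt:dataset:...}} annotation style this work introduces 
(the table names below are just the usual test tables):
{code}
--! qt:dataset:src
--! qt:dataset:srcpart
-- source tables now come from datasets, so no create/load statements are needed
select count(*) from src;
{code}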



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2018-04-24 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17193:
--
Attachment: HIVE-17193.5.patch

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17193.1.patch, HIVE-17193.2.patch, 
> HIVE-17193.3.patch, HIVE-17193.4.patch, HIVE-17193.5.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2018-04-21 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447064#comment-16447064
 ] 

Rui Li commented on HIVE-17193:
---

The failures are not related. [~stakiar], could you take a look at the latest 
patch? Thanks.

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17193.1.patch, HIVE-17193.2.patch, 
> HIVE-17193.3.patch, HIVE-17193.4.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2018-04-17 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17193:
--
Attachment: HIVE-17193.4.patch

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17193.1.patch, HIVE-17193.2.patch, 
> HIVE-17193.3.patch, HIVE-17193.4.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-04-10 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432179#comment-16432179
 ] 

Rui Li commented on HIVE-18831:
---

+1 pending test

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch, HIVE-18831.6.patch, 
> HIVE-18831.7.patch, HIVE-18831.8.WIP.patch, HIVE-18831.9.patch, 
> HIVE-18831.90.patch, HIVE-18831.91.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> Main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.
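
As a hedged sketch of one way such differentiation could look (not the actual 
patch; the class name and usage here are hypothetical):
{code:java}
// Hypothetical wrapper: tag task-side failures so the client can tell them
// apart from HS2/RSC errors, which stay unwrapped.
public class SparkTaskError extends RuntimeException {
  private final String stageAndTask;

  public SparkTaskError(String stageAndTask, Throwable cause) {
    super("Error thrown by Spark task " + stageAndTask, cause);
    this.stageAndTask = stageAndTask;
  }

  public String getStageAndTask() {
    return stageAndTask;
  }
}
{code}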



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18831) Differentiate errors that are thrown by Spark tasks

2018-04-08 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429750#comment-16429750
 ] 

Rui Li commented on HIVE-18831:
---

Thanks [~stakiar] for the update. The patch looks good to me overall, although 
it needs rebasing. Left only a few minor comments on RB.

> Differentiate errors that are thrown by Spark tasks
> ---
>
> Key: HIVE-18831
> URL: https://issues.apache.org/jira/browse/HIVE-18831
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HIVE-18831.1.patch, HIVE-18831.2.patch, 
> HIVE-18831.3.patch, HIVE-18831.4.patch, HIVE-18831.6.patch, 
> HIVE-18831.7.patch, HIVE-18831.8.WIP.patch, HIVE-18831.9.patch, 
> HIVE-18831.90.patch
>
>
> We propagate exceptions from Spark task failures to the client well, but we 
> don't differentiate between errors from HS2 / RSC vs. errors thrown by 
> individual tasks.
> Main motivation is that when the client sees a propagated Spark exception it's 
> difficult to know what part of the execution threw the exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs

2018-04-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-17193:
--
Attachment: HIVE-17193.3.patch

> HoS: don't combine map works that are targets of different DPPs
> ---
>
> Key: HIVE-17193
> URL: https://issues.apache.org/jira/browse/HIVE-17193
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Major
> Attachments: HIVE-17193.1.patch, HIVE-17193.2.patch, 
> HIVE-17193.3.patch
>
>
> Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger 
> the issue:
> {code}
> explain
> select * from
>   (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) 
> a
> join
>   (select srcpart.ds,srcpart.key from srcpart join src on 
> srcpart.ds=src.value) b
> on a.key=b.key;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18955) HoS: Unable to create Channel from class NioServerSocketChannel

2018-04-03 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-18955:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks guys for the review.

> HoS: Unable to create Channel from class NioServerSocketChannel
> ---
>
> Key: HIVE-18955
> URL: https://issues.apache.org/jira/browse/HIVE-18955
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HIVE-18955.1.patch, HIVE-18955.1.patch
>
>
> Hit the issue when trying to launch a Spark job. Stack trace:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> io.netty.channel.DefaultChannelId.newInstance()Lio/netty/channel/DefaultChannelId;
> at io.netty.channel.AbstractChannel.newId(AbstractChannel.java:111) 
> ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at io.netty.channel.AbstractChannel.(AbstractChannel.java:83) 
> ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at 
> io.netty.channel.nio.AbstractNioChannel.(AbstractNioChannel.java:84) 
> ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at 
> io.netty.channel.nio.AbstractNioMessageChannel.(AbstractNioMessageChannel.java:42)
>  ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at 
> io.netty.channel.socket.nio.NioServerSocketChannel.(NioServerSocketChannel.java:86)
>  ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at 
> io.netty.channel.socket.nio.NioServerSocketChannel.(NioServerSocketChannel.java:72)
>  ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method) ~[?:1.8.0_151]
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423) 
> ~[?:1.8.0_151]
> at 
> io.netty.channel.ReflectiveChannelFactory.newChannel(ReflectiveChannelFactory.java:38)
>  ~[netty-all-4.1.17.Final.jar:4.1.17.Final]
> ... 32 more
> {noformat}
> It seems we have conflicting versions of the class 
> {{io.netty.channel.DefaultChannelId}} from async-http-client.jar and 
> netty-all.jar
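
A hedged diagnostic sketch (not part of the fix) to confirm which jar actually 
provides the class at runtime:
{code:java}
// Prints the jar that the JVM resolved DefaultChannelId from.
// Note: getCodeSource() can be null for bootstrap classes, but not for a jar.
public static void main(String[] args) throws ClassNotFoundException {
  Class<?> clazz = Class.forName("io.netty.channel.DefaultChannelId");
  System.out.println(clazz.getProtectionDomain().getCodeSource().getLocation());
}
{code}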



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

