I don't think there is any way for you to get rid of the table-level shared 
lock (though it may be a reasonable improvement to make).
You could use hive.txn.strict.locking.mode 
(https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.txn.strict.locking.mode)
to change the X lock on write to an S lock to get around this, but this may not 
be appropriate for the rest of your logic.
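
A minimal sketch of what that could look like, assuming your Hive version 
supports the property at all (as far as I know it was added in Hive 2.2.0, so 
it may not exist in Hive 1.2.1) and assuming it may be changed per session 
rather than only in hive-site.xml:

```sql
-- Assumption: the running Hive release knows this property and allows it
-- to be set at session level; otherwise it belongs in hive-site.xml.
-- In non-strict mode an INSERT into a non-transactional table takes a
-- shared lock instead of an exclusive one.
SET hive.txn.strict.locking.mode=false;

-- Print the effective value to confirm the setting was accepted.
SET hive.txn.strict.locking.mode;
```

Note that with shared locks on write, two concurrent inserts into the same 
partition no longer block each other, which is why this may not suit the rest 
of your logic.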

Eugene

From: Igor Kuzmenko <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, October 13, 2017 at 2:16 AM
To: "[email protected]" <[email protected]>
Subject: Re: Hive locking mechanism on read partition.

Hi, Eugene.

Tables are not transactional and locks are backed by DbTxnManager.

On Fri, Oct 13, 2017 at 2:30 AM, Eugene Koifman <[email protected]> wrote:
Which lock manager are you using?
Do you have ACID enabled, and if so, are these tables transactional?

Eugene


From: Igor Kuzmenko <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, October 12, 2017 at 3:58 AM
To: "[email protected]" <[email protected]>
Subject: Hive locking mechanism on read partition.

Hello, I'm using HDP 2.5.0.0 with the included Hive 1.2.1, and I have a problem 
with the locking mechanism.

Most of my queries to Hive look like this:

(1)    insert into table results_table partition(task_id=${task_id})
        select * from data_table  where ....;

results_table is partitioned by the task_id field, and I expect to get an 
exclusive lock on the corresponding partition, which is what happens:

Lock ID      Database  Table          Partition     State     Blocked By   Type         Transaction ID
136639682.4  default   results_table  task_id=5556  ACQUIRED               EXCLUSIVE    NULL



Another type of query is fetching data from results_table:

(2)  select * from results_table where task_id = ${task_id}

This select doesn't require any MapReduce job and executes fast. This is 
exactly what I want.
But if I execute these two queries at the same time, I can't read from one 
results_table partition while data is being inserted into another.

The locks look like this:

Lock ID      Database  Table          Partition     State     Blocked By   Type         Transaction ID
136639682.4  default   results_table  task_id=5556  ACQUIRED               EXCLUSIVE    NULL
136639700.1  default   results_table  NULL          WAITING   136639682.4  SHARED_READ  NULL
136639700.2  default   results_table  task_id=5535  WAITING                SHARED_READ  NULL
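
Listings of this kind are what SHOW LOCKS prints. A minimal sketch for 
reproducing the check while query (1) is running, assuming the listings above 
were captured that way (the message does not say) and that the table-name 
form of the statement is honored by the lock manager in use:

```sql
-- Run from a second session while the INSERT from query (1) is in progress.

-- All locks currently known to the lock manager.
SHOW LOCKS;

-- Locks on results_table only; some versions may still list every lock
-- when DbTxnManager is the lock manager.
SHOW LOCKS results_table;
```

A WAITING row whose "Blocked By" column names the lock ID of the EXCLUSIVE 
partition lock is the table-level SHARED_READ request being held up.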


Reading data from a specified partition requires a shared lock on the whole 
table. This prevents me from getting data until the first query completes.

As I can see on this page 
<https://cwiki.apache.org/confluence/display/Hive/Locking#Locking-UseCases>, 
this is expected behavior. But I don't understand why we need a lock on the 
table. Can I get rid of the shared lock on the whole table, while still having 
a shared lock on the specific partition?



