[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Attachment: HIVE-13622.branch-1.patch

filed HIVE-13795 to followup on Item 3.

Committed to branch-1 and master
Thanks Alan for the review

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-13622.2.patch, HIVE-13622.3.patch, 
> HIVE-13622.4.patch, HIVE-13622.branch-1.patch
>
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, a distinction that would be useful.
> See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
> info from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently infers it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work for multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock() - see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, we
> can kill the current txn on lock grant rather than wait until commit time.
> 4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
> multiple rows into a single SQL stmt (but with a limit for extreme cases).
> 5. TxnHandler.enqueueLockWithRetry() - this currently adds components that
> are only being read to TXN_COMPONENTS.  This is useless at best since read
> ops don't generate anything to compact.  For example, delete from T where t1
> in (select c1 from C) - there is no reason to add C to TXN_COMPONENTS but we do.
>
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().
> Also see the comments
> [here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]
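A minimal sketch of the batching idea in item 4, not the code from the attached patches. The class name, the method name, the row cap and the column list are assumptions made for illustration; rows are passed in as already rendered tuples, and real code would need proper escaping or bind parameters.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedInsertSketch {
    /**
     * Groups pre-rendered row tuples such as "(7, 'db', 'tbl', 'p=1')" into
     * multi-row INSERT statements holding at most maxRowsPerStmt rows each.
     * All names here are hypothetical; this is not TxnHandler code.
     */
    public static List<String> buildBatchedInserts(
            String tableAndColumns, List<String> rows, int maxRowsPerStmt) {
        if (maxRowsPerStmt < 1) {
            throw new IllegalArgumentException("maxRowsPerStmt must be >= 1");
        }
        List<String> stmts = new ArrayList<>();
        // Walk the rows in chunks so no single statement grows without bound.
        for (int start = 0; start < rows.size(); start += maxRowsPerStmt) {
            int end = Math.min(start + maxRowsPerStmt, rows.size());
            stmts.add("insert into " + tableAndColumns + " values "
                    + String.join(", ", rows.subList(start, end)));
        }
        return stmts;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            rows.add("(7, 'db', 'tbl', 'p=" + i + "')");
        }
        // With a cap of 2 rows per statement, 5 rows yield 3 statements.
        // The column list below is an assumed, illustrative one.
        for (String s : buildBatchedInserts(
                "TXN_COMPONENTS (tc_txnid, tc_database, tc_table, tc_partition)",
                rows, 2)) {
            System.out.println(s);
        }
    }
}
```

The cap is what covers the "extreme cases" mentioned in item 4: one statement per chunk instead of one per partition, while keeping each statement bounded. Not every backing database accepts a multi-row VALUES list, so the syntax used here is itself an assumption.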



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Status: Open  (was: Patch Available)

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13622.2.patch, HIVE-13622.3.patch, 
> HIVE-13622.4.patch
>
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Attachment: HIVE-13622.4.patch

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13622.2.patch, HIVE-13622.3.patch, 
> HIVE-13622.4.patch
>
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-16 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Attachment: HIVE-13622.3.patch

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13622.2.patch, HIVE-13622.3.patch
>
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Status: Patch Available  (was: Open)

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13622.2.patch
>
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-13 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Attachment: HIVE-13622.2.patch

patch 2 includes items 1,2,5 above

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-13622.2.patch
>
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-10 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
5. TxnHandler.enqueueLockWithRetry() - this currently adds components that are
only being read to TXN_COMPONENTS.  This is useless at best since read ops
don't generate anything to compact.  For example, delete from T where t1 in
(select c1 from C) - there is no reason to add C to TXN_COMPONENTS but we do.

All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().

Also see the comments
[here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]
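
A minimal sketch of the filtering described in item 5, using the delete from T where t1 in (select c1 from C) example. The Access enum, the Component class and tablesToRecord() are hypothetical stand-ins, not Hive's lock component types or the actual enqueueLockWithRetry() logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WriteSetFilterSketch {
    /** Hypothetical access mode of one table touched by a statement. */
    public enum Access { READ, WRITE }

    /** Hypothetical stand-in for a lock component; not a Hive class. */
    public static final class Component {
        public final String table;
        public final Access access;

        public Component(String table, Access access) {
            this.table = table;
            this.access = access;
        }
    }

    /** Returns only the tables that are written; read-only ones are skipped. */
    public static List<String> tablesToRecord(List<Component> components) {
        List<String> written = new ArrayList<>();
        for (Component c : components) {
            if (c.access == Access.WRITE) {
                written.add(c.table);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        // delete from T where t1 in (select c1 from C): T is written, C is only read.
        List<Component> stmt = Arrays.asList(
                new Component("T", Access.WRITE),
                new Component("C", Access.READ));
        System.out.println(tablesToRecord(stmt)); // prints [T]
    }
}
```

Under these assumptions only T would be recorded for the example statement, since C produces nothing to compact.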

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
5. TxnHandler.enqueueLockWithRetry() - this currently adds components that are
only being read to TXN_COMPONENTS.  This is useless at best since read ops
don't generate anything to compact.

All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().

Also see the comments
[here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>

[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-10 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Target Version/s: 1.3.0, 2.1.0

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-10 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
5. TxnHandler.enqueueLockWithRetry() - this currently adds components that are
only being read to TXN_COMPONENTS.  This is useless at best since read ops
don't generate anything to compact.

All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().

Also see the comments
[here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().

Also see the comments
[here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.3.0, 2.1.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>

[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().

Also see the comments
[here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]
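
A rough sketch of the point in item 1: an operation type inferred from the lock alone cannot tell Update from Delete, while a value passed in by the client can. The enums and both methods are hypothetical, and the lock-to-operation mapping is an assumption for illustration only, not Hive's actual logic.

```java
final class OperationTypeSketch {
    /** Hypothetical lock kinds; not Hive's Thrift-generated LockType. */
    public enum LockKind { SHARED_READ, SHARED_WRITE, EXCLUSIVE }

    /** Hypothetical operation types, including two "don't know" values. */
    public enum Op { INSERT, UPDATE, DELETE, UNSPECIFIED_WRITE, UNKNOWN }

    /**
     * Inference from the lock alone: a write lock says that something is
     * modified, but not whether it is an update or a delete.
     */
    public static Op deriveFromLock(LockKind lock) {
        if (lock == LockKind.SHARED_WRITE) {
            return Op.UNSPECIFIED_WRITE;
        }
        return Op.UNKNOWN;
    }

    /** Prefers the type supplied by the client, falling back to inference. */
    public static Op resolve(Op fromClient, LockKind lock) {
        return fromClient != null ? fromClient : deriveFromLock(lock);
    }

    public static void main(String[] args) {
        // Client-supplied type keeps the Update/Delete distinction.
        System.out.println(resolve(Op.DELETE, LockKind.SHARED_WRITE));
        // Without it, only the coarse inferred value is available.
        System.out.println(resolve(null, LockKind.SHARED_WRITE));
    }
}
```

Carrying the extra field from the client is the kind of change that would need the Thrift changes mentioned above.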

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, a distinction that would be useful.
See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete
info from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently infers it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
for multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock() - see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, though doing so would be useful.  See
> comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
> from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently extrapolates it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock(): see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, then we
> can kill the current txn on lock grant rather than wait until commit time.
> 4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
> multiple rows into a single SQL stmt (but with a limit for extreme cases).
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()
> Also see comments in
> [here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-05-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Priority: Critical  (was: Major)

> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, though doing so would be useful.  See
> comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
> from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently extrapolates it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock(): see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, then we
> can kill the current txn on lock grant rather than wait until commit time.
> 4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
> multiple rows into a single SQL stmt (but with a limit for extreme cases).
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-04-27 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.
4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
multiple rows into a single SQL stmt (but with a limit for extreme cases).
All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.

All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, though doing so would be useful.  See
> comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
> from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently extrapolates it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock(): see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, then we
> can kill the current txn on lock grant rather than wait until commit time.
> 4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine
> multiple rows into a single SQL stmt (but with a limit for extreme cases).
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-04-26 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.

All of these require some Thrift changes.

Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.

All of these require some Thrift changes.


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, though doing so would be useful.  See
> comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
> from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently extrapolates it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock(): see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, then we
> can kill the current txn on lock grant rather than wait until commit time.
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11()





[jira] [Updated] (HIVE-13622) WriteSet tracking optimizations

2016-04-26 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13622:
--
Description: 
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.

All of these require some Thrift changes.

  was:
HIVE-13395 solves the lost update problem with some inefficiencies.

1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
distinguish between Update and Delete, though doing so would be useful.  See
comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
from client into TxnHandler.
2. TxnHandler.addDynamicPartitions() should know the OperationType as well from
the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but
requires extra SQL statements and is thus less performant.  It will not work
with multi-stmt txns.  See comments in the code.
3. TxnHandler.checkLock(): see more comments around
"isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
called as part of an op running with dynamic partitions, it could be more
efficient.  In that case we don't have to write to TXN_COMPONENTS at all during
lock acquisition.  Conversely, if not running with DynPart, then we can kill
the current txn on lock grant rather than wait until commit time.


> WriteSet tracking optimizations
> ---
>
> Key: HIVE-13622
> URL: https://issues.apache.org/jira/browse/HIVE-13622
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.3.0, 2.1.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't
> distinguish between Update and Delete, though doing so would be useful.  See
> comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info
> from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well
> from the client.  It currently extrapolates it from TXN_COMPONENTS.  This
> works but requires extra SQL statements and is thus less performant.  It will
> not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock(): see more comments around
> "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being
> called as part of an op running with dynamic partitions, it could be more
> efficient.  In that case we don't have to write to TXN_COMPONENTS at all
> during lock acquisition.  Conversely, if not running with DynPart, then we
> can kill the current txn on lock grant rather than wait until commit time.
> All of these require some Thrift changes.


