[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-09-06 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15469081#comment-15469081
 ] 

Shannon Ladymon commented on HIVE-12353:


These have been documented in Configuration Properties:
* [hive.compactor.history.retention.succeeded | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.compactor.history.retention.succeeded]
* [hive.compactor.history.retention.failed | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.compactor.history.retention.failed]
* [hive.compactor.history.retention.attempted | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.compactor.history.retention.attempted]
* [hive.compactor.history.reaper.interval | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.compactor.history.reaper.interval]
* [hive.compactor.initiator.failed.compacts.threshold | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.compactor.initiator.failed.compacts.threshold]

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, 
> HIVE-12353.8.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438355#comment-15438355
 ] 

Eugene Koifman commented on HIVE-12353:
---

I've update https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions 
but it could use some editing.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, 
> HIVE-12353.8.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-08-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438309#comment-15438309
 ] 

Eugene Koifman commented on HIVE-12353:
---

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ShowCompactions


> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
>  Labels: TODOC1.3
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, 
> HIVE-12353.8.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110327#comment-15110327
 ] 

Hive QA commented on HIVE-12353:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783461/HIVE-12353.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10010 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6690/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6690/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6690/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783461 - PreCommit-HIVE-TRUNK-Build

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-21 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111259#comment-15111259
 ] 

Eugene Koifman commented on HIVE-12353:
---

The failures don't seem to be related but just in case I added patch 8 
(identical to 7) to get another bot run to confirm

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, 
> HIVE-12353.8.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-21 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15111519#comment-15111519
 ] 

Alan Gates commented on HIVE-12353:
---

bq. ResultSet.getLong() returns 0 when value was actually NULL so I thought it 
would be more consistent with other places in the code that use 0 as invalid 
txn id
Makes sense

bq. PreparedStatement cc_meta_info is a binary field - I don't think there is 
way to set it w/o prepared statement. (Even though it isn't set yet but will be 
soon)
Ok.

bq. A concern on the history stuff my thought here was that if I'm looking at 
10 failed compactions, it would be useful for me to know when the last 
successful run was. Allowing retention value to be set per type allows that.
Makes sense.  It's unfortunate that users could easily get confused that there 
were actually 50 successful compactions since the last failed one, but maybe we 
can fix that in the future.

bq. checkFailedCompactions  The idea in this loop is to look at the last 
"failedThreshold" elements. If they are all "failed" - don't start another one. 
If not all "failed" - that's the intervening success.
Ok, I missed that the loop only goes through "allowed failed transactions" 
number of records, so you'll never reach that value unless they are all 
transactions.  Weird style, but it works.

bq. It also seems users could unwittingly mess things
bq. Yes, that is why there is 
CompactionTxnHandler.getFailedCompactionRetention() to make sure this doesn't 
happen
Ok, missed that.

I'm +1 on the patch.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, 
> HIVE-12353.8.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-20 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109886#comment-15109886
 ] 

Alan Gates commented on HIVE-12353:
---

CompactionTxnHandler, around line 295, I'm curious why you removed the line 
{code}
info.highestTxnId = rs.wasNull() ? null : highestTxnId;
{code}
Throughout you seem to assume this will be 0 and never null, but I'm not clear 
why.

In markCleaned, why do the insert into COMPLETED_COMPACTIONS as a prepared 
statement?  You're only executing this once per call, what's the value of 
preparing the statement?  This isn't a big deal.

A concern on the history stuff.  By keeping not just the last x records, but 
the last number of success, fails, and attempts it may be hard for users to see 
that there were actually 50 successes since the 2 fails shown in the compaction 
history.  How will we make that clear to users?  Perhaps you have enough info 
in the sequence numbers to communicate that.

In checkFailedCompactions:
{code}
812 while(rs.next() && ++numTotal <= failedThreshold) {
813   if(rs.getString(1).charAt(0) == FAILED_STATE) {
814 numFailed++;
815   }
816   else {
817 numFailed--;
818   }
819 }
{code}
This logic doesn't look right to me.  This will allow an accumulation of 
failures over time to stop compaction, even if there have been intervening 
successes.  It seems it should stop counting as soon as it sees a success in 
the list of compactions, on the assumption that whatever was previously 
blocking compaction is fixed at that point.

It also seems users could unwittingly mess things up by setting config values 
poorly.  If, for example, they set the number of retained failures to be 2 but 
the number of consecutive failures that prevents further compactions to 5, 
they'd most likely never block compaction, as the new house keeping thread 
would remove the history too quickly.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110009#comment-15110009
 ] 

Eugene Koifman commented on HIVE-12353:
---

bq. info.highestTxnId = rs.wasNull() ? null : highestTxnId;
ResultSet.getLong() returns 0 when value was actually NULL so I thought it 
would be more consistent with other places in the code that use 0 as invalid 
txn id

bq. PreparedStatement
cc_meta_info is a binary field - I don't think there is way to set it w/o 
prepared statement.  (Even though it isn't set yet but will be soon)

bq. A concern on the history stuff
my thought here was that if I'm looking at 10 failed compactions, it would be 
useful for me to know when the last successful run was.  Allowing retention 
value to be set per type allows that.  

bq. checkFailedCompactions
The idea in this loop is to look at the last "failedThreshold" elements.  If 
they are all "failed" - don't start another one.  If not all "failed" - that's 
the intervening success.

bq. It also seems users could unwittingly mess things
Yes, that is why there is CompactionTxnHandler.getFailedCompactionRetention() 
to make sure this doesn't happen

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.7.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-20 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109657#comment-15109657
 ] 

Eugene Koifman commented on HIVE-12353:
---

[~alangates] could you review patch 6 please

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.6.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15099215#comment-15099215
 ] 

Alan Gates commented on HIVE-12353:
---

With the commit of HIVE-12832 to master and branch-2.0 I think all of the 
schema changes needed for this are in.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15099181#comment-15099181
 ] 

Sergey Shelukhin commented on HIVE-12353:
-

Is it possible to include schema changes into a patch for HiveQA here? This 
separate thrift change development process makes no sense whatsoever and will 
delay commits because of HiveQA

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-13 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096799#comment-15096799
 ] 

Alan Gates commented on HIVE-12353:
---

bq. I'm nervous that in the initiator and cleaner we are marking compactions as 
successful (or will be once that bit is done) when they haven't been done. 
I agree you're not making this any worse, as there was no notion of success or 
failure before and all was just forgotten.  I'm wondering if we can make this 
enhancement even better by having a third state beyond success and failure, a 
state where the compaction was never done because the table was dropped or 
something.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-12 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15094975#comment-15094975
 ] 

Eugene Koifman commented on HIVE-12353:
---

I don't follow the last comment.  This logic is the same as before except for 
cases where there is an error.
Can you point to specific instance of concern?

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-11 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15092956#comment-15092956
 ] 

Alan Gates commented on HIVE-12353:
---


>From comments on CompactionTxnHandler.checkFailedCompactions:
{quote}
   * 1. need config var to control retention period in COMPLETED_COMPACTIONS.  
Should this
   * be system wide or a table prop?  Best is to have both.  Some tables make 
have
   * frequent compactions (every 20min) others once a week.  Thus a single 24h 
retention
   * period won't really work.  Ideally, we should have a general mechanism 
where every (where reasonable)
   * property can be specified system wide and overridden per table.
   * 
   * Perhaps instead retention period should be number of runs worth of history?
{quote}
+1 to having it # of runs instead of time based.  This is much simpler.  If you 
wanted to make it per table the way to do it today would be with table 
properties.  So you could have a system wide configured value and then modify 
it in the table properties.

In markFailed, wouldn't it be more efficient to move the row via a "insert into 
COMPLETED_COMPACTIONS (x, y, z) SELECT x, y, z from COMPACTION_QUEUE rather 
than using your program as a temp store and transmitting the data back and 
forth twice?

I'm nervous that in the initiator and cleaner we are marking compactions as 
successful (or will be once that bit is done) when they haven't been done.  I 
see why you're not marking them failed, since they didn't fail either.  But I'm 
wondering if we need a third state (never ran, or AWOL, or something).



> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15088121#comment-15088121
 ] 

Alan Gates commented on HIVE-12353:
---

One thought I had was, would it make more sense to create a separate table 
COMPLETED_COMPACTION (like COMPLETED_TXN_COMPONENTS)?  This would give you a 
place to store stuff, possibly for a longer time, without worrying about 
clogging up COMPACTION_QUEUE.  It might also help with HIVE-11444, as you could 
generate stats off this table.

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns() which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete then entry from markedCleaned()
> We'll have separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to retention interval) so 
> that we don't have to call markCleaned() on Compactor failures at the same 
> time preventing Compactor to constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)