[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-23 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510470#comment-14510470
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
I don't see the WALEdit sequenceid being used when we replicate. Is this 
something to implement? (Sounds like a good idea... )
{quote}
[~saint@gmail.com] I thought we had already used it, because intra-cluster 
replication did; otherwise I can give a first try on this.

 [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
 -

 Key: HBASE-13389
 URL: https://issues.apache.org/jira/browse/HBASE-13389
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: stack
 Attachments: 13389.txt


 HBASE-12600 moved the edit sequenceid from tags to instead exploit the 
 mvcc/sequenceid slot in a key. Now Cells near-always have an associated 
 mvcc/sequenceid where previously it was rare or the mvcc was kept up at the 
 file level. This is sort of how it should be, many of us would argue, but as a 
 side-effect, read-time optimizations that helped speed scans were undone by 
 this change.
 In this issue, let's see if we can get the optimizations back -- or just 
 remove the optimizations altogether.
 The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
 The optimizations undone by this change are (to quote the optimizer himself, 
 Mr [~lhofhansl]):
 {quote}
 Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
 We're always storing the mvcc readpoints, and we never compare them against 
 the actual smallestReadpoint, and hence we're always performing all the 
 checks, tests, and comparisons that these jiras removed in addition to 
 actually storing the data - which with up to 8 bytes per Cell is not trivial.
 {quote}
 This is the 'breaking' change: 
 https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96





[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-22 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508495#comment-14508495
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

[~saint@gmail.com] Well said and good examples! As of today, there are two 
cases where we could have out-of-order puts: DLR or replication, where the 
order of WAL files to be replayed isn't guaranteed.  

For non-adjacent hfile compactions, it seems that we have to keep mvcc at the 
KV level. For example: hfile1 (max mvcc=1), hfile2 (max mvcc=2), and hfile3 
(max mvcc=3). If we just compact hfile1 and hfile3, we can't set the newly 
compacted hfile's max mvcc=3, because hfile2 may contain the same rows as 
hfile1 or hfile3.
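
A minimal, self-contained sketch of that constraint (plain Java with made-up 
names; not the actual Compactor API):

{code}
import java.util.Arrays;
import java.util.List;

public class NonAdjacentCompactionSketch {
  // Hypothetical stand-in for an HFile's metadata.
  static class HFileMeta {
    final String name;
    final long maxMvcc;
    HFileMeta(String name, long maxMvcc) { this.name = name; this.maxMvcc = maxMvcc; }
  }

  public static void main(String[] args) {
    List<HFileMeta> all = Arrays.asList(
        new HFileMeta("hfile1", 1), new HFileMeta("hfile2", 2), new HFileMeta("hfile3", 3));
    List<HFileMeta> selected = Arrays.asList(all.get(0), all.get(2)); // hfile2 skipped

    // Per-cell mvcc can only be dropped when every file takes part in the
    // compaction; otherwise the skipped hfile2 may hold the same rows, and
    // their versions could no longer be ordered against the output file.
    boolean isAllFiles = selected.size() == all.size();
    System.out.println(isAllFiles
        ? "safe to zero out per-cell mvcc up to the smallest read point"
        : "must keep per-cell mvcc: a skipped file may overlap these rows");
  }
}
{code}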

Keeping mvcc will make the haunting out-of-order issue go away, and that is one 
less concern. Let me know which option we should go with, and I can also help 
on the fix.

 



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502173#comment-14502173
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
All other cases we should be covering with metadata in the HFiles trailer, not 
on individual Cells.
{quote}
This may be hard to achieve because out-of-order puts can be flushed at 
different times. Let's say row1/logSeqId=2 is flushed earlier than 
row1/logSeqId=1. Then the HFile trailer metadata's mvcc ranges will overlap 
across multiple HFiles.  

One option is to reinstate your original code, checking against the oldest 
running scanner, and only keep mvcc around during region recovery time so that 
we can still keep the HBASE-12600 goal (see the sketch below). 
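
A hedged sketch of that option; it reuses the CellUtil.setSequenceId() call 
quoted later in this thread, while oldestScannerReadPoint and inRecovery are 
illustrative names, not actual HBase fields:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;

public class CleanSeqIdSketch {
  // Zero out a cell's mvcc only when no running scanner can still need it
  // and the region is not replaying edits, keeping the HBASE-12600 goal.
  static void maybeCleanSeqId(Cell cell, long oldestScannerReadPoint, boolean inRecovery)
      throws IOException {
    if (!inRecovery && cell.getSequenceId() <= oldestScannerReadPoint) {
      CellUtil.setSequenceId(cell, 0);
    }
  }
}
{code}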

If there is not much overall read performance degradation (because this part 
may not be the bottleneck in the read path), I think it's better to keep the 
current way so all cases work correctly for out-of-order puts. What do you guys 
think? Thanks.
 



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502373#comment-14502373
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

That sounds good. We can shorten the time period to 2 or 3 days. In one case, 
keeping mvcc longer can gain some performance, because it makes it possible to 
compact HFiles out of order in minor compactions.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482558#comment-14482558
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Changing the comparison order to row & column, then ts, then seqId, then type 
makes things more consistent and doesn't change HBase's current idempotence. 
For example, for puts with the same timestamp, the last put wins, while if we 
do put, delete, put or delete, put, put, the delete always wins.

I think it's better that a delete is treated like a put, so users can have the 
same expectations as for puts. Otherwise, on an OS with low time resolution, or 
when a put goes missing, we often have to check whether a delete is 
overshadowing newer puts.
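
A toy comparator for the proposed order (simplified types, not the real 
CellComparator); with seqId compared before type, the newest edit wins 
regardless of whether it is a put or a delete:

{code}
public class KeyOrderSketch {
  // Simplified key: row & column collapsed into one string; KeyValue type
  // codes as in HBase (Put=4, Delete=8).
  record Key(String rowCol, long ts, long seqId, int type) {}

  static int compare(Key a, Key b) {
    int c = a.rowCol().compareTo(b.rowCol());   // row & column ascending
    if (c != 0) return c;
    c = Long.compare(b.ts(), a.ts());           // newest timestamp first
    if (c != 0) return c;
    c = Long.compare(b.seqId(), a.seqId());     // then newest edit (seqId) first
    if (c != 0) return c;
    return Integer.compare(b.type(), a.type()); // type is only a tiebreaker
  }

  public static void main(String[] args) {
    Key del = new Key("r1/cf:q", 100L, 5L, 8); // delete written first
    Key put = new Key("r1/cf:q", 100L, 7L, 4); // put written later
    // The later put sorts first (wins); under a type-before-seqId order the
    // delete would shadow it.
    System.out.println(compare(put, del) < 0); // true
  }
}
{code}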

Yeah, keeping mvcc for 3 days is good enough. 




[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481348#comment-14481348
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

{quote}
To what does the above statement apply? To all three of your 'cases' or just to 
the last case, case #3?
{quote}
Just for case #3. The other two cases need mvcc around for a little bit of time.



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395994#comment-14395994
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Thanks [~lhofhansl] for looking at this. I think your patch can help a bit.

{quote}
Do we need valid (non 0) mvcc readpoints for committed data (i.e. data that was 
flushed to an HFile and hence we'll never need to replay any HLogs for those)? 
Do we need these anywhere but in the memstore?
{quote}
There are three cases (that I could think of; maybe more) where we need the 
logSeqId (mvcc) around to help us keep the put order.

Assuming all puts/deletes are of the same row & timestamp (version):

case 1) region server recovery: we need mvcc (logSeqId) only while the region 
is in recovery mode, but not after recovery.

case 2) replication receiving side: we need logSeqId to maintain the order, 
because a region move or recovery on the replaying side causes puts to arrive 
out of order. We need mvcc for a couple of days (to be safe) so that at least 
the data eventually on the receiving side is correct.

case 3) put, delete, put: currently the delete overshadows the later put, but 
with logSeqId we can easily solve the issue, because logSeqId is the real 
version of a put. This seems to me not needed (before, I thought we needed to 
keep mvcc around till a major compaction).



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395995#comment-14395995
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

There is another thought. If we can keep mvcc as part of the key byte array 
(logically it is, but not in key serialization & deserialization), then we 
could use a lazy-read approach, because mvcc is hardly used during key 
comparison.
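
A hedged sketch of the lazy-read idea (an illustrative class; the real cell 
format stores mvcc as a vlong, simplified here to a fixed 8-byte long):

{code}
import java.nio.ByteBuffer;

// Keep the serialized mvcc bytes in place and decode them only when
// getSequenceId() is actually called, since key comparison rarely needs it.
final class LazyMvccCellSketch {
  private final ByteBuffer backing; // key/value bytes incl. trailing mvcc
  private final int mvccOffset;
  private long seqId = -1;          // -1 = not decoded yet

  LazyMvccCellSketch(ByteBuffer backing, int mvccOffset) {
    this.backing = backing;
    this.mvccOffset = mvccOffset;
  }

  long getSequenceId() {
    if (seqId < 0) {
      seqId = backing.getLong(mvccOffset); // pay the decode cost on first use
    }
    return seqId;
  }
}
{code}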



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-03 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394111#comment-14394111
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

[~stack] The performance regression is because we keep mvcc values longer 
(HBASE-11315), hence the later change 
https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96.

I'm surprised the extra mvcc value caused so much perf regression. Here is the 
code in Compactor.java which calculates the minSeqId to keep in the file during 
compaction. 

{code}
// when isAllFiles is true, all files are compacted so we can calculate
// the smallest MVCC value to keep
if (fd.minSeqIdToKeep < file.getMaxMemstoreTS()) {
  fd.minSeqIdToKeep = file.getMaxMemstoreTS();
}

// output to writer:
for (Cell c : cells) {
  if (cleanSeqId && c.getSequenceId() <= smallestReadPoint) {
    CellUtil.setSequenceId(c, 0);
  }
}
{code}



[jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations

2015-04-03 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394114#comment-14394114
 ] 

Jeffrey Zhong commented on HBASE-13389:
---

Should we make the time-period configuration shorter, or revert all the related 
changes? Thanks.



[jira] [Commented] (HBASE-13172) TestDistributedLogSplitting.testThreeRSAbort fails several times on branch-1

2015-03-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352559#comment-14352559
 ] 

Jeffrey Zhong commented on HBASE-13172:
---

I just skimmed through the thread. It seems the test was stuck in 
isServerReachable(). 

[~Apache9] In order to make the test case stable, you can set the config 
hbase.master.maximum.ping.server.attempts to 3 (by default it's 10). For the 
isServerReachable() call, inside the IOException catch block, we should check 
the following conditions and return false immediately when either of them is 
true (a sketch of this follows the code below):
1) the current server has already been put in deadServers
2) the current IOException is a RegionServerStoppedException or 
ServerNotRunningYetException

[~jxiang] The following code inside RegionStates seems unnecessary and should 
just return false (because the result of the isServerReachable call may still 
be a false positive after retries). In addition, should we expire the server 
instead of directly putting it in deadServers? Thanks.

{code}
if (serverManager.isServerReachable(server)) {
  return false;
}
// The size of deadServers won't grow unbounded.
deadServers.put(hostAndPort, Long.valueOf(startCode));
{code}
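
A hedged sketch of the suggested early exit; the exception classes are the real 
HBase ones, but serverAlreadyDead stands in for the deadServers lookup:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.ipc.ServerNotRunningYetException;
import org.apache.hadoop.hbase.regionserver.RegionServerStoppedException;

public class PingRetrySketch {
  // Classify a ping failure as permanent so isServerReachable()'s retry loop
  // can return false immediately instead of burning all attempts.
  static boolean pingFailurePermanent(IOException ioe, boolean serverAlreadyDead) {
    return serverAlreadyDead
        || ioe instanceof RegionServerStoppedException
        || ioe instanceof ServerNotRunningYetException;
  }
}
{code}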

 TestDistributedLogSplitting.testThreeRSAbort fails several times on branch-1
 

 Key: HBASE-13172
 URL: https://issues.apache.org/jira/browse/HBASE-13172
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 1.1.0
Reporter: zhangduo

 The direct reason is we are stuck in ServerManager.isServerReachable.
 https://builds.apache.org/job/HBase-1.1/253/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testThreeRSAbort/
 {noformat}
 2015-03-06 04:06:19,430 DEBUG [AM.-pool300-t1] master.ServerManager(855): 
 Couldn't reach asf906.gq1.ygridcore.net,59366,1425614770146, try=0 of 10
 2015-03-06 04:07:10,545 DEBUG [AM.-pool300-t1] master.ServerManager(855): 
 Couldn't reach asf906.gq1.ygridcore.net,59366,1425614770146, try=9 of 10
 {noformat}
 The interval between the first and last retry log entries is about 1 minute, 
 and we only wait 1 minute, so the test times out.
 Still do not know why this happens.
 And at the end there are lots of these:
 {noformat}
 2015-03-06 04:07:21,529 DEBUG [AM.-pool300-t1] master.ServerManager(855): 
 Couldn't reach asf906.gq1.ygridcore.net,59366,1425614770146, try=9 of 10
 org.apache.hadoop.hbase.ipc.StoppedRpcClientException
   at 
 org.apache.hadoop.hbase.ipc.RpcClientImpl.getConnection(RpcClientImpl.java:1261)
   at 
 org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1146)
   at 
 org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
   at 
 org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
   at 
 org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$BlockingStub.getServerInfo(AdminProtos.java:22031)
   at 
 org.apache.hadoop.hbase.protobuf.ProtobufUtil.getServerInfo(ProtobufUtil.java:1797)
   at 
 org.apache.hadoop.hbase.master.ServerManager.isServerReachable(ServerManager.java:850)
   at 
 org.apache.hadoop.hbase.master.RegionStates.isServerDeadAndNotProcessed(RegionStates.java:843)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.forceRegionStateToOffline(AssignmentManager.java:1969)
   at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1576)
   at 
 org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:48)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 {noformat}
 I think the problem is here
 {code:title=ServerManager.java}
 while (retryCounter.shouldRetry()) {
   ...
   try {
     retryCounter.sleepUntilNextRetry();
   } catch (InterruptedException ie) {
     Thread.currentThread().interrupt();
   }
   ...
 }
 {code}
 We need to break out of the while loop when getting an InterruptedException, 
 not just mark the current thread as interrupted.





[jira] [Commented] (HBASE-13160) SplitLogWorker does not pick up the task immediately

2015-03-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14350757#comment-14350757
 ] 

Jeffrey Zhong commented on HBASE-13160:
---

+1. Two very minor things:

1) could we change the condition
{code}
if (seq_start == taskReadySeq) {
{code}
to
{code}
if (seq_start == taskReadySeq && numTasks == 0) {
{code}

2) the following isn't needed any more
{noformat}
  if (childrenPaths != null) {
    return childrenPaths;
  }
{noformat}


 SplitLogWorker does not pick up the task immediately
 

 Key: HBASE-13160
 URL: https://issues.apache.org/jira/browse/HBASE-13160
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 1.1.0

 Attachments: hbase-13160_v1.patch


 We were reading some code with Jeffrey, and we realized that the 
 SplitLogWorker's internal task loop is weird. It does {{ls}} every second and 
 sleeps; it has another mechanism to learn about new tasks, but does not 
 make effective use of the zk notification. 
 I have a simple patch which might improve this area. 





[jira] [Commented] (HBASE-13121) Async wal replication for region replicas and dist log replay does not work together

2015-03-05 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14349872#comment-14349872
 ] 

Jeffrey Zhong commented on HBASE-13121:
---

Looks good to me (+1), with two minor comments:

1) Could you still set recovering to false first and then submit the rest of 
the work to the executor?

2) Might openSeqNum in the following code still use the old value?

{code}
status.setStatus("Writing region open event marker to WAL because recovery is finished");
try {
  writeRegionOpenMarker(wal, openSeqNum);
} catch (IOException e) {
{code}

 Async wal replication for region replicas and dist log replay does not work 
 together
 

 Key: HBASE-13121
 URL: https://issues.apache.org/jira/browse/HBASE-13121
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 1.1.0

 Attachments: hbase-13121_v1.patch


 We had not tested dist log replay while testing async wal replication for 
 region replicas. There seems to be a couple of issues, but fixable. 
 The distinction for dist log replay is that the region will be opened for 
 recovery and regular writes when a primary fails over. This causes the region 
 open event marker to be written to WAL, but at this time the region actually 
 does not contain all the flushed edits (since it is still recovering). If 
 secondary regions see this event and pick up all the files in the region 
 open event marker, then they can drop edits. 
 The solution is: 
  - Only write the region open event marker to WAL when the region is out of 
 recovering mode. 
  - Force a flush when coming out of recovering mode. This ensures that all 
 data is force-flushed in this case. Before the region open event marker is 
 written, we guarantee that all data in the region is flushed, so the list of 
 files in the event marker is complete.  
  - Edits coming from recovery are re-written to WAL while recovery is in 
 action. These edits will have a larger seqId than their original seqId. If 
 this is the case, we do not replicate these edits to the secondary replicas. 
 Since dist log replay recovers edits out of order (coming from parallel 
 replays of WAL file split tasks), this ensures that TIMELINE consistency is 
 respected and edits are not seen out of order in secondaries. These edits are 
 seen by secondaries via the forced flush event.





[jira] [Commented] (HBASE-12562) Handling memory pressure for secondary region replicas

2015-03-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348056#comment-14348056
 ] 

Jeffrey Zhong commented on HBASE-12562:
---

+1. Looks good to me, with some minor comments:
1)
{code}
+  if (store.getSnapshotSize() > 0) {
+    canDrop = false;
+  }
{code}
You can break the loop after setting canDrop to false.

2) Just to check: are the locks on writestate and the memstore always acquired 
in this order?

3) There may be no need for the following condition
{code}
+if (region.writestate.flushing
{code}

4) Renaming getBiggestMemstoreOfSecondaryRegion to 
getBiggestMemstoreOfRegionReplica may be better.

 Handling memory pressure for secondary region replicas
 --

 Key: HBASE-12562
 URL: https://issues.apache.org/jira/browse/HBASE-12562
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 1.1.0

 Attachments: hbase-12562_v1.patch


 This issue will track the implementation of how to handle the memory pressure 
 for secondary region replicas. Since the replicas cannot flush by themselves, 
 the region server might get blocked or cause extensive flushing for its 
 primary regions. The design doc attached at HBASE-11183 contains two possible 
 solutions that we can pursue. The first one is to not allow secondary region 
 replicas to flush by themselves, but instead, when needed, allow them to 
 refresh their store files on demand (which possibly allows them to drop their 
 memstore snapshots or memstores). The second approach is to allow the 
 secondaries to flush to a temporary space. 
 Both have pros and cons, but for simplicity and to not cause extra write 
 amplification, we have implemented the first approach. More details can be 
 found in the design doc, but we can also discuss other options here. 





[jira] [Commented] (HBASE-11571) Bulk load handling from secondary region replicas

2015-03-03 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345997#comment-14345997
 ] 

Jeffrey Zhong commented on HBASE-11571:
---

Thanks [~enis] for the reviews! I've integrated the patch into master & 
branch-1. 

 Bulk load handling from secondary region replicas
 -

 Key: HBASE-11571
 URL: https://issues.apache.org/jira/browse/HBASE-11571
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-11571-rebase.patch, HBASE-11571-v2.patch, 
 hbase-11571.patch


 We should be replaying the bulk load events from the primary region replica 
 in the secondary region replica so that the bulk loaded files will be made 
 visible in the secondaries. 
 This will depend on HBASE-11567 and HBASE-11568





[jira] [Updated] (HBASE-11571) Bulk load handling from secondary region replicas

2015-03-03 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11571:
--
   Resolution: Fixed
Fix Version/s: 1.1.0
   2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)



[jira] [Commented] (HBASE-13136) TestSplitLogManager.testGetPreviousRecoveryMode is flakey

2015-03-02 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344005#comment-14344005
 ] 

Jeffrey Zhong commented on HBASE-13136:
---

[~Apache9] There does exist a race condition. Since SplitLogManager has a 
chore (TimeoutMonitor) which creates rescan znodes, a newly created rescan 
znode causes the flakiness. 

Below are the suggested changes; alternatively, we could also just fix the test 
case to make sure there is no znode under splitLogZnode:
{code}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/coordination/ZKSplitLogManagerCoordination.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/coordination/ZKSplitLogManagerCoordination.java
index 694ccff..8ed4357 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/coordination/ZKSplitLogManagerCoordination.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/coordination/ZKSplitLogManagerCoordination.java
@@ -27,6 +27,7 @@ import static org.apache.hadoop.hbase.master.SplitLogManager.TerminationStatus.S
 
 import java.io.IOException;
 import java.io.InterruptedIOException;
+import java.util.ArrayList;
 import java.util.List;
 import java.util.Set;
 import java.util.concurrent.ConcurrentMap;
@@ -801,7 +802,16 @@ public class ZKSplitLogManagerCoordination extends ZooKeeperListener implements
   }
   if (previousRecoveryMode == RecoveryMode.UNKNOWN) {
     // Secondly check if there are outstanding split log task
-    List<String> tasks = ZKUtil.listChildrenNoWatch(watcher, watcher.splitLogZNode);
+    List<String> tmpTasks = ZKUtil.listChildrenNoWatch(watcher, watcher.splitLogZNode);
+    // Remove rescan nodes
+    List<String> tasks = new ArrayList<String>();
+    for (String tmpTask : tmpTasks) {
+      String znodePath = ZKUtil.joinZNode(watcher.splitLogZNode, tmpTask);
+      if (ZKSplitLog.isRescanNode(watcher, znodePath)) {
+        continue;
+      }
+      tasks.add(tmpTask);
+    }
     if (tasks != null && !tasks.isEmpty()) {
{code}


[jira] [Commented] (HBASE-13136) TestSplitLogManager.testGetPreviousRecoveryMode is flakey

2015-03-02 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-13136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344191#comment-14344191
 ] 

Jeffrey Zhong commented on HBASE-13136:
---

Looks good to me. (+1). 

 TestSplitLogManager.testGetPreviousRecoveryMode is flakey
 -

 Key: HBASE-13136
 URL: https://issues.apache.org/jira/browse/HBASE-13136
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0, 1.1.0
Reporter: zhangduo
Assignee: zhangduo
 Attachments: HBASE-13136.patch


 Add test code to run it 100 times; then we can make it fail always.
 {code:title=TestSplitLogManager.java}
   @Test
   public void test() throws Exception {
     for (int i = 0; i < 100; i++) {
       setup();
       testGetPreviousRecoveryMode();
       teardown();
     }
   }
 {code}
 And then add some ugly debug logs (yeah, I usually debug in this way...)
 {code:title=ZKSplitLogManagerCoordination.java}
   @Override
   public void setRecoveryMode(boolean isForInitialization) throws IOException {
     synchronized (this) {
       if (this.isDrainingDone) {
         // when there is no outstanding splitlogtask after master start up,
         // we already have up to date recovery mode
         return;
       }
     }
     if (this.watcher == null) {
       // when watcher is null (testing code) and recovery mode can only be LOG_SPLITTING
       synchronized (this) {
         this.isDrainingDone = true;
         this.recoveryMode = RecoveryMode.LOG_SPLITTING;
       }
       return;
     }
     boolean hasSplitLogTask = false;
     boolean hasRecoveringRegions = false;
     RecoveryMode previousRecoveryMode = RecoveryMode.UNKNOWN;
     RecoveryMode recoveryModeInConfig =
         (isDistributedLogReplay(conf)) ? RecoveryMode.LOG_REPLAY : RecoveryMode.LOG_SPLITTING;
     // Firstly check if there are outstanding recovering regions
     try {
       List<String> regions = ZKUtil.listChildrenNoWatch(watcher, watcher.recoveringRegionsZNode);
       LOG.debug("===" + regions);
       if (regions != null && !regions.isEmpty()) {
         hasRecoveringRegions = true;
         previousRecoveryMode = RecoveryMode.LOG_REPLAY;
       }
       if (previousRecoveryMode == RecoveryMode.UNKNOWN) {
         // Secondly check if there are outstanding split log task
         List<String> tasks = ZKUtil.listChildrenNoWatch(watcher, watcher.splitLogZNode);
         LOG.debug("===" + tasks);
         if (tasks != null && !tasks.isEmpty()) {
           hasSplitLogTask = true;
           if (isForInitialization) {
             // during initialization, try to get recovery mode from splitlogtask
             int listSize = tasks.size();
             for (int i = 0; i < listSize; i++) {
               String task = tasks.get(i);
               try {
                 byte[] data =
                     ZKUtil.getData(this.watcher, ZKUtil.joinZNode(watcher.splitLogZNode, task));
                 if (data == null) continue;
                 SplitLogTask slt = SplitLogTask.parseFrom(data);
                 previousRecoveryMode = slt.getMode();
                 if (previousRecoveryMode == RecoveryMode.UNKNOWN) {
                   // created by old code base where we don't set recovery mode in splitlogtask
                   // we can safely set to LOG_SPLITTING because we're in master initialization
                   // code before SSH is enabled & there are no outstanding recovering regions
                   previousRecoveryMode = RecoveryMode.LOG_SPLITTING;
                 }
                 break;
               } catch (DeserializationException e) {
                 LOG.warn("Failed parse data for znode " + task, e);
               } catch (InterruptedException e) {
                 throw new InterruptedIOException();
               }
             }
           }
         }
       }
     } catch (KeeperException e) {
       throw new IOException(e);
     }
     synchronized (this) {
       if (this.isDrainingDone) {
         return;
       }
       if (!hasSplitLogTask && !hasRecoveringRegions) {
         this.isDrainingDone = true;
         LOG.debug("set to " + recoveryModeInConfig);
         this.recoveryMode = recoveryModeInConfig;
         return;
       } else if (!isForInitialization) {
         // splitlogtask hasn't drained yet, keep existing recovery mode
         return;
       }
       if (previousRecoveryMode != RecoveryMode.UNKNOWN) {
         LOG.debug("set to " + previousRecoveryMode);
         this.isDrainingDone = (previousRecoveryMode == recoveryModeInConfig);
         this.recoveryMode = previousRecoveryMode;
       } else {
         LOG.debug("set to " + recoveryModeInConfig);
         this.recoveryMode = recoveryModeInConfig;
       }
     }
   }
 {code}
 When failing, I got this
 {noformat}
 2015-03-02 

[jira] [Updated] (HBASE-11571) Bulk load handling from secondary region replicas

2015-03-02 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11571:
--
Attachment: HBASE-11571-v2.patch

The v2 patch addresses [~enis]'s comments.



[jira] [Updated] (HBASE-11571) Bulk load handling from secondary region replicas

2015-02-27 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11571:
--
Attachment: HBASE-11571-rebase.patch

Rebase patch. [~enis] Please review it. Thanks.



[jira] [Updated] (HBASE-11571) Bulk load handling from secondary region replicas

2015-02-27 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11571:
--
Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-11580) Failover handling for secondary region replicas

2015-02-27 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341098#comment-14341098
 ] 

Jeffrey Zhong commented on HBASE-11580:
---

I've reviewed the patch and left some comments on the review board. +1 assuming 
unit tests pass.

As for the flush amplification, it's more of an optimization issue, which can 
be addressed by having the secondary replica send the first seqId it sees as 
part of the flush request. The primary region can then check whether its last 
flushed seqId is larger than the seqId passed from the replica, to decide 
whether to perform a flush.
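
A hedged sketch of that check (hypothetical helper, not the actual HBase API):

{code}
public class ReplicaFlushSketch {
  // The replica attaches the first seqId it observed to its flush request;
  // the primary flushes only if it hasn't already flushed past that point.
  static boolean primaryShouldFlush(long primaryLastFlushedSeqId, long replicaFirstSeenSeqId) {
    // Already flushed past what the replica saw: the replica can pick up the
    // existing flushed files, so a fresh flush would only amplify writes.
    return primaryLastFlushedSeqId < replicaFirstSeenSeqId;
  }
}
{code}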

 Failover handling for secondary region replicas
 ---

 Key: HBASE-11580
 URL: https://issues.apache.org/jira/browse/HBASE-11580
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar

 With the async wal approach (HBASE-11568), the edits are not persisted (to 
 wal) in the secondary region replicas. However this means that we have to 
 deal with secondary region replica failures. 
 We can seek to re-replicate the edits from the primary to the secondary when 
 the secondary region is opened in another server, but this would mean setting 
 up a replication queue again and holding on to the wals for longer. 
 Instead, we can design it so that the edits for the secondaries are not 
 persisted to wal, and if the secondary replica fails over, it will not start 
 serving reads until it has guaranteed that it has all the past data. 
 For guaranteeing that the secondary replica has all the edits before serving 
 reads, we can use flush and region opening markers. Whenever a region open 
 event is seen, it writes all the files at the time of opening to wal 
 (HBASE-11512). In case of flush, the flushed file is written as well, and the 
 secondary replica can do an ls for the store files and pick up all the files 
 before the seqId of the flushed file. So, in this design, the secondary 
 replica will wait until it sees and replays a flush or region open marker 
 from the primary's wal, and then start serving. To speed up replica opening 
 time, we can trigger a flush on the primary whenever the secondary replica 
 opens, as an optimization. 





[jira] [Updated] (HBASE-13077) BoundedCompletionService doesn't pass trace info to server

2015-02-24 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-13077:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I've integrated the fix into branch-1.0, branch-1 and master branch. Thanks 
[~enis] for the review and [~ndimiduk] for the help!

 BoundedCompletionService doesn't pass trace info to server
 --

 Key: HBASE-13077
 URL: https://issues.apache.org/jira/browse/HBASE-13077
 Project: HBase
  Issue Type: Bug
  Components: hbase
Affects Versions: 1.0.0, 2.0.0, 1.1.0
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 1.0.1, 1.1.0

 Attachments: HBASE-13077.patch


 Today [~ndimiduk] & I found that BoundedCompletionService doesn't pass htrace 
 info to the server. This issue causes scans to not pass trace info to the server.
 [~enis] FYI.





[jira] [Updated] (HBASE-13077) BoundedCompletionService doesn't pass trace info to server

2015-02-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-13077:
--
Attachment: HBASE-13077.patch

This patch is for 1.0. Thanks.



[jira] [Created] (HBASE-13077) BoundedCompletionService doesn't pass trace info to server

2015-02-19 Thread Jeffrey Zhong (JIRA)
Jeffrey Zhong created HBASE-13077:
-

 Summary: BoundedCompletionService doesn't pass trace info to server
 Key: HBASE-13077
 URL: https://issues.apache.org/jira/browse/HBASE-13077
 Project: HBase
  Issue Type: Bug
  Components: hbase
Affects Versions: 1.0.0, 2.0.0, 1.1.0
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong


Today [~ndimiduk] & I found that BoundedCompletionService doesn't pass htrace 
info to server.

[~enis] FYI.





[jira] [Updated] (HBASE-13077) BoundedCompletionService doesn't pass trace info to server

2015-02-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-13077:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-13077) BoundedCompletionService doesn't pass trace info to server

2015-02-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-13077:
--
Description: 
Today [~ndimiduk] & I found that BoundedCompletionService doesn't pass htrace 
info to the server. This issue causes scans to not pass trace info to the server.

[~enis] FYI.

  was:
Today [~ndimiduk] & I found that BoundedCompletionService doesn't pass htrace 
info to server.

[~enis] FYI.




[jira] [Commented] (HBASE-11569) Flush / Compaction handling from secondary region replicas

2015-02-10 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14315467#comment-14315467
 ] 

Jeffrey Zhong commented on HBASE-11569:
---

Looks good to me (+1). I posted a few minor comments on the review board. Thanks.

 Flush / Compaction handling from secondary region replicas
 --

 Key: HBASE-11569
 URL: https://issues.apache.org/jira/browse/HBASE-11569
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 1.1.0

 Attachments: hbase-11569-master-v3.patch


 We should be handling flushes and compactions from the primary region replica 
 being replayed to the secondary region replica via HBASE-11568. 
 Some initial thoughts for how can this be done is discussed in HBASE-11183. 
 More details will come together with the patch. 





[jira] [Updated] (HBASE-11567) Write bulk load COMMIT events to WAL

2015-02-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11567:
--
   Resolution: Fixed
Fix Version/s: 1.1.0
   2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks [~posix4e] for the contributions! I've integrated the v4-rebase patch 
into master and branch-1. 

 Write bulk load COMMIT events to WAL
 

 Key: HBASE-11567
 URL: https://issues.apache.org/jira/browse/HBASE-11567
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Alex Newman
 Fix For: 2.0.0, 1.1.0

 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, 
 HBASE-11567-v4-rebase.patch, hbase-11567-branch-1.0-partial.patch, 
 hbase-11567-v3.patch, hbase-11567-v4.patch


 Similar to writing flush (HBASE-11511), compaction (HBASE-2231), and region 
 open/close (HBASE-11512) events to the WAL, we should persist bulk load events 
 to the WAL. This is especially important for secondary region replicas, since 
 we can use this information to pick up the primary region's files from the 
 secondary replicas. A design doc for secondary replica replication can be 
 found at HBASE-11183.
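For readers following along, here is a rough sketch of the shape such a marker can take: a metadata-only WALEdit under METAFAMILY, mirroring the existing flush/compaction markers. The BULK_LOAD qualifier and the descriptor bytes below are illustrative assumptions, not the committed patch:

{code}
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadMarkerSketch {
  private static final byte[] BULK_LOAD = Bytes.toBytes("HBASE::BULK_LOAD");

  // METAFAMILY is not a real column family; replay and replica code
  // recognize it and treat the cell as an event marker, not data.
  public static WALEdit createBulkLoadMarker(byte[] regionStartKey, byte[] descriptorBytes) {
    WALEdit edit = new WALEdit();
    edit.add(new KeyValue(regionStartKey, WALEdit.METAFAMILY, BULK_LOAD,
        System.currentTimeMillis(), descriptorBytes));
    return edit;
  }
}
{code}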



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11567) Write bulk load COMMIT events to WAL

2015-02-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11567:
--
Attachment: HBASE-11567-v4-rebase.patch

 Write bulk load COMMIT events to WAL
 

 Key: HBASE-11567
 URL: https://issues.apache.org/jira/browse/HBASE-11567
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Alex Newman
 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, 
 HBASE-11567-v4-rebase.patch, hbase-11567-v3.patch, hbase-11567-v4.patch


 Similar to writing flush (HBASE-11511), compaction (HBASE-2231), and region 
 open/close (HBASE-11512) events to the WAL, we should persist bulk load events 
 to the WAL. This is especially important for secondary region replicas, since 
 we can use this information to pick up the primary region's files from the 
 secondary replicas. A design doc for secondary replica replication can be 
 found at HBASE-11183.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11567) Write bulk load COMMIT events to WAL

2015-02-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11567:
--
Attachment: (was: HBASE-11567-v4-rebase.patch)

 Write bulk load COMMIT events to WAL
 

 Key: HBASE-11567
 URL: https://issues.apache.org/jira/browse/HBASE-11567
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Alex Newman
 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, 
 hbase-11567-v3.patch, hbase-11567-v4.patch


 Similar to writing flush (HBASE-11511), compaction (HBASE-2231), and region 
 open/close (HBASE-11512) events to the WAL, we should persist bulk load events 
 to the WAL. This is especially important for secondary region replicas, since 
 we can use this information to pick up the primary region's files from the 
 secondary replicas. A design doc for secondary replica replication can be 
 found at HBASE-11183.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11567) Write bulk load COMMIT events to WAL

2015-02-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-11567:
--
Attachment: HBASE-11567-v4-rebase.patch

Rebased the v4 patch against master. The patch is long overdue. [~enis], could 
you please give it a quick review? I think the old v4 patch was ready to go, 
but somehow it was left over. Thanks.

 Write bulk load COMMIT events to WAL
 

 Key: HBASE-11567
 URL: https://issues.apache.org/jira/browse/HBASE-11567
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Alex Newman
 Attachments: HBASE-11567-v1.patch, HBASE-11567-v2.patch, 
 HBASE-11567-v4-rebase.patch, hbase-11567-v3.patch, hbase-11567-v4.patch


 Similar to writing flush (HBASE-11511), compaction (HBASE-2231), and region 
 open/close (HBASE-11512) events to the WAL, we should persist bulk load events 
 to the WAL. This is especially important for secondary region replicas, since 
 we can use this information to pick up the primary region's files from the 
 secondary replicas. A design doc for secondary replica replication can be 
 found at HBASE-11183.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12782) ITBLL fails for me if generator does anything but 5M per maptask

2015-01-30 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14299325#comment-14299325
 ] 

Jeffrey Zhong commented on HBASE-12782:
---

[~saint@gmail.com] Great findings! I previously reviewed the patch. The 
intention was good and it should do flush |= restoreEdit(store, cell); as 
[~lhofhansl] mentioned above, but apparently the fix did more than that. 
Thanks.
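To spell out the point about the accumulator form: a single cell that needs a flush must not be overwritten by a later cell that doesn't. A toy illustration with stand-in types, not the HRegion code itself:

{code}
import java.util.List;

public class FlushAccumulatorSketch {
  // Each element stands for restoreEdit(store, cell)'s answer for one cell.
  static boolean replayNeedsFlush(List<Boolean> perCellAnswers) {
    boolean flush = false;
    for (boolean needFlush : perCellAnswers) {
      flush |= needFlush;   // accumulate: any single 'true' must survive
      // flush = needFlush; // the bug shape: the last cell's answer wins
    }
    return flush;
  }
}
{code}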

 ITBLL fails for me if generator does anything but 5M per maptask
 

 Key: HBASE-12782
 URL: https://issues.apache.org/jira/browse/HBASE-12782
 Project: HBase
  Issue Type: Bug
  Components: integration tests
Affects Versions: 1.0.0, 0.98.9
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.11

 Attachments: 12782.fix.txt, 
 12782.search.plus.archive.recovered.edits.txt, 12782.search.plus.txt, 
 12782.search.txt, 12782.unit.test.and.it.test.txt, 
 12782.unit.test.writing.txt, 12782v2.0.98.txt, 12782v2.txt


 Anyone else seeing this?  If I do an ITBLL with generator doing 5M rows per 
 maptask, all is good -- verify passes. I've been running 5 servers and had 
 one slot per server.  So below works:
 HADOOP_CLASSPATH=/home/stack/conf_hbase:`/home/stack/hbase/bin/hbase 
 classpath` ./hadoop/bin/hadoop --config ~/conf_hadoop 
 org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList --monkey 
 serverKilling Generator 5 500 g1.tmp
 or if I double the map tasks, it works:
 HADOOP_CLASSPATH=/home/stack/conf_hbase:`/home/stack/hbase/bin/hbase 
 classpath` ./hadoop/bin/hadoop --config ~/conf_hadoop 
 org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList --monkey 
 serverKilling Generator 10 500 g2.tmp
 ...but if I change the 5M to 50M or 25M, Verify fails.
 Looking into it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12743) [ITBLL] Master fails rejoining cluster stuck splitting logs; Distributed log replay=true

2015-01-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280997#comment-14280997
 ] 

Jeffrey Zhong commented on HBASE-12743:
---

For the error org.apache.hadoop.hbase.NotServingRegionException: 
org.apache.hadoop.hbase.NotServingRegionException: Region 
hbase:namespace,,1417551886199.ecdcd0172cd3e32d291bc282771895da. is not 
online, the master won't start. But it should be unrelated to log recovery, 
whether splitting or replay. 

[~saint@gmail.com] could you share more master logs so that I can check why 
hbase:namespace wasn't online and assigned for two hours? Thanks.

 [ITBLL] Master fails rejoining cluster stuck splitting logs; Distributed log 
 replay=true
 

 Key: HBASE-12743
 URL: https://issues.apache.org/jira/browse/HBASE-12743
 Project: HBase
  Issue Type: Bug
Reporter: stack
 Fix For: 1.0.0, 2.0.0, 1.1.0


 Master is stuck for two days trying to rejoin the cluster after the monkey 
 killed and restarted it.
 After retrying to get namespace 350 times, Master goes down:
 {code}
 2014-12-20 18:43:54,285 INFO  [c2020:16020.activeMasterManager] 
 client.RpcRetryingCaller: Call exception, tries=349, retries=350, 
 started=6885331 ms ago, cancelled=false, msg=row 'default' on table 
 'hbase:namespace' at 
 region=hbase:namespace,,1417551886199.ecdcd0172cd3e32d291bc282771895da., 
 hostname=c2023.halxg.cloudera.com,16020,1418988286696, seqNum=600190
 2014-12-20 18:43:54,303 WARN  [c2020:16020.activeMasterManager] 
 master.TableNamespaceManager: Caught exception in initializing namespace 
 table manager
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=350, exceptions:
 Sat Dec 20 16:49:08 PST 2014, 
 RpcRetryingCaller{globalStartTime=1419122948954, pause=100, retries=350}, 
 org.apache.hadoop.hbase.NotServingRegionException: 
 org.apache.hadoop.hbase.NotServingRegionException: Region 
 hbase:namespace,,1417551886199.ecdcd0172cd3e32d291bc282771895da. is not 
 online on c2023.halxg.cloudera.com,16020,1418988286696
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2722)
 at 
 org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:851)
 at 
 org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1695)
 at 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:30434)
 {code}
 Seems like 2014-12-20 16:49:03,665 INFO  [RS_LOG_REPLAY_OPS-c2021:16020-0] 
 wal.WALSplitter: DistributedLogReplay = true
 Seems easy enough to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)

2014-12-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12746:
--
Attachment: 12746-v2.patch

[~saint@gmail.com] I amended your patch to address the three test failures, 
for your reference. Thanks.

 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
 --

 Key: HBASE-12746
 URL: https://issues.apache.org/jira/browse/HBASE-12746
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 1.0.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 1.0.0

 Attachments: 12746-v2.patch, 12746.txt, 12746.txt


 Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping 
 into HBASE-12743). I thought it was my environment, but apparently not.
 If I add this to HMaster:
 {code}
 diff --git 
 a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 
 b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 index a85c2e7..d745f94 100644
 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements 
 MasterServices, Server {
        throw new IOException("Failed to start redirecting jetty server", e);
      }
      masterInfoPort = connector.getPort();
 +    boolean dlr =
 +        conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY,
 +            HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG);
 +    LOG.info("Distributed log replay=" + dlr);
    }
 {code}
 It says DLR is on.  HBASE-12577 was not enough it seems.  The 
 hbase-default.xml still has DLR as true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)

2014-12-26 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14259242#comment-14259242
 ] 

Jeffrey Zhong commented on HBASE-12746:
---

The only remaining change is the following:

{code}
+ds = new DummyServer(zkw, testConf);
{code}
This is because we want to use testConf, which has DISTRIBUTED_LOG_REPLAY_KEY 
turned on, since at the beginning of the test case we have 
testConf.setBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY, true);

 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
 --

 Key: HBASE-12746
 URL: https://issues.apache.org/jira/browse/HBASE-12746
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 1.0.0
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 1.0.0

 Attachments: 12746-v2.patch, 12746.txt, 12746.txt


 Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping 
 into HBASE-12743). I thought it was my environment, but apparently not.
 If I add this to HMaster:
 {code}
 diff --git 
 a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 
 b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 index a85c2e7..d745f94 100644
 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
 @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements 
 MasterServices, Server {
        throw new IOException("Failed to start redirecting jetty server", e);
      }
      masterInfoPort = connector.getPort();
 +    boolean dlr =
 +        conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY,
 +            HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG);
 +    LOG.info("Distributed log replay=" + dlr);
    }
 {code}
 It says DLR is on.  HBASE-12577 was not enough it seems.  The 
 hbase-default.xml still has DLR as true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248725#comment-14248725
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

Looks good to me (+1) for the master branch. Branch-1 should rely on [~enis]'s 
feedback.

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
 HBASE-10201_19.patch, HBASE-10201_2.patch, HBASE-10201_3.patch, 
 HBASE-10201_4.patch, HBASE-10201_5.patch, HBASE-10201_6.patch, 
 HBASE-10201_7.patch, HBASE-10201_8.patch, HBASE-10201_9.patch, 
 compactions.png, count.png, io.png, memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243274#comment-14243274
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

{quote}
Now I always generate a new flushSeqId and use this as the seqId of flushed 
StoreFiles. And use a maxFlushedSeqId to record completeSequenceId that passed 
to HMaster. Is it OK?
{quote}
Sounds good to me. 

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
 HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
 HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
 HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
 memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243528#comment-14243528
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

[~saint@gmail.com] 
{quote}
Are you referring to the following: Will this mean we drop edits because 
region thinks its sequenceid is higher than it should be?
{quote}
Yes, as of today, when replaying edits in both modes we drop WAL edits whose 
seqId is less than the corresponding store's seqId. There are some edge cases 
(like a new PUT, the region moves to a different RS, a DELETE on the new PUT, 
a major compaction, the region moves back to the original RS, and then that RS 
crashes) where we have to know the hFile seqId accurately; otherwise the PUT 
may be resurrected after recovery. 

We need to pass flushed seqIds per store to the master so that we can optimize 
the recovery process without impacting correctness. 
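For reference, the skip rule itself is simple; the hard part, as the edge case above shows, is knowing the per-store flushed seqId accurately. A toy sketch of the rule (not HBase code):

{code}
public class ReplaySkipSketch {
  // An edit is replayed only if its seqId is beyond what the store has
  // already flushed; otherwise it is already durable in an hfile.
  static boolean shouldReplay(long editSeqId, long storeFlushedSeqId) {
    return editSeqId > storeFlushedSeqId;
  }
}
{code}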

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
 HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
 HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
 HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
 memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12465) HBase master start fails due to incorrect file creations

2014-12-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243538#comment-14243538
 ] 

Jeffrey Zhong commented on HBASE-12465:
---

Ping [~saint@gmail.com] any thoughts on this? Thanks.

 HBase master start fails due to incorrect file creations
 

 Key: HBASE-12465
 URL: https://issues.apache.org/jira/browse/HBASE-12465
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
 Environment: Ubuntu
Reporter: Biju Nair
Assignee: Alicia Ying Shu
  Labels: hbase, hbase-bulkload

 - Start of HBase master fails due to the following error found in the log.
 2014-11-11 20:25:58,860 WARN org.apache.hadoop.hbase.backup.HFileArchiver: 
 Failed to archive class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileablePa
 th,file:hdfs:///hbase/.tmp/data/default/tbl/00820520f5cb7839395e83f40c8d97c2/e/52bf9eee7a27460c8d9e2a26fa43c918_SeqId_282271246_
  on try #1
 org.apache.hadoop.security.AccessControlException: Permission denied: 
 user=hbase,access=WRITE,inode=/hbase/.tmp/data/default/tbl/00820520f5cb7839395e83f40c8d97c2/e/52bf9eee7a27460c8d9e2a26fa43c918_SeqId_282271246_:devuser:supergroup:-rwxr-xr-x
 - All the files that the HBase master was complaining about were created 
 under a user's user-id instead of the hbase user, resulting in incorrect 
 access permissions for the master to act on.
 - Looks like this was due to a bulk load done using the LoadIncrementalHFiles 
 program.
 - HBASE-12052 is another scenario similar to this one. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14243631#comment-14243631
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

[~saint@gmail.com] Besides what [~Apache9] mentioned, we skip edits using the 
seqId of each corresponding store; the #4 (which is > #3) is only set after 
the region is fully recovered (i.e. all WAL edits have already been replayed).

{quote}
 If master crash and loss the information, then we will not skip any edits?
{quote}
Yes, we'll lose the info and will replay more edits. 

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
 HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
 HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
 HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
 memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-10 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14241806#comment-14241806
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

{quote}
because we are not doing DLR in 0.98 or for some other reason? This patch is 
unlikely to make it back to 0.98 I'd say.
{quote}
It's because we defer mvcc value cleanup (by HBASE-11315), but in any case we 
should maintain the semantics that an HStore file's seqId is the largest 
flushed seqId for that file.

{quote}
And do I need to change original log split policy to also use a 
familyName-seqId map to filter out cells that already flushed? 
{quote}
Yes, we should, but you could do that in a separate issue.


 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_18.patch, 
 HBASE-10201_2.patch, HBASE-10201_3.patch, HBASE-10201_4.patch, 
 HBASE-10201_5.patch, HBASE-10201_6.patch, HBASE-10201_7.patch, 
 HBASE-10201_8.patch, HBASE-10201_9.patch, compactions.png, count.png, io.png, 
 memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-10 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for all the reviews and comments! I've integrated the fix into the 
branch-1 and master branches.

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485-v2.patch, HBASE-12485-v2.patch, 
 HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239958#comment-14239958
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

This is a nice feature. I scanned through the patch, and below are my comments:

1) There may be a correctness issue for same-version (same row key and 
version) updates. Because you use the following code as the store file flush 
id, we could end up with multiple hstore files having exactly the same flush 
seq id, while HBase resolves same-version updates by the store files' seqid 
(flush id). Therefore, we may end up with incorrect results. This issue may 
only happen in 0.98 though.
{code}
+  long oldestUnflushedSeqId = wal
+  .getEarliestMemstoreSeqNum(encodedRegionName);
{code} 
In order to fix the issue, we should use the current store's max flushed seq 
id as its real hstore seq id, while changing HRegion.lastFlushSeqId to use 
oldestUnflushedSeqId when reporting back to the Master; otherwise we may have 
a data loss issue.
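To make the hazard in 1) concrete: HBase breaks ties between otherwise identical cells by the seqId of the containing store file, so two files flushed with the same seqId leave the tie unresolvable. A toy sketch of that tie-break, not the actual comparator:

{code}
public class SameSeqIdHazardSketch {
  // Cells identical in row/family/qualifier/timestamp are ordered by the
  // seqId of the file that holds them; 0 here means the winner is arbitrary.
  static int newerFile(long seqIdFileA, long seqIdFileB) {
    return Long.compare(seqIdFileA, seqIdFileB);
  }
}
{code}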

2) We have a feature where we force a flush via 
hbase.regionserver.optionalcacheflushinterval or 
hbase.regionserver.flush.per.changes, but I didn't see you handle either case 
in the selectStoresToFlush() function. This may cause HRegion.shouldFlush() to 
always return true, and we could end up with small hstore files.

3) For region server recovery, we have an optimization that uses the 
lastFlushSeqId reported by region servers to skip writing edits into 
recovered.edits files. With this feature, we may unnecessarily write much more 
data into recovered.edits. This issue doesn't happen in the log replay case.

4) Regarding your FlushMarker question: FlushMarker (and the similar 
RegionEventWALEdit) is used for the region replica feature and for reasoning 
about region/store state. As you can see in the WALEdit class, those special 
events use the special column family METAFAMILY, which doesn't exist for data 
regions. You should handle those events specially in getFamilyNames(), 
otherwise they may affect your bookkeeping of the oldest un-flushed seqid.  
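A minimal sketch of that special handling, assuming a getFamilyNames helper of roughly this shape (an assumption, not the committed patch):

{code}
import java.util.Set;
import java.util.TreeSet;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class FamilyNamesSketch {
  // Collect the real families touched by a WALEdit, skipping METAFAMILY
  // marker cells so they don't pollute per-family oldest-unflushed seqids.
  static Set<byte[]> getFamilyNames(WALEdit edit) {
    Set<byte[]> families = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
    for (Cell cell : edit.getCells()) {
      if (CellUtil.matchingFamily(cell, WALEdit.METAFAMILY)) {
        continue; // flush/region-event markers live in a pseudo-family
      }
      families.add(CellUtil.cloneFamily(cell));
    }
    return families;
  }
}
{code}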


 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
Priority: Critical
 Fix For: 1.0.0, 2.0.0, 0.98.9

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
 HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
 HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
 HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-09 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240461#comment-14240461
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

{quote}
(and the format of zk data in distributed log replay)
{quote}
You don't have to change this, because log replay already gets the max seqId 
per store before sending edits for replay.

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
 Fix For: 1.0.0, 2.0.0

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
 HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
 HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
 HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12465) HBase master start fails due to incorrect file creations

2014-12-09 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14240556#comment-14240556
 ] 

Jeffrey Zhong commented on HBASE-12465:
---

This issue might be that a user used the hbase tmp folder as the Import tool's 
temporary output folder, while HBase tries to recreate (delete and then 
create) the tmp folder during startup. Therefore, HMaster can't start.

[~saint@gmail.com] Do you think any error from checkTempDir inside 
HMaster#createInitialFileSystemLayout is fatal? If it's fatal then we don't 
need to do anything for this JIRA; otherwise we catch the error, log it, and 
move on.
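The non-fatal option would look roughly like this (the checkTempDir shape is assumed; only the catch-and-continue pattern is the point):

{code}
import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class TempDirCleanupSketch {
  private static final Log LOG = LogFactory.getLog(TempDirCleanupSketch.class);

  interface TempDirCleaner {
    void checkTempDir() throws IOException;
  }

  // Log the cleanup failure and keep going instead of failing master startup.
  static void cleanTempDirNonFatal(TempDirCleaner cleaner) {
    try {
      cleaner.checkTempDir();
    } catch (IOException e) {
      LOG.warn("Failed cleaning hbase tmp dir; continuing startup", e);
    }
  }
}
{code}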

 HBase master start fails due to incorrect file creations
 

 Key: HBASE-12465
 URL: https://issues.apache.org/jira/browse/HBASE-12465
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.0
 Environment: Ubuntu
Reporter: Biju Nair
Assignee: Alicia Ying Shu
  Labels: hbase, hbase-bulkload

 - Start of HBase master fails due to the following error found in the log.
 2014-11-11 20:25:58,860 WARN org.apache.hadoop.hbase.backup.HFileArchiver: 
 Failed to archive class 
 org.apache.hadoop.hbase.backup.HFileArchiver$FileablePa
 th,file:hdfs:///hbase/.tmp/data/default/tbl/00820520f5cb7839395e83f40c8d97c2/e/52bf9eee7a27460c8d9e2a26fa43c918_SeqId_282271246_
  on try #1
 org.apache.hadoop.security.AccessControlException: Permission denied: 
 user=hbase,access=WRITE,inode=/hbase/.tmp/data/default/tbl/00820520f5cb7839395e83f40c8d97c2/e/52bf9eee7a27460c8d9e2a26fa43c918_SeqId_282271246_:devuser:supergroup:-rwxr-xr-x
 - All the files that the HBase master was complaining about were created 
 under a user's user-id instead of the hbase user, resulting in incorrect 
 access permissions for the master to act on.
 - Looks like this was due to a bulk load done using the LoadIncrementalHFiles 
 program.
 - HBASE-12052 is another scenario similar to this one. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-12-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14239008#comment-14239008
 ] 

Jeffrey Zhong commented on HBASE-10201:
---

[~saint@gmail.com] Sure. Let me take a look at this patch! 

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
Priority: Critical
 Fix For: 1.0.0, 2.0.0, 0.98.9

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_10.patch, 
 HBASE-10201_11.patch, HBASE-10201_12.patch, HBASE-10201_13.patch, 
 HBASE-10201_13.patch, HBASE-10201_14.patch, HBASE-10201_15.patch, 
 HBASE-10201_16.patch, HBASE-10201_17.patch, HBASE-10201_2.patch, 
 HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch, 
 HBASE-10201_6.patch, HBASE-10201_7.patch, HBASE-10201_8.patch, 
 HBASE-10201_9.patch, compactions.png, count.png, io.png, memstore.png


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237468#comment-14237468
 ] 

Jeffrey Zhong commented on HBASE-12485:
---

Thanks [~saint@gmail.com] for the comments!

{quote}
should be just 'return isSequenceIdFile(p);'
{quote}
That's a good point. I'll change that part when committing the patch.

{quote}
That is because if old style, its stale... not pertinent to this recovery?
{quote}
Yes.

{quote}
that is the reasoning?
{quote}
Yes.


 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485-v2.patch, HBASE-12485-v2.patch, 
 HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: PHOENIX-1498-v2.patch

The v2 patch addresses [~saint@gmail.com]'s comments by using .seqid as the 
seqid file name suffix.

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485.patch, PHOENIX-1498-v2.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: HBASE-12485-v2.patch

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485-v2.patch, HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: (was: PHOENIX-1498-v2.patch)

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485-v2.patch, HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-05 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14236454#comment-14236454
 ] 

Jeffrey Zhong commented on HBASE-12485:
---


[~saint@gmail.com] if we use .seqid then hbck reports an ERROR: Found 
lingering reference file error. It's due to a bug in the code, in 
FSUtils#getTableStoreFilePathMap(), where we didn't skip the recovered.edits 
directory that sits at the same folder level as the column families. I fixed 
the issue in the attached patch as follows. 

{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
@@ -1508,6 +1508,9 @@ public abstract class FSUtils {
   FileStatus[] familyDirs = fs.listStatus(dd, familyFilter);
   for (FileStatus familyDir : familyDirs) {
 Path family = familyDir.getPath();
+if (family.getName().equals(HConstants.RECOVERED_EDITS_DIR)) {
+  continue;
+}
 // now in family, iterate over the StoreFiles and
{code}

However, if we use .seqid, the old hbck won't work. This will cause issues for 
rollback and during upgrade if a user runs the old hbck. Should we still keep 
_seqid? What do you suggest? Thanks. 



 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Summary: Maintain SeqId monotonically increasing  (was: Maintain SeqId 
monotonically increasing when Region Replica is on)

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: (was: HBASE-12485.patch)

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-04 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: HBASE-12485.patch

Resubmit for QA run

 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12485) Maintain SeqId monotonically increasing

2014-12-04 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234729#comment-14234729
 ] 

Jeffrey Zhong commented on HBASE-12485:
---

Thanks [~saint@gmail.com] for the review! 
{quote}
Throw exception rather than WARN.
{quote}
This is a good point. If we do this then the region won't be opened anymore 
without human intervention (which might also be hard, as it requires getting 
rid of certain edits from the recovered.edits files). 

{quote}
dot prefix like other special files 
{quote}
I tried this before but hbck gave some errors. Let me try it again to see if I 
can make hbck happy. 


 Maintain SeqId monotonically increasing
 ---

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 1.0.0, 2.0.0

 Attachments: HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing when Region Replica is on

2014-12-03 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Status: Patch Available  (was: Open)

 Maintain SeqId monotonically increasing when Region Replica is on
 -

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12485) Maintain SeqId monotonically increasing when Region Replica is on

2014-12-03 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12485:
--
Attachment: HBASE-12485.patch

Submitting the patch for a QA run. The patch basically uses a seqId file to 
store the region's latest seqId during region close and open. Thanks.
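A rough sketch of the idea (the file naming is assumed, not the committed patch): the last seqId is persisted as an empty marker file whose name encodes the id, and on open the region takes the max over any such markers:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeqIdFileSketch {
  // e.g. ".../recovered.edits/0000000000000600190.seqid"
  private static final Pattern SEQID_FILE = Pattern.compile("^(\\d+)\\.seqid$");

  static long parseSeqId(String fileName) {
    Matcher m = SEQID_FILE.matcher(fileName);
    return m.matches() ? Long.parseLong(m.group(1)) : -1L;
  }

  // On region open: the recovered seqId is the max over all marker files.
  static long maxSeqId(Iterable<String> fileNames) {
    long max = -1L;
    for (String name : fileNames) {
      max = Math.max(max, parseSeqId(name));
    }
    return max;
  }
}
{code}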

 Maintain SeqId monotonically increasing when Region Replica is on
 -

 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12485.patch


 We added FLUSH and REGION CLOSE events into the WAL; for each of those events 
 the region SeqId is bumped. 
 The issue comes from the region close operation: when opening a region, we 
 use the flushed SeqId from the store files, while after the store flush 
 during region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., 
 each of which bumps up the SeqId. Therefore, the region opening SeqId is 
 lower than it should be.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-12-01 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12600:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the reviews! I've integrated the fix into the branch-1 and master branches.

 Remove REPLAY tag dependency in Distributed Replay Mode
 ---

 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 2.0.0, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.99.2

 Attachments: HBASE-12600.patch


 After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
 SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
 handle out-of-order same-version updates. 
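To illustrate why the tag became unnecessary: once every cell carries its sequenceId, two updates with an identical row/family/qualifier/timestamp resolve deterministically, whatever order replay delivers them in. A toy sketch:

{code}
import org.apache.hadoop.hbase.Cell;

public class SameVersionResolveSketch {
  // The cell with the larger sequenceId wins; no REPLAY tag needed.
  static Cell newer(Cell a, Cell b) {
    return a.getSequenceId() >= b.getSequenceId() ? a : b;
  }
}
{code}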



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-11-29 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228953#comment-14228953
 ] 

Jeffrey Zhong commented on HBASE-12600:
---

Yes. Thanks [~saint@gmail.com] for the reviews! I checked, and the checkstyle 
errors seem unrelated to my patch. 

 Remove REPLAY tag dependency in Distributed Replay Mode
 ---

 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 2.0.0, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12600.patch


 After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
 SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
 handle out-of-order same-version updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-29 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14229003#comment-14229003
 ] 

Jeffrey Zhong commented on HBASE-12588:
---

I agree with [~Apache9]. batchMutate is all right; we just need to make sure 
that our own code checks the result of each update operation after a 
batchMutate call. Thanks.
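A minimal sketch of that caller-side check, assuming branch-1's batchMutate returning OperationStatus[] (a sketch, not a committed change):

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.HConstants.OperationStatusCode;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.regionserver.OperationStatus;

public class BatchMutateCheckSketch {
  // batchMutate reports per-operation failures (e.g. a row lock that could
  // not be acquired) through its return value, not through an exception.
  static void mutateAllOrThrow(HRegion region, Mutation[] mutations) throws IOException {
    OperationStatus[] statuses = region.batchMutate(mutations);
    for (OperationStatus status : statuses) {
      if (status.getOperationStatusCode() != OperationStatusCode.SUCCESS) {
        throw new IOException("Mutation failed: " + status.getExceptionMsg());
      }
    }
  }
}
{code}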

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
 + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there is meta corruption problem and checkRow fails 
 with error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. In all cases this is 
 dangerous, because row locks have to be acquired before update operations to 
 guarantee row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-28 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228652#comment-14228652
 ] 

Jeffrey Zhong commented on HBASE-12588:
---

[~saint@gmail.com] Yes, it's similar. In my case the batch update partially 
failed because it could not get a RowLock, due to meta data corruption. 

In addition, I searched the code and it seems we have an issue in 
HBaseFsck#rebuildMeta, where we don't check the return code.

{code}
meta.batchMutate(puts.toArray(new Put[puts.size()]));
{code}  

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
 + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there is meta corruption problem and checkRow fails 
 with error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. In all cases this is 
 dangerous, because row locks have to be acquired before update operations to 
 guarantee row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-11-28 Thread Jeffrey Zhong (JIRA)
Jeffrey Zhong created HBASE-12600:
-

 Summary: Remove REPLAY tag dependency in Distributed Replay Mode
 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.99.1, 2.0.0
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong


After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
handle out-of-order same-version updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-11-28 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12600:
--
Attachment: HBASE-12600.patch

Submit the patch for QA run.

 Remove REPLAY tag dependency in Distributed Replay Mode
 ---

 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 2.0.0, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12600.patch


 After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
 SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
 handle out-of-order same-version updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-11-28 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12600:
--
Status: Patch Available  (was: Open)

 Remove REPLAY tag dependency in Distributed Replay Mode
 ---

 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.99.1, 2.0.0
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12600.patch


 After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
 SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
 handle out-of-order same-version updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12600) Remove REPLAY tag dependency in Distributed Replay Mode

2014-11-28 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14228658#comment-14228658
 ] 

Jeffrey Zhong commented on HBASE-12600:
---

[~enis] I want to get this in branch-1. Please check it. Thanks.

 Remove REPLAY tag dependency in Distributed Replay Mode
 ---

 Key: HBASE-12600
 URL: https://issues.apache.org/jira/browse/HBASE-12600
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 2.0.0, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12600.patch


 After HBASE-11315 and HBASE-8763, each edit has a unique 'version', i.e. its 
 SequenceId (or old mvcc value). Therefore, we don't need the replay tag to 
 handle out-of-order same-version updates. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12533:
--
Attachment: HBASE-12533-v2.patch

I tested the patch in a secure env and verified the fix solves the issue. In 
the v2 patch, I amended the existing test case by adding a check to verify we 
don't leave staging folders behind.

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533-v2.patch, HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)
Jeffrey Zhong created HBASE-12588:
-

 Summary: Need to fail writes when row lock can't be acquired
 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.1, 0.98.8
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong


Currently we don't fail write operations when we can't acquire row locks, as 
shown below in HRegion#doMiniBatchMutation. 
{code}
...
RowLock rowLock = null;
try {
  rowLock = getRowLock(mutation.getRow(), shouldBlock);
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  assert !shouldBlock : "Should never fail to get lock when blocking";
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
...
{code}

We saw this issue when there was a meta corruption problem and checkRow failed 
with this error:
{noformat}
org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
of range for row lock on HRegion
{noformat}

Yet the current code still continues with the writes. This is dangerous 
because row locks have to be acquired before update operations to guarantee 
row update atomicity.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12577) Disable distributed log replay by default

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12577:
--
Attachment: HBASE-12567.patch

Submit for QA run. Thanks.

 Disable distributed log replay by default
 -

 Key: HBASE-12577
 URL: https://issues.apache.org/jira/browse/HBASE-12577
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 0.99.2

 Attachments: HBASE-12567.patch


 Distributed log replay is an awesome feature, but due to HBASE-11094, the 
 rolling upgrade story from 0.98 is hard to explain / enforce. 
 The fix for HBASE-11094 only went into 0.98.4, meaning rolling upgrades from 
 0.98.4- might lose data during the upgrade. 
 I feel that no matter how much documentation / warning we provide, we cannot 
 prevent users from doing rolling upgrades from 0.98.4- to 1.0. And we do not 
 want to inconvenience the user by requiring a two-step rolling upgrade. 
 Thus I think we should disable dist log replay for 1.0, and re-enable it 
 again for 1.1 (if rolling upgrade from 0.98 is not supported). 
 i.e. undo: HBASE-10888
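For reference, a hedged sketch of what disabling means operationally (the 
config key is hbase.master.distributed.log.replay; the snippet below is 
illustrative, not the patch itself):
{code}
// Turn distributed log replay off explicitly via configuration.
Configuration conf = HBaseConfiguration.create();
conf.setBoolean("hbase.master.distributed.log.replay", false);
{code}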



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12577) Disable distributed log replay by default

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12577:
--
Status: Patch Available  (was: Open)

 Disable distributed log replay by default
 -

 Key: HBASE-12577
 URL: https://issues.apache.org/jira/browse/HBASE-12577
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 0.99.2

 Attachments: HBASE-12567.patch


 Distributed log replay is an awesome feature, but due to HBASE-11094, the 
 rolling upgrade story from 0.98 is hard to explain / enforce. 
 The fix for HBASE-11094 only went into 0.98.4, meaning rolling upgrades from 
 0.98.4- might lose data during the upgrade. 
 I feel that no matter how much documentation / warning we provide, we cannot 
 prevent users from doing rolling upgrades from 0.98.4- to 1.0. And we do not 
 want to inconvenience the user by requiring a two-step rolling upgrade. 
 Thus I think we should disable dist log replay for 1.0, and re-enable it 
 again for 1.1 (if rolling upgrade from 0.98 is not supported). 
 i.e. undo: HBASE-10888



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12588:
--
Status: Patch Available  (was: Open)

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.1, 0.98.8
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12588:
--
Attachment: HBASE-12588.patch

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226993#comment-14226993
 ] 

Jeffrey Zhong commented on HBASE-12588:
---

Yes, you're right: updates do happen under row lock protection, so it might be 
all right then. In this case, the caller won't know unless it checks the 
return status of every mutation. 
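To make that concrete, a hedged client-side sketch (table and puts are 
hypothetical) of checking per-mutation status after a batch, since part of a 
batch can succeed while other rows are skipped:
{code}
// Hedged sketch: inspect per-mutation results after a batch call.
Object[] results = new Object[puts.size()];
try {
  table.batch(puts, results);
} catch (IOException | InterruptedException e) {
  // Some mutations may still have succeeded; fall through and inspect results.
}
for (int i = 0; i < results.length; i++) {
  if (results[i] == null || results[i] instanceof Throwable) {
    // This mutation did not go through; the caller must retry or surface it.
  }
}
{code}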


 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12053) SecurityBulkLoadEndPoint set 777 permission on input data files

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong resolved HBASE-12053.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks for the comments! I've integrated the fix into the 0.98, 0.99 & master 
branches.

 SecurityBulkLoadEndPoint set 777 permission on input data files 
 

 Key: HBASE-12053
 URL: https://issues.apache.org/jira/browse/HBASE-12053
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12053.patch


 We have this code in SecureBulkLoadEndpoint#secureBulkLoadHFiles:
 {code}
   LOG.trace("Setting permission for: " + p);
   fs.setPermission(p, PERM_ALL_ACCESS);
 {code}
 This defeats the point of using a staging folder for secure bulk load. 
 Currently we create a hidden staging folder which has ALL_ACCESS permission, 
 and we use doAs to move input files into the staging folder. Therefore, we 
 should not set 777 permission on the original input data files, but on the 
 files in the staging folder after the move. 
 This may compromise security settings, especially when there is an error & we 
 move the file back with 777 permission. 
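 To illustrate the intended behavior (a hedged sketch with hypothetical 
 variable names, not the patch itself): the 777 permission should land on the 
 staged copy, never on the original input file.
 {code}
 // Hedged sketch: move first, then widen permissions only on the staged copy.
 Path staged = new Path(stagingDir, input.getName()); // hypothetical paths
 fs.rename(input, staged);                  // moved into the hidden staging dir
 fs.setPermission(staged, PERM_ALL_ACCESS); // 777 on the staged file only
 {code}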



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12533:
--
   Resolution: Fixed
Fix Version/s: 0.99.2
   0.98.9
   2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks for the review/comments! I've integrated the fix into the 0.98, 0.99 & 
master branches.

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12533-v2.patch, HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227081#comment-14227081
 ] 

Jeffrey Zhong commented on HBASE-12588:
---

[~Apache9] It's a similar cause. This call gives the wrong impression that the 
whole batch is atomically committed, while the same function fails the whole 
batch for other errors.

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14227281#comment-14227281
 ] 

Jeffrey Zhong commented on HBASE-12588:
---

Adding a FAILURE status would be better. I'm thinking of closing the JIRA as 
by design, because it seems that's the behavior we want (i.e. let partial 
updates go through). 

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12588) Need to fail writes when row lock can't be acquired

2014-11-26 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12588:
--
Resolution: Not a Problem
Status: Resolved  (was: Patch Available)

It seems that's the expected behavior we want: allowing partial updates of a 
batch and relying on the client to handle the partial-update scenario.

 Need to fail writes when row lock can't be acquired
 ---

 Key: HBASE-12588
 URL: https://issues.apache.org/jira/browse/HBASE-12588
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.8, 0.99.1
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Attachments: HBASE-12588.patch


 Currently we don't fail write operations when we can't acquire row locks, as 
 shown below in HRegion#doMiniBatchMutation. 
 {code}
 ...
 RowLock rowLock = null;
 try {
   rowLock = getRowLock(mutation.getRow(), shouldBlock);
 } catch (IOException ioe) {
   LOG.warn("Failed getting lock in batch put, row="
       + Bytes.toStringBinary(mutation.getRow()), ioe);
 }
 if (rowLock == null) {
   // We failed to grab another lock
   assert !shouldBlock : "Should never fail to get lock when blocking";
   break; // stop acquiring more rows for this batch
 } else {
   acquiredRowLocks.add(rowLock);
 }
 ...
 {code}
 We saw this issue when there was a meta corruption problem and checkRow 
 failed with this error:
 {noformat}
 org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row out 
 of range for row lock on HRegion
 {noformat}
 Yet the current code still continues with the writes. This is dangerous 
 because row locks have to be acquired before update operations to guarantee 
 row update atomicity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12522) Backport WAL refactoring to branch-1

2014-11-21 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221464#comment-14221464
 ] 

Jeffrey Zhong commented on HBASE-12522:
---

{quote}
Or do you think Phoenix's secondary index needs to have WAL edits that span 
regions?
{quote}
WAL edits can't span regions, because our log SeqId is only guaranteed to 
increase monotonically per region. The local index doesn't span edits across 
regions. For transaction support, some higher-level support is needed, but not 
at the WAL level.
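A small illustration of that constraint (hedged; the keys are hypothetical):
{code}
// HLogKey sequence ids are only monotonic within a single region, so
// comparing them across regions says nothing about global ordering.
long seqA = keyForRegionA.getLogSeqNum(); // monotonic within region A only
long seqB = keyForRegionB.getLogSeqNum(); // monotonic within region B only
// seqA < seqB implies nothing about the relative order of the two edits.
{code}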


 Backport WAL refactoring to branch-1
 

 Key: HBASE-12522
 URL: https://issues.apache.org/jira/browse/HBASE-12522
 Project: HBase
  Issue Type: Task
  Components: wal
Reporter: Sean Busbey
Assignee: Sean Busbey
 Fix For: 0.99.2


 backport HBASE-10378 to branch-1.
 This will let us remove the Deprecated stuff in master, allow some baking 
 time within the 1.x line, and will give us the option of pulling back 
 follow-on performance improvements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-21 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221552#comment-14221552
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

[~dubislv] Is it possible for you to try the patch to see if the issue is 
addressed? The patch should apply to the 0.98 code base as well. Thanks.

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-20 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220079#comment-14220079
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

{quote}
Can you add a comment when we are calling the coprocessors to say that we are 
only calling the first region
{quote}
Yes, we use the first region to call prepareBulkLoad once. 

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11099) Two situations where we could open a region with smaller sequence number

2014-11-20 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220334#comment-14220334
 ] 

Jeffrey Zhong commented on HBASE-11099:
---

{quote}
Is this speculation or something from phoenix or so?
{quote}
Currently it's a possible scenario, based on reading the code.

{quote}
this a 0.98 issue too?
{quote}
Yes, that's a 0.98 issue too. [~apurtell] This is a low-risk fix, so it's 
better to get it into 0.98 as well. Thanks.
  

 Two situations where we could open a region with smaller sequence number
 

 Key: HBASE-11099
 URL: https://issues.apache.org/jira/browse/HBASE-11099
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.1
Reporter: Jeffrey Zhong
Assignee: Stephen Yuan Jiang
 Fix For: 2.0.0, 0.99.2

 Attachments: HBASE-11099.v1-2.0.patch


 Recently I happened to run into code where we could potentially open a region 
 with a smaller sequence number:
 1) Inside HRegion#internalFlushcache. This is because we changed the way WAL 
 sync works: we now use late binding (the sequence number is assigned right 
 before the WAL sync).
 The flushSeqId may be less than the sequence number of a change included in 
 the flush, which may cause the later region-opening code to use a 
 smaller-than-expected sequence number when we reopen the region.
 {code}
 flushSeqId = this.sequenceId.incrementAndGet();
 ...
 mvcc.waitForRead(w);
 {code}
 2) HRegion#replayRecoveredEdits, where we have the following code:
 {code}
 ...
   if (coprocessorHost != null) {
     status.setStatus("Running pre-WAL-restore hook in coprocessors");
     if (coprocessorHost.preWALRestore(this.getRegionInfo(), key, val)) {
       // if bypass this log entry, ignore it ...
       continue;
     }
   }
 ...
   currentEditSeqId = key.getLogSeqNum();
 {code}
 If the coprocessor skips some tail WALEdits, the function will return a 
 smaller currentEditSeqId. In the end, a region may also open with a smaller 
 sequence number. This may cause data loss, because the Master may record a 
 larger flushed sequence id and some WALEdits may be skipped during recovery 
 if the region fails again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories does not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218280#comment-14218280
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

[~dubislv] What kind of folders are left in the staging folder? Could you show 
some examples? I'm assuming the issue you saw is that a bulk load runs 
successfully but still leaves some folders in the staging folder afterwards. 

 staging directories does not deleted after secure bulk load
 ---

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis

 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14218920#comment-14218920
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

From the pasted folder names, those are the root staging folders used by bulk 
load. I ran a small test against the 0.98 code, and they seem to be cleared 
after a bulk load.

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis

 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong reassigned HBASE-12533:
-

Assignee: Jeffrey Zhong

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong

 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219087#comment-14219087
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

I think I found the root cause of the issue, which I think is a serious one. 
Below is the culprit:

{code}
public String prepareBulkLoad(final TableName tableName) throws IOException {
  try {
    return
      table.coprocessorService(SecureBulkLoadProtos.SecureBulkLoadService.class,
        EMPTY_START_ROW,
        LAST_ROW,
        ...
{code}

prepareBulkLoad is fired at all data regions, so it creates as many staging 
folders as there are regions in the bulk-loaded table, while we only use the 
first one.

That's why you can see so many staging folders left behind.

There are a couple of bugs in SecureBulkLoadEndpoint#cleanupBulkLoad: 1) it 
fires the same request to all data regions; 2) it first tries to create an 
already existing folder and then deletes it. Too many unnecessary NN 
operations.
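A hedged sketch of the fix direction (not necessarily the committed patch): 
pass the same key as both start and end row, so the endpoint is invoked only 
on the single region holding that row and exactly one staging folder is 
created:
{code}
// Hedged sketch: hit only the first region instead of every region.
Map<byte[], String> tokens = table.coprocessorService(
    SecureBulkLoadProtos.SecureBulkLoadService.class,
    HConstants.EMPTY_START_ROW,
    HConstants.EMPTY_START_ROW, // same key => only one region is called
    new Batch.Call<SecureBulkLoadProtos.SecureBulkLoadService, String>() {
      @Override
      public String call(SecureBulkLoadProtos.SecureBulkLoadService instance)
          throws IOException {
        ServerRpcController controller = new ServerRpcController();
        BlockingRpcCallback<SecureBulkLoadProtos.PrepareBulkLoadResponse> done =
            new BlockingRpcCallback<SecureBulkLoadProtos.PrepareBulkLoadResponse>();
        instance.prepareBulkLoad(controller,
            SecureBulkLoadProtos.PrepareBulkLoadRequest.newBuilder()
                .setTableName(ProtobufUtil.toProtoTableName(tableName))
                .build(),
            done);
        return done.get().getBulkToken();
      }
    });
{code}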


 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong

 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12533:
--
Attachment: HBASE-12533.patch

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12533:
--
Status: Patch Available  (was: Open)

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12533) staging directories are not deleted after secure bulk load

2014-11-19 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14219100#comment-14219100
 ] 

Jeffrey Zhong commented on HBASE-12533:
---

This issue also makes bulk load slow, because it fires unnecessary RPC 
requests to all data regions to create/delete staging folders. 

 staging directories are not deleted after secure bulk load
 --

 Key: HBASE-12533
 URL: https://issues.apache.org/jira/browse/HBASE-12533
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.98.6
 Environment: CDH5.2 + Kerberos
Reporter: Andrejs Dubovskis
Assignee: Jeffrey Zhong
 Attachments: HBASE-12533.patch


 We use secure bulk load heavily in our environment, and it had been working 
 without problems for some time. But last week I found that clients hang while 
 calling *doBulkLoad*.
 After some investigation I found that HDFS keeps more than 1,000,000 
 directories in the /tmp/hbase-staging directory.
 When the directory's content was purged, the load process ran successfully.
 According to the [hbase 
 book|http://hbase.apache.org/book/ch08s03.html#hbase.secure.bulkload]:
 {code}
 HBase manages creation and deletion of this directory.
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11099) Two situations where we could open a region with smaller sequence number

2014-11-17 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14215628#comment-14215628
 ] 

Jeffrey Zhong commented on HBASE-11099:
---

+1. Looks good to me as well.

 Two situations where we could open a region with smaller sequence number
 

 Key: HBASE-11099
 URL: https://issues.apache.org/jira/browse/HBASE-11099
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.99.1
Reporter: Jeffrey Zhong
Assignee: Stephen Yuan Jiang
 Fix For: 1.0.0, 2.0.0, 0.99.2

 Attachments: HBASE-11099.v1-2.0.patch


 Recently I happened to run into code where we could potentially open a region 
 with a smaller sequence number:
 1) Inside HRegion#internalFlushcache. This is because we changed the way WAL 
 sync works: we now use late binding (the sequence number is assigned right 
 before the WAL sync).
 The flushSeqId may be less than the sequence number of a change included in 
 the flush, which may cause the later region-opening code to use a 
 smaller-than-expected sequence number when we reopen the region.
 {code}
 flushSeqId = this.sequenceId.incrementAndGet();
 ...
 mvcc.waitForRead(w);
 {code}
 2) HRegion#replayRecoveredEdits, where we have the following code:
 {code}
 ...
   if (coprocessorHost != null) {
     status.setStatus("Running pre-WAL-restore hook in coprocessors");
     if (coprocessorHost.preWALRestore(this.getRegionInfo(), key, val)) {
       // if bypass this log entry, ignore it ...
       continue;
     }
   }
 ...
   currentEditSeqId = key.getLogSeqNum();
 {code}
 If the coprocessor skips some tail WALEdits, the function will return a 
 smaller currentEditSeqId. In the end, a region may also open with a smaller 
 sequence number. This may cause data loss, because the Master may record a 
 larger flushed sequence id and some WALEdits may be skipped during recovery 
 if the region fails again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12485) Maintain SeqId monotonically increasing when Region Replica is on

2014-11-14 Thread Jeffrey Zhong (JIRA)
Jeffrey Zhong created HBASE-12485:
-

 Summary: Maintain SeqId monotonically increasing when Region 
Replica is on
 Key: HBASE-12485
 URL: https://issues.apache.org/jira/browse/HBASE-12485
 Project: HBase
  Issue Type: Sub-task
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong


We added FLUSH and REGION_CLOSE events into the WAL; for each of those events 
the region SeqId is bumped.

The issue comes from the region close operation. When opening a region, we use 
the flushed SeqId from the store files, while after the store flush during 
region close we still write COMMIT_FLUSH, REGION_CLOSE events etc., each of 
which bumps the SeqId. Therefore, the region opening SeqId is lower than it 
should be.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207395#comment-14207395
 ] 

Jeffrey Zhong commented on HBASE-12319:
---

The v2 patch attached is for 0.98 only. The issue affects branch-1 & 0.98, so 
let me try to commit it to 0.98 & branch-1 if you don't mind.

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-11 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong resolved HBASE-12319.
---
   Resolution: Fixed
Fix Version/s: (was: 2.0.0)

I've integrated the fix into 0.98 & branch-1. Thanks.

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 0.99.2, 0.98.8

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12319:
--
Priority: Critical  (was: Major)

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203690#comment-14203690
 ] 

Jeffrey Zhong commented on HBASE-12319:
---

Since this issue may cause data loss or inconsistent data reads, I marked it 
as critical. The symptom of the issue is that a region open doesn't wait for 
the previous region close to complete, so the newly opened region may not open 
all store files if the previous region close flushed more data to disk.

The testOpenCloseRacing failure after the fix is a test issue. During the 
test, the region is opened twice; after the fix, the region is opened on 
another RS while the AM returns the first RS the region was previously 
assigned to. Before the fix, the test case didn't wait for the previous region 
open cancellation to complete, so it could see the second region assignment 
immediately. If you put a sleep after the final assertion in the test case, 
you will see the meta location updated again by the previously canceled region 
opening. Below is the log after I put a 60-second sleep after the final 
assert; you can see region ff976daf00708ecad200b113349fc4b4 in OPEN state 
still getting another OPENED, which came from the previous assignment.

{noformat}
2014-11-08 12:48:32,238 DEBUG [FifoRpcScheduler.handler1-thread-2] 
master.AssignmentManager(4077): Got transition OPENED for 
{ff976daf00708ecad200b113349fc4b4 state=PENDING_OPEN, ts=1415479712217, 
server=10.10.8.224,55613,1415479709023} from 10.10.8.224,55613,1415479709023
…
2014-11-08 12:48:32,936 DEBUG [FifoRpcScheduler.handler1-thread-4] 
master.AssignmentManager(4077): Got transition OPENED for 
{ff976daf00708ecad200b113349fc4b4 state=OPEN, ts=1415479712238, 
server=10.10.8.224,55613,1415479709023} from 10.10.8.224,55609,1415479708922
{noformat}

The v2 patch amends the test case and makes sure that the region-opening 
cleanupFailedOpen waits for the region close before returning 
NotServingRegionException. Thanks.
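A hedged sketch of the idea (simplified; the actual patch wires this through 
the open handler): the failed-open cleanup path should block on the region 
close, so the close's final flush is on disk before another server can open 
the region.
{code}
// Hedged sketch: wait for the close (and its flushes) before giving up the region.
private void cleanupFailedOpen(final HRegion region) throws IOException {
  if (region != null) {
    region.close(); // blocks until the region close completes
  }
}
{code}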


 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12319:
--
Attachment: HBASE-12319-v2.patch

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-12319:
--
Status: Patch Available  (was: Reopened)

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.1, 0.98.7
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12053) SecurityBulkLoadEndPoint set 777 permission on input data files

2014-11-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203693#comment-14203693
 ] 

Jeffrey Zhong commented on HBASE-12053:
---

I've tested the patch in a secure env and it worked. If there's no objection, 
I'll commit it later next week. Thanks.

 SecurityBulkLoadEndPoint set 777 permission on input data files 
 

 Key: HBASE-12053
 URL: https://issues.apache.org/jira/browse/HBASE-12053
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12053.patch


 We have code in SecureBulkLoadEndpoint#secureBulkLoadHFiles
 {code}
   LOG.trace(Setting permission for:  + p);
   fs.setPermission(p, PERM_ALL_ACCESS);
 {code}
 This defeats the point of using a staging folder for secure bulk load. 
 Currently we create a hidden staging folder which has ALL_ACCESS permission, 
 and we use doAs to move input files into the staging folder. Therefore, we 
 should not set 777 permission on the original input data files, but on the 
 files in the staging folder after the move. 
 This may compromise security settings, especially when there is an error and 
 we move the file back with 777 permission. 
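
 For illustration, a minimal sketch of the intended ordering; the helper name 
 stageForBulkLoad and the stagingDir parameter are hypothetical, not the 
 actual SecureBulkLoadEndpoint code:

 {code}
// Hypothetical sketch: chmod the staged copy only, and only after the move.
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class StagingPermissionSketch {
  private static final FsPermission PERM_ALL_ACCESS = new FsPermission((short) 0777);

  // Move the input file into the hidden staging folder first (done under
  // doAs(user) in the real endpoint), then open up permissions on the staged
  // copy so the regionserver user can read it. The caller's original file is
  // never chmod'ed, so an error path that moves the file back cannot leave a
  // world-writable file behind.
  static Path stageForBulkLoad(FileSystem fs, Path input, Path stagingDir)
      throws IOException {
    Path staged = new Path(stagingDir, input.getName());
    if (!fs.rename(input, staged)) {
      throw new IOException("Failed to move " + input + " into staging");
    }
    fs.setPermission(staged, PERM_ALL_ACCESS);
    return staged;
  }
}
 {code}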



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203696#comment-14203696
 ] 

Jeffrey Zhong commented on HBASE-12319:
---

The v2 patch passed all tests against the 0.98 branch.

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203722#comment-14203722
 ] 

Jeffrey Zhong commented on HBASE-12319:
---

{quote}
I'm not inclined to do it over except for a blocker so this may wait for the 
next release.
{quote}
That's fine. This is an existing issue and there is no reason to hold off the 
current release. It's also better to bake the fix for a little while before 
releasing it. Once all is good, I can hold off and check the fix in after 
0.98.8 is out. Thanks.

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319-v2.patch, HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above logs are from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started the recovery but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files as per the logs above) since 
 the first regionserver (RS1) deleted some files after it completed the 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would have inconsistencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

