[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-07-26 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558661#comment-16558661
 ] 

Zach York commented on HBASE-20952:
---

I definitely have some thoughts on this. I'll try to summarize and put it here, 
but in general making the interface as basic as possible would be the easiest 
to work with IMO.

> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-23 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553326#comment-16553326
 ] 

Zach York commented on HBASE-20856:
---

[~stack], [~busbey], or [~elserj] have any further comments? If not, I'll 
commit it tomorrow.

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Affects Versions: 3.0.0
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch, 
> HBASE-20856.master.002.patch, HBASE-20856.master.003.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20558:
--
   Resolution: Fixed
Fix Version/s: 1.4.6
   1.5.0
   Status: Resolved  (was: Patch Available)

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
> Attachments: HBASE-20558.branch-1.001.patch, report.html
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551421#comment-16551421
 ] 

Zach York commented on HBASE-20558:
---

Ah! Perfect, thanks!

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch, report.html
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551385#comment-16551385
 ] 

Zach York commented on HBASE-20558:
---

Note, this was done on branch-1. I was planning on doing the same for branch-1.4

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch, report.html
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20558:
--
Attachment: report.html

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch, report.html
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551383#comment-16551383
 ] 

Zach York commented on HBASE-20558:
---

[~apurtell] I ran the compat-checker and it generated errors, most of them are 
renamed methods from the previous patches, but this also added a constructor 
which seems okay to add within a version. How do you judge whether something is 
okay? I'll attach the report.html

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch, report.html
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20555) Backport HBASE-18083 and related changes in branch-1

2018-07-20 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551268#comment-16551268
 ] 

Zach York commented on HBASE-20555:
---

[~apurtell] FYI, I'd like to get the remaining two backports in for 1.4.6 if 
possible since the first 2 are there. I plan to push #3 soon and will review #4 
soon after that. We should be able to get this in today or Monday at the latest.

> Backport HBASE-18083 and related changes in branch-1
> 
>
> Key: HBASE-20555
> URL: https://issues.apache.org/jira/browse/HBASE-20555
> Project: HBase
>  Issue Type: Umbrella
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
>
> This will be the umbrella JIRA for backporting HBASE-18083 of `Make 
> large/small file clean thread number configurable in HFileCleaner` from 
> HBase's branch-2 to HBase's branch-1 that will needs a total of 4 sub-tasks 
> that backport HBASE-16490, HBASE-17215, HBASE-17854 and then HBASE-18083
> The goal is to improve HFile cleaning performance that has been introduced in 
> branch-2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20558) Backport HBASE-17854 to branch-1

2018-07-20 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551260#comment-16551260
 ] 

Zach York commented on HBASE-20558:
---

+1 I will commit in an hour if no objections.

> Backport HBASE-17854 to branch-1
> 
>
> Key: HBASE-20558
> URL: https://issues.apache.org/jira/browse/HBASE-20558
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20558.branch-1.001.patch
>
>
> As part of HBASE-20555, HBASE-17854 is the third patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20856) PITA having to set WAL provider in two places

2018-07-19 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549684#comment-16549684
 ] 

Zach York commented on HBASE-20856:
---

+1 Although that wrapped provider stuff is kinda ugly (not your fault though :) 
).

 

Can you reattach patch to try and get a clean testing run? The only failure was 
a timed out test.

> PITA having to set WAL provider in two places
> -
>
> Key: HBASE-20856
> URL: https://issues.apache.org/jira/browse/HBASE-20856
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability, wal
>Affects Versions: 3.0.0
>Reporter: stack
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 2.0.2, 2.2.0, 2.1.1
>
> Attachments: HBASE-20856.master.001.patch, 
> HBASE-20856.master.002.patch
>
>
> Courtesy of [~elserj], I learn that changing WAL we need to set two places... 
> both hbase.wal.meta_provider and hbase.wal.provider. Operator should only 
> have to set it in one place; hbase.wal.meta_provider should pick up general 
> setting unless hbase.wal.meta_provider is explicitly set.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16542259#comment-16542259
 ] 

Zach York commented on HBASE-20734:
---

Thanks for reviewing [~yuzhih...@gmail.com]. I am trying to rebase on master, 
but there are a ton of conflicts. I'll hopefully get a new patch up for that 
early next week as I will likely have to redo a lot of the changes on top of 
the master branch. I'll also toss it in review board.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-11 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540796#comment-16540796
 ] 

Zach York commented on HBASE-20734:
---

Yep, I'll work on getting a patch for master branch. It was just easier for me 
to test on cluster with branch-1.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-07-11 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18840:
--
Attachment: HBASE-18840.HBASE-18477.007.patch

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch, HBASE-18840.HBASE-18477.005.patch, 
> HBASE-18840.HBASE-18477.006.patch, HBASE-18840.HBASE-18477.007.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-11 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20868:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

pushed to HBASE-18477

> Fix TestCheckTestClasses on HBASE-18477
> ---
>
> Key: HBASE-20868
> URL: https://issues.apache.org/jira/browse/HBASE-20868
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
> Attachments: HBASE-20868.HBASE-18477.001.patch, 
> HBASE-20868.HBASE-18477.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-11 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20734:
--
Status: Patch Available  (was: Open)

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-07-11 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20734:
--
Attachment: HBASE-20734.branch-1.001.patch

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-20734.branch-1.001.patch
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding

2018-07-10 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539329#comment-16539329
 ] 

Zach York commented on HBASE-20649:
---

Trying to get up to speed on this all. Overall looks like a handy upgrade tool!

 

[~busbey] Your steps are what we want to document as an operator?

 

It would be awesome if we could provide more info when running the specific 
tool (if it fails in root dir, suggest trying a major compaction if data 
encoding for the table is correct. If it fails in archive dir, see if any 
Snapshots reference these files).

Could we have a tool/script to help automate determining which snapshot is 
'dirty' and help to automatically clean it? It just seems like a lot of manual 
steps to get your cluster upgrade ready (imagine if you had a number of 
incremental snapshots).

> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
>  Issue Type: New Feature
>Reporter: Peter Somogyi
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20649.master.001.patch, 
> HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, 
> HBASE-20649.master.004.patch, HBASE-20649.master.005.patch
>
>
> HBASE-20592 adds a tool to check column families on the cluster do not have 
> PREFIX_TREE encoding.
> Since it is possible that DataBlockEncoding was already changed but HFiles 
> are not rewritten yet we would need a tool that can verify the content of 
> hfiles in the cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-10 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539278#comment-16539278
 ] 

Zach York commented on HBASE-20868:
---

[~yuzhih...@gmail.com] Can you take a look when you get a chance? It's a simple 
annotation fix.

> Fix TestCheckTestClasses on HBASE-18477
> ---
>
> Key: HBASE-20868
> URL: https://issues.apache.org/jira/browse/HBASE-20868
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
> Attachments: HBASE-20868.HBASE-18477.001.patch, 
> HBASE-20868.HBASE-18477.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-10 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20868:
--
Attachment: HBASE-20868.HBASE-18477.002.patch

> Fix TestCheckTestClasses on HBASE-18477
> ---
>
> Key: HBASE-20868
> URL: https://issues.apache.org/jira/browse/HBASE-20868
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
> Attachments: HBASE-20868.HBASE-18477.001.patch, 
> HBASE-20868.HBASE-18477.002.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-10 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20868:
--
Status: Patch Available  (was: Open)

> Fix TestCheckTestClasses on HBASE-18477
> ---
>
> Key: HBASE-20868
> URL: https://issues.apache.org/jira/browse/HBASE-20868
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
> Attachments: HBASE-20868.HBASE-18477.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-10 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20868:
--
Attachment: HBASE-20868.HBASE-18477.001.patch

> Fix TestCheckTestClasses on HBASE-18477
> ---
>
> Key: HBASE-20868
> URL: https://issues.apache.org/jira/browse/HBASE-20868
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
> Attachments: HBASE-20868.HBASE-18477.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477

2018-07-10 Thread Zach York (JIRA)
Zach York created HBASE-20868:
-

 Summary: Fix TestCheckTestClasses on HBASE-18477
 Key: HBASE-20868
 URL: https://issues.apache.org/jira/browse/HBASE-20868
 Project: HBase
  Issue Type: Sub-task
Affects Versions: HBASE-18477
Reporter: Zach York
Assignee: Zach York
 Fix For: HBASE-18477






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18840:
--
Attachment: HBASE-18840.HBASE-18477.006.patch

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch, HBASE-18840.HBASE-18477.005.patch, 
> HBASE-18840.HBASE-18477.006.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York resolved HBASE-20787.
---
Resolution: Fixed

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York reopened HBASE-20787:
---

Rebasing again to pull in fixes for unit tests.

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-09 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537664#comment-16537664
 ] 

Zach York commented on HBASE-20836:
---

Pushed, thanks [~yuzhih...@gmail.com]

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch, HBASE-20836.HBASE-18477.003.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20836:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch, HBASE-20836.HBASE-18477.003.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18840:
--
Attachment: HBASE-18840.HBASE-18477.005.patch

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch, HBASE-18840.HBASE-18477.005.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-09 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537540#comment-16537540
 ] 

Zach York commented on HBASE-20836:
---

It looks like that did the trick. Can you commit when you get the chance, 
[~yuzhih...@gmail.com]?

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch, HBASE-20836.HBASE-18477.003.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20836:
--
Attachment: HBASE-20836.HBASE-18477.003.patch

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch, HBASE-20836.HBASE-18477.003.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-07-09 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20557:
--
   Resolution: Fixed
Fix Version/s: 1.4.6
   1.5.0
   Status: Resolved  (was: Patch Available)

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-07-09 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537442#comment-16537442
 ] 

Zach York commented on HBASE-20557:
---

Pushed to branch-1 and branch-1.4. Checkcompatibility didn't unearth any issues 
(a few pre-existing it seems).

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20837) Make IDE configuration for import order match that in our checkstyle module

2018-07-09 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16537342#comment-16537342
 ] 

Zach York commented on HBASE-20837:
---

[~taklwu] as mentioned in the email thread, this is a best effort solution so 
let's add an .xml for IntelliJ to make it easier for IntelliJ users and then 
sync between the checkstyle and the formats.

> Make IDE configuration for import order match that in our checkstyle module
> ---
>
> Key: HBASE-20837
> URL: https://issues.apache.org/jira/browse/HBASE-20837
> Project: HBase
>  Issue Type: Improvement
>  Components: community
>Affects Versions: 3.0.0, 2.0.1, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.2.0
>
> Attachments: HBASE-20837.branch-1.001.patch, 
> HBASE-20837.branch-2.001.patch, HBASE-20837.master.001.patch, IDEA import 
> layout.png, hbase-intellij-formatter.xml
>
>
> While working on HBASE-20557 contribution, we figured out that the checkstyle 
> build target (ImportOrder's `groups` 
> [http://checkstyle.sourceforge.net/config_imports.html] ) was different from 
> the development supported IDE (e.g. IntelliJ and Eclipse) formatter, we would 
> provide a fix here to sync between 
> [dev-support/hbase_eclipse_formatter.xml|https://github.com/apache/hbase/blob/master/dev-support/hbase_eclipse_formatter.xml]
>  and 
> [hbase/checkstyle.xml|https://github.com/apache/hbase/blob/master/hbase-checkstyle/src/main/resources/hbase/checkstyle.xml]
> This might need to backport the changes of master to branch-1 and branch-2 as 
> well.
> Before this change, this is what checkstyle is expecting for import order
>  
> {code:java}
> import com.google.common.annotations.VisibleForTesting;
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Map;
> import org.apache.commons.logging.Log;
> import org.apache.commons.logging.LogFactory;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.classification.InterfaceAudience;
> import org.apache.hadoop.hbase.conf.ConfigurationObserver;{code}
>  
> And the proposed import order with the respect to HBASE-19262 and HBASE-19552 
> should be
>  
>    !IDEA import layout.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530569#comment-16530569
 ] 

Zach York commented on HBASE-20836:
---

[~yuzhih...@gmail.com] I'm not sure why the mvninstall and shadedjars is still 
failing... perhaps it's not applying the patch first because they are failing 
for the yetus interface audience error.

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530565#comment-16530565
 ] 

Zach York commented on HBASE-20836:
---

Added a new patch to add a private constructor.

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20836:
--
Attachment: HBASE-20836.HBASE-18477.002.patch

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch, 
> HBASE-20836.HBASE-18477.002.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530488#comment-16530488
 ] 

Zach York commented on HBASE-20836:
---

FYI [~te...@apache.org]

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20836:
--
Status: Patch Available  (was: Open)

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20836:
--
Attachment: HBASE-20836.HBASE-18477.001.patch

> Add Yetus annotation for ReadReplicaClustersTableNameUtil
> -
>
> Key: HBASE-20836
> URL: https://issues.apache.org/jira/browse/HBASE-20836
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: HBASE-18477
>
> Attachments: HBASE-20836.HBASE-18477.001.patch
>
>
> Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20836) Add Yetus annotation for ReadReplicaClustersTableNameUtil

2018-07-02 Thread Zach York (JIRA)
Zach York created HBASE-20836:
-

 Summary: Add Yetus annotation for ReadReplicaClustersTableNameUtil
 Key: HBASE-20836
 URL: https://issues.apache.org/jira/browse/HBASE-20836
 Project: HBase
  Issue Type: Sub-task
Affects Versions: HBASE-18477
Reporter: Zach York
Assignee: Zach York
 Fix For: HBASE-18477


Found via nightly builds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-07-02 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530432#comment-16530432
 ] 

Zach York commented on HBASE-18477:
---

[~yuzhih...@gmail.com] Thanks for pointing out. I will address that.

> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have unblocked HBase to run with a 
> root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and can be accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters to one root directory (sharing the data) to have 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster Read-Only (no metadata 
> operation or data operations).
> Separating the hbase:meta table for each cluster (Otherwise HBase gets 
> confused with multiple clusters trying to update the meta table with their ip 
> addresses)
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526983#comment-16526983
 ] 

Zach York commented on HBASE-20557:
---

Ah you probably mean this: 
[https://github.com/apache/hbase/blob/master/dev-support/checkcompatibility.py] 
I'll try that.

 

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526976#comment-16526976
 ] 

Zach York commented on HBASE-20557:
---

[~apurtell] I'm leaning towards not including this on branch-1.4 since this by 
default creates two threads for deletion (one for large and one for small 
HFiles) so there is no way to only have a single thread deleting anymore. What 
are your thoughts?

Also regarding your comment on api checker is this a tool I can run? I'm not 
familiar with it. (I looked through the plugins but didn't see one that jumped 
out immediately).

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526889#comment-16526889
 ] 

Zach York commented on HBASE-20557:
---

+1, reviewed on PR.

> Backport HBASE-17215 to branch-1
> 
>
> Key: HBASE-20557
> URL: https://issues.apache.org/jira/browse/HBASE-20557
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, master
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20557.branch-1.001.patch, 
> HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-17215 is the second patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-28 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16526844#comment-16526844
 ] 

Zach York commented on HBASE-20789:
---

[~openinx]
{quote}bq. If the existingBlock has nextBlockOnDiskSize set , while cachedItem 
has nextBlockOnDiskSize(default = -1) unset, the comparison should be positive 
number ? 
 So there is a typo ?
{quote}
No, cachedItem will be smaller in that case and so the comparison will be -1. I 
think this is why you were having difficulty getting the tests to pass. Please 
flip the '>' back to a '<'

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.6, 2.0.2
>
> Attachments: 
> 0001-HBASE-20789-TestBucketCache-testCacheBlockNextBlockM.patch, 
> HBASE-20789.v1.patch, HBASE-20789.v2.patch, bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-06-27 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525571#comment-16525571
 ] 

Zach York edited comment on HBASE-18840 at 6/27/18 8:38 PM:


Trying to rebase this patch on the latest master and it looks like this commit 
[1] removed the (quite helpful) mutateRegions method, but doesn't seem to get a 
reason for removing it. [~stack] or [~appy] do you have context on why it was 
removed and what the replacement is? Do I need to call add and delete 
separately now?

 

[1] 
https://github.com/apache/hbase/commit/8ec0aa0d709ced78331dd61d28c79f3433198227#diff-081750e39413c3b1930fc9952ed0d920L2081


was (Author: zyork):
Trying to rebase this patch on the latest master and it looks like this commit 
[1] removed the (quite helpful) mutateRegions method, but doesn't seem to get a 
reason for removing it. [~stack] or [~appy] do you have context on why it was 
removed and what the replacement is? Do I need to call add and delete 
separately now?

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-06-27 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525571#comment-16525571
 ] 

Zach York commented on HBASE-18840:
---

Trying to rebase this patch on the latest master and it looks like this commit 
[1] removed the (quite helpful) mutateRegions method, but doesn't seem to get a 
reason for removing it. [~stack] or [~appy] do you have context on why it was 
removed and what the replacement is? Do I need to call add and delete 
separately now?

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-27 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525381#comment-16525381
 ] 

Zach York commented on HBASE-20789:
---

[~openinx] Ah... I see that putIfAbsent now. I wonder why my testing didn't 
uncover that (I ran it many times. I must have just gotten lucky :) )

 

[~Apache9] caching an already cached block was present long before HBASE-20447 
(see [2]). It seems that to fix your memory leak case we need to add locking on 
the key. 

 

Does this need to be a putIfAbsent? What is the harm in replacing the key if it 
is in the ramCache and hasn't been persisted yet?

 

 [2] 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java#L440]

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Attachments: bucket-33718.out
>
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20799) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-27 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525321#comment-16525321
 ] 

Zach York commented on HBASE-20799:
---

[~apurtell] see HBASE-20789. This is already being tracked there. I'll take a 
look into that soon.

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20799
> URL: https://issues.apache.org/jira/browse/HBASE-20799
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.4.5
>Reporter: Andrew Purtell
>Priority: Major
>
> {noformat}
> [ERROR] testCacheBlockNextBlockMetadataMissing[1: blockSize=16,384, 
> bucketSizes=[I@29ee9faa](org.apache.hadoop.hbase.io.hfile.bucket.TestBucketCache)
>   Time elapsed: 0.066 s  <<< FAILURE!
> java.lang.AssertionError: expected: 
> java.nio.HeapByteBuffer but 
> was: java.nio.HeapByteBuffer
> at 
> org.apache.hadoop.hbase.io.hfile.bucket.TestBucketCache.testCacheBlockNextBlockMetadataMissing(TestBucketCache.java:424)
> {noformat}
> [~zyork] any idea what is going on here?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-18840) Add functionality to refresh meta table at master startup

2018-06-26 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18840:
--
Attachment: HBASE-18840.HBASE-18477.004.patch

> Add functionality to refresh meta table at master startup
> -
>
> Key: HBASE-18840
> URL: https://issues.apache.org/jira/browse/HBASE-18840
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-18840.HBASE-18477.001.patch, 
> HBASE-18840.HBASE-18477.002.patch, HBASE-18840.HBASE-18477.003 (2) (1).patch, 
> HBASE-18840.HBASE-18477.003 (2).patch, HBASE-18840.HBASE-18477.003.patch, 
> HBASE-18840.HBASE-18477.004.patch
>
>
> If a HBase cluster’s hbase:meta table is deleted or a cluster is started with 
> a new meta table, HBase needs the functionality to synchronize it’s metadata 
> from Storage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters

2018-06-26 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524199#comment-16524199
 ] 

Zach York commented on HBASE-18477:
---

[~busbey] I'm going to pick up this work again as I'd like to avoid long term 
code maintenance.

 

What are the remaining functionality/conceptual issues to be addressed?

 

Also I'm starting to think that it doesn't make sense for these features to be 
in a feature branch as none of them are being turned on by default and keeping 
them in a feature branch increases the code maintenance aspect of the feature 
(I'd like to spend more time actually improving it rather than rebasing :) ).

 

Thanks for everyone's reviews so far!

> Umbrella JIRA for HBase Read Replica clusters
> -
>
> Key: HBASE-18477
> URL: https://issues.apache.org/jira/browse/HBASE-18477
> Project: HBase
>  Issue Type: New Feature
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase 
> Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope 
> doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf
>
>
> Recently, changes (such as HBASE-17437) have unblocked HBase to run with a 
> root directory external to the cluster (such as in Amazon S3). This means 
> that the data is stored outside of the cluster and can be accessible after 
> the cluster has been terminated. One use case that is often asked about is 
> pointing multiple clusters to one root directory (sharing the data) to have 
> read resiliency in the case of a cluster failure.
>  
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a 
> read-replica HBase cluster that is pointed at the same root directory.
>  
> This requires making the Read-Replica cluster Read-Only (no metadata 
> operation or data operations).
> Separating the hbase:meta table for each cluster (Otherwise HBase gets 
> confused with multiple clusters trying to update the meta table with their ip 
> addresses)
> Adding refresh functionality for the meta table to ensure new metadata is 
> picked up on the read replica cluster.
> Adding refresh functionality for HFiles for a given table to ensure new data 
> is picked up on the read replica cluster.
>  
> This can be used with any existing cluster that is backed by an external 
> filesystem.
>  
> Please note that this feature is still quite manual (with the potential for 
> automation later).
>  
> More information on this particular feature can be found here: 
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-26 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524195#comment-16524195
 ] 

Zach York commented on HBASE-20789:
---

[~yuzhih...@gmail.com] None of those build links actually load for me...

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-06-26 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York resolved HBASE-20787.
---
Resolution: Fixed

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-06-26 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524187#comment-16524187
 ] 

Zach York commented on HBASE-20787:
---

Did a force push to clean the branch up.

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20789) TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky

2018-06-26 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524184#comment-16524184
 ] 

Zach York commented on HBASE-20789:
---

Sorry the comment wasn't updated, I think I had updated the comment locally, 
but it must not have been pushed out.

 

Basically there are 3 cases here:

equality (0) -> these blocks are exactly the same, no issue.

(-1) -> The existing block has nextBlockOnDiskSize set so we will get 
performance gains by keeping that version.

(1) -> The new block has nextBlockOnDiskSize set so it makes sense to cache the 
new version

 

Please let me know if anything is unclear, I can try to clear it up and I can 
try to improve this logging.

Where is the test failing? AFAIK there shouldn't be much flakiness in this 
test, but let's fix it if there is.

Thanks for digging in!

> TestBucketCache#testCacheBlockNextBlockMetadataMissing is flaky
> ---
>
> Key: HBASE-20789
> URL: https://issues.apache.org/jira/browse/HBASE-20789
> Project: HBase
>  Issue Type: Bug
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
>
> The UT failed frequently in our internal branch-2... Will dig into the UT.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-06-25 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522979#comment-16522979
 ] 

Zach York commented on HBASE-20787:
---

I will also remove the various commits/reverts of the initial patch to simplify 
things.

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-06-25 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20787:
--
Issue Type: Sub-task  (was: Task)
Parent: HBASE-18477

> Rebase the HBASE-18477 onto the current master to continue dev
> --
>
> Key: HBASE-20787
> URL: https://issues.apache.org/jira/browse/HBASE-20787
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: HBASE-18477
>Reporter: Zach York
>Assignee: Zach York
>Priority: Minor
> Fix For: HBASE-18477
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20787) Rebase the HBASE-18477 onto the current master to continue dev

2018-06-25 Thread Zach York (JIRA)
Zach York created HBASE-20787:
-

 Summary: Rebase the HBASE-18477 onto the current master to 
continue dev
 Key: HBASE-20787
 URL: https://issues.apache.org/jira/browse/HBASE-20787
 Project: HBase
  Issue Type: Task
Affects Versions: HBASE-18477
Reporter: Zach York
Assignee: Zach York
 Fix For: HBASE-18477






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-21 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York reassigned HBASE-20734:
-

Assignee: Zach York

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0
>
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-15 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514190#comment-16514190
 ] 

Zach York commented on HBASE-20734:
---

I looked into the code for this and the challenge is that Region has no concept 
of walFS and the regionDir is determined from the HRegionFileSystem... I'll 
continue to look into how we can do this. Hopefully without changing the Region 
contract.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) Custom hbase.wal.dir results in dataloss because we write recovered edits into a different place than where the recovering region server looks for them.

2018-06-15 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514187#comment-16514187
 ] 

Zach York commented on HBASE-20723:
---

+1 on the patch once the checkstyle is fixed. I'll push tonight if nobody 
objects.

> Custom hbase.wal.dir results in dataloss because we write recovered edits 
> into a different place than where the recovering region server looks for them.
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>
> Description:
> When custom hbase.wal.dir is configured the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes a loss of writes 
> that had not flushed prior to loss of a server.
>  
> Reproduction:
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
>  
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in dataloss because we write recovered edits into a different place than where the recovering region server looks for them.

2018-06-15 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20723:
--
Description: 
Description:

When custom hbase.wal.dir is configured the recovery system uses it in place of 
the HBase root dir and thus constructs an incorrect path for recovered edits 
when splitting WALs. This causes the recovery code in Region Servers to believe 
there are no recovered edits to replay, which causes a loss of writes that had 
not flushed prior to loss of a server.

 

Reproduction:

This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
1.1.2.2.6.3.2-14 

By default the underlying data is going to wasb://x@y/hbase 
 I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
/mnt.

hbase.wal.dir= hdfs://mycluster/walontest

hbase.wal.dir.perms=700

hbase.rootdir.perms=700

hbase.rootdir= 
wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase

Procedure to reproduce this issue:

1. create a table in hbase shell

2. insert a row in hbase shell

3. reboot the VM which hosts that region

4. scan the table in hbase shell and it is empty

Looking at the region server logs:
{code:java}
2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
wal.WALSplitter: This region's directory doesn't exist: 
hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
It is very likely that it was already split so it's safe to discard those edits.

{code}
The log split/replay ignored actual WAL due to WALSplitter is looking for the 
region directory in the hbase.wal.dir we specified rather than the 
hbase.rootdir.

Looking at the source code,
 
[https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java]
 it uses the rootDir, which is walDir, as the tableDir root path.

So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
even in different filesystem, then the #5 uses walDir as tableDir is apparently 
wrong.

CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.

  was:
This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
1.1.2.2.6.3.2-14 

By default the underlying data is going to wasb://x@y/hbase 
 I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
/mnt.

hbase.wal.dir= hdfs://mycluster/walontest

hbase.wal.dir.perms=700

hbase.rootdir.perms=700

hbase.rootdir= 
wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase

Procedure to reproduce this issue:

1. create a table in hbase shell

2. insert a row in hbase shell

3. reboot the VM which hosts that region

4. scan the table in hbase shell and it is empty

Looking at the region server logs:
{code:java}
2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
wal.WALSplitter: This region's directory doesn't exist: 
hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
It is very likely that it was already split so it's safe to discard those edits.

{code}
The log split/replay ignored actual WAL due to WALSplitter is looking for the 
region directory in the hbase.wal.dir we specified rather than the 
hbase.rootdir.

Looking at the source code,
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
 it uses the rootDir, which is walDir, as the tableDir root path.

So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
even in different filesystem, then the #5 uses walDir as tableDir is apparently 
wrong.

CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.


> Custom hbase.wal.dir results in dataloss because we write recovered edits 
> into a different place than where the recovering region server looks for them.
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>
> Description:
> When custom hbase.wal.dir is configured the recovery system uses it in place 
> of the HBase root dir and thus constructs an incorrect path for recovered 
> edits when splitting WALs. This causes the recovery code in Region Servers to 
> believe there are no recovered edits to replay, which causes 

[jira] [Updated] (HBASE-20723) Custom hbase.wal.dir results in dataloss because we write recovered edits into a different place than where the recovering region server looks for them.

2018-06-15 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20723:
--
Summary: Custom hbase.wal.dir results in dataloss because we write 
recovered edits into a different place than where the recovering region server 
looks for them.  (was: WALSplitter uses the rootDir, which is walDir, as the 
recovered edits root path)

> Custom hbase.wal.dir results in dataloss because we write recovered edits 
> into a different place than where the recovering region server looks for them.
> 
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the recovered edits root path

2018-06-15 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514172#comment-16514172
 ] 

Zach York commented on HBASE-20723:
---

Thanks for the summary [~busbey], one minor tweak:
{quote}which causes a loss writes that had not flushed prior to loss of a 
server.
{quote}
which causes a loss of writes that had not flushed prior to loss of a server.

 

[~elserj] I'll make a comment on the vote thread, but I do agree with your 
sentiment. Andrew has been doing good work with keeping the releases regular :)

> WALSplitter uses the rootDir, which is walDir, as the recovered edits root 
> path
> ---
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: Recovery, wal
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Critical
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, 
> 20723.v9.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513169#comment-16513169
 ] 

Zach York commented on HBASE-20734:
---

Before HBASE-20723 goes in, there is no chance of this happening, right? 
Currently it will fail if hbase.wal.dir is set to anything but the default. We 
could remove the headache of BC if we fixed this right the first time.

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513113#comment-16513113
 ] 

Zach York commented on HBASE-20723:
---

Removed 1.1.2 from affects since the backport isn't in the public repo. This 
affects 1.4.0+

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20723:
--
Affects Version/s: (was: 1.1.2)
   1.4.0
   1.4.1
   1.4.2
   1.4.3
   1.4.4
   2.0.0

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 2.0.0
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513107#comment-16513107
 ] 

Zach York commented on HBASE-20723:
---

[~yuzhih...@gmail.com] The patch looks good to me. Let's see what QA says.

 

[~elserj] Do you think this is enough to reroll a 1.4.5 RC? It isn't a default 
config, but still quite serious for those that set this config.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, 20723.v8.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20734) Colocate recovered edits directory with hbase.wal.dir

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513097#comment-16513097
 ] 

Zach York commented on HBASE-20734:
---

doesn't split log only run before region opening so shouldn't rolling upgrade 
work? Or is there a case where the log is split, not applied, and tries to 
split/recover again?

> Colocate recovered edits directory with hbase.wal.dir
> -
>
> Key: HBASE-20734
> URL: https://issues.apache.org/jira/browse/HBASE-20734
> Project: HBase
>  Issue Type: Improvement
>  Components: MTTR, Recovery, wal
>Reporter: Ted Yu
>Priority: Major
>
> During investigation of HBASE-20723, I realized that we wouldn't get the best 
> performance when hbase.wal.dir is configured to be on different (fast) media 
> than hbase rootdir w.r.t. recovered edits since recovered edits directory is 
> currently under rootdir.
> Such setup may not result in fast recovery when there is region server 
> failover.
> This issue is to find proper (hopefully backward compatible) way in 
> colocating recovered edits directory with hbase.wal.dir .



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513067#comment-16513067
 ] 

Zach York commented on HBASE-20723:
---

Okay, let's do the quick fix here and look to HBASE-20734 for the longer term 
solution. In regards to your existing code, can you change all other 
occurrences of rootDir (where it means walDir) to walDir to avoid confusion in 
the mean time?

 

Let me start working on seeing if moving recovered edits to the WAL dir fixes 
HBASE-20734.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513048#comment-16513048
 ] 

Zach York commented on HBASE-20723:
---

It's great that we have these JMX values, but unless an admin happens to look 
at these, there isn't anything calling out a potential issue. If we log the 
replay_num_ops for each recovery attempt, then maybe it would be useful.
{quote}Once this logic error is fixed, I am not aware of other scenario where 
the message should be WARN.
{quote}
If only software development were that simple :)... There is no guarantee that 
this can't break again (obviously we will do our best) or a different edge case 
will break something like this. Unless there is a way to check with 100% 
certainty that this is expected behavior, this log line is still useful. Though 
I would feel better about leaving it at info if it were possible to see from 
some other log line that things might not be operating correctly. It seems too 
many assumptions are being made about the recovered edits directory.

 

Related to the actual change...

I've been thinking about this a bit and why does it make sense for WALs to be 
under hbase.wal.dir, but for recovered.edits (basically the split log) to be 
under the root directory. It seems to me that both should be under the 
hbase.wal.dir.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, 20723.v7.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-14 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16513019#comment-16513019
 ] 

Zach York commented on HBASE-20723:
---

[~yuzhih...@gmail.com] I did reproduce the issue on my side as well. Let me 
review your patch as well.

 

I think going forward we need two things to prevent something like this 
happening again:
 # Tests that utilize hbase.wal.dir (on a different FS and path) to validate 
that edits are able to be replayed and logs are split from a user level (put, 
kill RS, restart RS, check to ensure edit is present in table).
 # Improve on this log messaging around here. There should be some indication 
of the number of records replayed or something as the current logging is easy 
to miss... Considering this log means that edits won't be applied for that 
region, this should at the very least be a WARN to indicate something 
potentially wrong happened.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Assignee: Ted Yu
>Priority: Major
> Attachments: 20723.v1.txt, 20723.v2.txt, 20723.v3.txt, 20723.v4.txt, 
> 20723.v5.txt, 20723.v5.txt, 20723.v6.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511807#comment-16511807
 ] 

Zach York commented on HBASE-20723:
---

[~yuzhih...@gmail.com] I think I'm starting to see your point. Let me do a few 
tests tomorrow to confirm.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Priority: Major
> Attachments: 20723.v1.txt, logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511758#comment-16511758
 ] 

Zach York commented on HBASE-20723:
---

[~taklwu] Yes hbase.wal.dir is set in your experiments. 

 

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Priority: Major
> Attachments: logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-13 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511647#comment-16511647
 ] 

Zach York commented on HBASE-20723:
---

[~busbey] That's what I expected, but Stephen's experiment seems to prove 
otherwise. In both reproduction steps, the HDFS datanodes aren't actually going 
away (just the RS), so I think we can rule out the replication factor.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Priority: Major
> Attachments: logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
>  I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code:java}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20723) WALSplitter uses the rootDir, which is walDir, as the tableDir root path.

2018-06-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510654#comment-16510654
 ] 

Zach York commented on HBASE-20723:
---

[~yuzhih...@gmail.com] you're welcome to try that change, but as you can see 
from the log, it is already looking in the walDir. (rootdir == walDir here).

 

[~rpednekar] The WALSplitter is tasked with splitting logs (WALs). Why wouldn't 
it be looking in the hbase.wal.dir?

>From my understanding, the recovered edits should be in:
hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648/recovered.edits
However, that directory doesn't exist...

The one thing that one of my colleagues figured out recently is that edits 
aren't actually persisted to the WAL until they either reach a certain size or 
a time limit has elapsed that triggers the hsync() or hflush(). Since the VM 
didn't exit correctly, I'm assuming this is what happened. Can you try loading 
more data in (still under the flush size/interval), but enough to cause a hsync 
to the WAL file and see if you have the same issue?

 

[~stack] You mentioned you also ran into this issue... Can you provide any more 
info on your reproduction?

 

As [~apurtell] mentioned on the original JIRA, we tested this thoroughly when 
making the original change and have had many customers run with this setting 
without issue. It's possible that the patch was backported incorrectly to the 
Azure version, but it seems like this might be expected behavior when the 
number of writes are below the threshold required to sync/flush to the WAL file 
stream.

> WALSplitter uses the rootDir, which is walDir, as the tableDir root path.
> -
>
> Key: HBASE-20723
> URL: https://issues.apache.org/jira/browse/HBASE-20723
> Project: HBase
>  Issue Type: Bug
>  Components: hbase
>Affects Versions: 1.1.2
>Reporter: Rohan Pednekar
>Priority: Major
> Attachments: logs.zip
>
>
> This is an Azure HDInsight HBase cluster with HDP 2.6. and HBase 
> 1.1.2.2.6.3.2-14 
> By default the underlying data is going to wasb://x@y/hbase 
> I tried to move WAL folders to HDFS, which is the SSD mounted on each VM at 
> /mnt.
> hbase.wal.dir= hdfs://mycluster/walontest
> hbase.wal.dir.perms=700
> hbase.rootdir.perms=700
> hbase.rootdir= 
> wasb://XYZ[@hbaseperf.core.net|mailto:duohbase5ds...@duohbaseperf.blob.core.windows.net]/hbase
> Procedure to reproduce this issue:
> 1. create a table in hbase shell
> 2. insert a row in hbase shell
> 3. reboot the VM which hosts that region
> 4. scan the table in hbase shell and it is empty
> Looking at the region server logs:
> {code}
> 2018-06-12 22:08:40,455 INFO  [RS_LOG_REPLAY_OPS-wn2-duohba:16020-0-Writer-1] 
> wal.WALSplitter: This region's directory doesn't exist: 
> hdfs://mycluster/walontest/data/default/tb1/b7fd7db5694eb71190955292b3ff7648. 
> It is very likely that it was already split so it's safe to discard those 
> edits.
> {code}
> The log split/replay ignored actual WAL due to WALSplitter is looking for the 
> region directory in the hbase.wal.dir we specified rather than the 
> hbase.rootdir.
> Looking at the source code,
>  
> [https://github.com/hortonworks/hbase-release/blob/HDP-2.6.3.20-tag/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java#L519]
>  it uses the rootDir, which is walDir, as the tableDir root path.
> So if we use HBASE-17437, waldir and hbase rootdir are in different path or 
> even in different filesystem, then the #5 uses walDir as tableDir is 
> apparently wrong.
> CC: [~zyork], [~yuzhih...@gmail.com] Attached the logs for quick review.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17437) Support specifying a WAL directory outside of the root directory

2018-06-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510471#comment-16510471
 ] 

Zach York commented on HBASE-17437:
---

Looking briefly at the code it seems that rootdir should be renamed to walDir 
(to more accurately describe what it does), but the code looks correct to me. 
WalDir is being passed in as rootDir ever place WALSplitter is getting 
initialized.

> Support specifying a WAL directory outside of the root directory
> 
>
> Key: HBASE-17437
> URL: https://issues.apache.org/jira/browse/HBASE-17437
> Project: HBase
>  Issue Type: Improvement
>  Components: Filesystem Integration, wal
>Affects Versions: 1.2.4
>Reporter: Yishan Yang
>Assignee: Zach York
>Priority: Major
>  Labels: patch
> Fix For: 1.4.0, 2.0.0
>
> Attachments: HBASE-17437.branch-1.001.patch, 
> HBASE-17437.branch-1.002.patch, HBASE-17437.branch-1.003.patch, 
> HBASE-17437.branch-1.004.patch, HBASE-17437.master.001.patch, 
> HBASE-17437.master.002.patch, HBASE-17437.master.003.patch, 
> HBASE-17437.master.004.patch, HBASE-17437.master.005.patch, 
> HBASE-17437.master.006.patch, HBASE-17437.master.007.patch, 
> HBASE-17437.master.008.patch, HBASE-17437.master.009.patch, 
> HBASE-17437.master.010.patch, HBASE-17437.master.011.patch, 
> HBASE-17437.master.012.patch, hbase-17437-branch-1.2.patch, 
> hbase-17437-master.patch
>
>
> Currently, the WAL and the StoreFiles need to be on the same FileSystem. Some 
> FileSystems (such as Amazon S3) don’t support append or consistent writes. 
> These two properties are imperative for the WAL in order to avoid loss of 
> writes. However, StoreFiles don’t necessarily need the same consistency 
> guarantees (since writes are cached locally and if writes fail, they can 
> always be replayed from the WAL).
>  
> This JIRA aims to allow users to configure a log directory (for WALs) that is 
> outside of the root directory or even in a different FileSystem. The default 
> value will still put the log directory under the root directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17437) Support specifying a WAL directory outside of the root directory

2018-06-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510454#comment-16510454
 ] 

Zach York commented on HBASE-17437:
---

Yeah please open a new issue. Feel free to assign it to me and I can take a 
look. It's hard to judge what is happening just with the single log line. 
Please also specify which branch you were running with.

> Support specifying a WAL directory outside of the root directory
> 
>
> Key: HBASE-17437
> URL: https://issues.apache.org/jira/browse/HBASE-17437
> Project: HBase
>  Issue Type: Improvement
>  Components: Filesystem Integration, wal
>Affects Versions: 1.2.4
>Reporter: Yishan Yang
>Assignee: Zach York
>Priority: Major
>  Labels: patch
> Fix For: 1.4.0, 2.0.0
>
> Attachments: HBASE-17437.branch-1.001.patch, 
> HBASE-17437.branch-1.002.patch, HBASE-17437.branch-1.003.patch, 
> HBASE-17437.branch-1.004.patch, HBASE-17437.master.001.patch, 
> HBASE-17437.master.002.patch, HBASE-17437.master.003.patch, 
> HBASE-17437.master.004.patch, HBASE-17437.master.005.patch, 
> HBASE-17437.master.006.patch, HBASE-17437.master.007.patch, 
> HBASE-17437.master.008.patch, HBASE-17437.master.009.patch, 
> HBASE-17437.master.010.patch, HBASE-17437.master.011.patch, 
> HBASE-17437.master.012.patch, hbase-17437-branch-1.2.patch, 
> hbase-17437-master.patch
>
>
> Currently, the WAL and the StoreFiles need to be on the same FileSystem. Some 
> FileSystems (such as Amazon S3) don’t support append or consistent writes. 
> These two properties are imperative for the WAL in order to avoid loss of 
> writes. However, StoreFiles don’t necessarily need the same consistency 
> guarantees (since writes are cached locally and if writes fail, they can 
> always be replayed from the WAL).
>  
> This JIRA aims to allow users to configure a log directory (for WALs) that is 
> outside of the root directory or even in a different FileSystem. The default 
> value will still put the log directory under the root directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-06-12 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510084#comment-16510084
 ] 

Zach York commented on HBASE-20556:
---

Pushed to branch-1 and branch-1.4. I reran the modified tests to ensure the 
patch was working as expected.

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
> Attachments: HBASE-20556.branch-1.001.patch, 
> HBASE-20556.branch-1.002.patch, HBASE-20556.branch-1.003.patch, 
> HBASE-20556.branch-1.004.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-06-12 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20556:
--
   Resolution: Fixed
Fix Version/s: 1.4.6
   1.5.0
   Status: Resolved  (was: Patch Available)

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Fix For: 1.5.0, 1.4.6
>
> Attachments: HBASE-20556.branch-1.001.patch, 
> HBASE-20556.branch-1.002.patch, HBASE-20556.branch-1.003.patch, 
> HBASE-20556.branch-1.004.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20555) Backport HBASE-18083 and related changes in branch-1

2018-05-31 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497235#comment-16497235
 ] 

Zach York commented on HBASE-20555:
---

Thanks! I'll make sure to keep the default comment in mind while reviewing.

> Backport HBASE-18083 and related changes in branch-1
> 
>
> Key: HBASE-20555
> URL: https://issues.apache.org/jira/browse/HBASE-20555
> Project: HBase
>  Issue Type: Umbrella
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
>
> This will be the umbrella JIRA for backporting HBASE-18083 of `Make 
> large/small file clean thread number configurable in HFileCleaner` from 
> HBase's branch-2 to HBase's branch-1 that will needs a total of 4 sub-tasks 
> that backport HBASE-16490, HBASE-17215, HBASE-17854 and then HBASE-18083
> The goal is to improve HFile cleaning performance that has been introduced in 
> branch-2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20555) Backport HBASE-18083 and related changes in branch-1

2018-05-31 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497189#comment-16497189
 ] 

Zach York commented on HBASE-20555:
---

[~apurtell] Any concerns with any of these backports being applied to 
branch-1.4? I will hold off on committing to branch-1.4 until I hear from you.

> Backport HBASE-18083 and related changes in branch-1
> 
>
> Key: HBASE-20555
> URL: https://issues.apache.org/jira/browse/HBASE-20555
> Project: HBase
>  Issue Type: Umbrella
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
>
> This will be the umbrella JIRA for backporting HBASE-18083 of `Make 
> large/small file clean thread number configurable in HFileCleaner` from 
> HBase's branch-2 to HBase's branch-1 that will needs a total of 4 sub-tasks 
> that backport HBASE-16490, HBASE-17215, HBASE-17854 and then HBASE-18083
> The goal is to improve HFile cleaning performance that has been introduced in 
> branch-2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20665) "Already cached block XXX" message should be DEBUG

2018-05-31 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16496949#comment-16496949
 ] 

Zach York commented on HBASE-20665:
---

I swore I fixed this in a recent commit, but I guess I must have missed it in 
the latest version. Thanks for calling this out.

> "Already cached block XXX" message should be DEBUG
> --
>
> Key: HBASE-20665
> URL: https://issues.apache.org/jira/browse/HBASE-20665
> Project: HBase
>  Issue Type: Task
>  Components: BlockCache
>Affects Versions: 1.2.0, 2.0.0
>Reporter: Sean Busbey
>Priority: Minor
>  Labels: beginner
> Fix For: 3.0.0, 2.1.0, 1.5.0
>
>
> Testing a local cluster that relies on the LruBlockCache for a scan-heavy 
> workload and I'm getting a bunch of log entries at WARN
> {code}
> 2018-05-30 12:28:47,192 WARN org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
> Cached an already cached block: df01f5bf6a244f6bb1a626b927377fff_54780812 
> cb:df01f5bf6a244f6bb1a626b927377fff_54780812. This is harmless and can happen 
> in rare cases (see HBASE-8547)
> {code}
> As the log message notes (and the code confirms) this is a harmless result of 
> contention for getting a block into the CHM. the message should be at DEBUG.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-05-25 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491358#comment-16491358
 ] 

Zach York commented on HBASE-20556:
---

+1 I will commit if nobody has anymore comments.

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20556.branch-1.001.patch, 
> HBASE-20556.branch-1.002.patch, HBASE-20556.branch-1.003.patch, 
> HBASE-20556.branch-1.004.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20608) Remove build option of error prone profile for branch-1 after HBASE-12350

2018-05-22 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484722#comment-16484722
 ] 

Zach York commented on HBASE-20608:
---

LGTM Andrew, also thanks for pointing out where these configs are kept!

Also, noted on your comments about automated testing. In the future I will run 
the tests myself.

> Remove build option of error prone profile for branch-1 after HBASE-12350
> -
>
> Key: HBASE-20608
> URL: https://issues.apache.org/jira/browse/HBASE-20608
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Mike Drob
>Priority: Major
>
> After HBASE-12350, error prone profile was introduced/backported to branch-1 
> and branch-2. However, branch-1 is still building with JDK 7 and is 
> incompatible with this error prone profile such that `mvn test-compile` 
> failed since then. 
> Open this issue to track the removal of `-PerrorProne` in the build command 
> (in Jenkins)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20608) Remove build option of error prone profile for branch-1 after HBASE-12350

2018-05-22 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484678#comment-16484678
 ] 

Zach York commented on HBASE-20608:
---

Can we remove/revert this change in branch-1 until we come up with the long 
term solution? This is blocking precommit in branch-1 and I'm uncomfortable 
committing with precommit in this state.

Also please let me know if I can be of any help in fixing this. Unfortunately, 
I don't have much knowledge of HBase's Jenkins/Yetus setup, but could 
potentially learn with some help.

> Remove build option of error prone profile for branch-1 after HBASE-12350
> -
>
> Key: HBASE-20608
> URL: https://issues.apache.org/jira/browse/HBASE-20608
> Project: HBase
>  Issue Type: Task
>  Components: build
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Mike Drob
>Priority: Major
>
> After HBASE-12350, error prone profile was introduced/backported to branch-1 
> and branch-2. However, branch-1 is still building with JDK 7 and is 
> incompatible with this error prone profile such that `mvn test-compile` 
> failed since then. 
> Open this issue to track the removal of `-PerrorProne` in the build command 
> (in Jenkins)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-05-22 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484340#comment-16484340
 ] 

Zach York commented on HBASE-20556:
---

[~mdrob] and [~busbey] do we want *HBASE-20608* to go in before this so we can 
ensure a clean run or are we okay with ignoring the failures? It appears that 
this has been failing for some time.

Also, I'm not sure if either of you have context on the unit test 'failures', 
but it looks like the tests hit a hard limit on memory (since I doubt we are 
calling system.exit() in our test code). Is there something we can do to fix 
that (also likely not related to this change)? Is memory something we control 
at the test level or overall test execution level?

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20556.branch-1.001.patch, 
> HBASE-20556.branch-1.002.patch, HBASE-20556.branch-1.003.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-05-16 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477728#comment-16477728
 ] 

Zach York commented on HBASE-20556:
---

[~taklwu] can you retrigger the unit tests (reattach the patch)? It looks like 
there was a surefire error that caused the failure.

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20556.branch-1.002.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20556) Backport HBASE-16490 to branch-1

2018-05-10 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20556:
--
Status: Patch Available  (was: Open)

> Backport HBASE-16490 to branch-1
> 
>
> Key: HBASE-20556
> URL: https://issues.apache.org/jira/browse/HBASE-20556
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, snapshots
>Affects Versions: 1.4.4, 1.4.5
>Reporter: Tak Lon (Stephen) Wu
>Assignee: Tak Lon (Stephen) Wu
>Priority: Major
> Attachments: HBASE-20556.branch-1.002.patch
>
>
> As part of HBASE-20555, HBASE-16490 is the first patch that is needed for 
> backporting HBASE-18083



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-05-10 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20447:
--
Attachment: HBASE-20447.master.004.patch

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, 
> HBASE-20447.branch-1.006.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch, HBASE-20447.master.003.patch, 
> HBASE-20447.master.004.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-05-09 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20447:
--
Attachment: HBASE-20447.master.003.patch

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, 
> HBASE-20447.branch-1.006.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch, HBASE-20447.master.003.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-05-09 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20447:
--
Attachment: HBASE-20447.branch-1.006.patch

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, 
> HBASE-20447.branch-1.006.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20204) Add locking to RefreshFileConnections in BucketCache

2018-05-09 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20204:
--
Fix Version/s: 1.4.5
   1.5.0
   2.1.0
   3.0.0

> Add locking to RefreshFileConnections in BucketCache
> 
>
> Key: HBASE-20204
> URL: https://issues.apache.org/jira/browse/HBASE-20204
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.5
>
> Attachments: HBASE-20204.master.001.patch, 
> HBASE-20204.master.002.patch, HBASE-20204.master.003.patch, 
> HBASE-20204.master.004.patch
>
>
> This is a follow-up to HBASE-20141 where [~anoop.hbase] suggested adding 
> locking for refreshing channels.
> I have also seen this become an issue when a RS has to abort and it locks on 
> trying to flush out the remaining data to the cache (since cache on write was 
> turned on).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20204) Add locking to RefreshFileConnections in BucketCache

2018-05-09 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20204:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Add locking to RefreshFileConnections in BucketCache
> 
>
> Key: HBASE-20204
> URL: https://issues.apache.org/jira/browse/HBASE-20204
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.4.5
>
> Attachments: HBASE-20204.master.001.patch, 
> HBASE-20204.master.002.patch, HBASE-20204.master.003.patch, 
> HBASE-20204.master.004.patch
>
>
> This is a follow-up to HBASE-20141 where [~anoop.hbase] suggested adding 
> locking for refreshing channels.
> I have also seen this become an issue when a RS has to abort and it locks on 
> trying to flush out the remaining data to the cache (since cache on write was 
> turned on).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20204) Add locking to RefreshFileConnections in BucketCache

2018-05-09 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469542#comment-16469542
 ] 

Zach York commented on HBASE-20204:
---

Pushed to master, branch-2, branch-1, branch-1.4.

I didn't push to branch-2.0 because of [~stack]'s email saying to refrain from 
pushing to 2.0 (though this is a fairly small bug fix).

> Add locking to RefreshFileConnections in BucketCache
> 
>
> Key: HBASE-20204
> URL: https://issues.apache.org/jira/browse/HBASE-20204
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-20204.master.001.patch, 
> HBASE-20204.master.002.patch, HBASE-20204.master.003.patch, 
> HBASE-20204.master.004.patch
>
>
> This is a follow-up to HBASE-20141 where [~anoop.hbase] suggested adding 
> locking for refreshing channels.
> I have also seen this become an issue when a RS has to abort and it locks on 
> trying to flush out the remaining data to the cache (since cache on write was 
> turned on).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20204) Add locking to RefreshFileConnections in BucketCache

2018-05-08 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20204:
--
Attachment: HBASE-20204.master.004.patch

> Add locking to RefreshFileConnections in BucketCache
> 
>
> Key: HBASE-20204
> URL: https://issues.apache.org/jira/browse/HBASE-20204
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Attachments: HBASE-20204.master.001.patch, 
> HBASE-20204.master.002.patch, HBASE-20204.master.003.patch, 
> HBASE-20204.master.004.patch
>
>
> This is a follow-up to HBASE-20141 where [~anoop.hbase] suggested adding 
> locking for refreshing channels.
> I have also seen this become an issue when a RS has to abort and it locks on 
> trying to flush out the remaining data to the cache (since cache on write was 
> turned on).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-04-25 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20447:
--
Attachment: HBASE-20447.branch-1.005.patch

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.branch-1.005.patch, 
> HBASE-20447.master.001.patch, HBASE-20447.master.002.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-04-25 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453171#comment-16453171
 ] 

Zach York commented on HBASE-20447:
---

I have attached a new patch that fixes the master patch. I figured out that it 
was an incorrect forward port w.r.t returnBlock nothing broken with shared 
memory [~anoop.hbase] (sorry for pulling you in on that).

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-04-25 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-20447:
--
Attachment: HBASE-20447.master.002.patch

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.master.001.patch, 
> HBASE-20447.master.002.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20447) Only fail cacheBlock if block collisions aren't related to next block metadata

2018-04-24 Thread Zach York (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451373#comment-16451373
 ] 

Zach York commented on HBASE-20447:
---

TestBucketCache passes locally. Reattaching to retry.

> Only fail cacheBlock if block collisions aren't related to next block metadata
> --
>
> Key: HBASE-20447
> URL: https://issues.apache.org/jira/browse/HBASE-20447
> Project: HBase
>  Issue Type: Bug
>  Components: BlockCache, BucketCache
>Affects Versions: 1.4.3, 2.0.0
>Reporter: Zach York
>Assignee: Zach York
>Priority: Major
> Fix For: 3.0.0, 2.1.0, 1.5.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20447.branch-1.001.patch, 
> HBASE-20447.branch-1.002.patch, HBASE-20447.branch-1.003.patch, 
> HBASE-20447.branch-1.004.patch, HBASE-20447.master.001.patch
>
>
> This is the issue I was originally having here: 
> [http://mail-archives.apache.org/mod_mbox/hbase-dev/201802.mbox/%3CCAN+qs_Pav=md_aoj4xji+kcnetubg2xou2ntxv1g6m8-5vn...@mail.gmail.com%3E]
>  
> When we pread, we don't force the read to read all of the next block header.
> However, when we get into a race condition where two opener threads try to
> cache the same block and one thread read all of the next block header and the 
> other one didn't, it will fail the open process. This is especially important
> in a splitting case where it will potentially fail the split process.
> Instead, in the caches, we should only fail if the required blocks are 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


<    1   2   3   4   5   6   7   8   >