[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2021-10-13 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-941241199


   Merged. If we need new documentation, we can open a follow-up JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2021-09-09 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-916299365


   @joshelser should we wait until the documentation is added to this commit?






[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2021-01-14 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-760640589


   I have fixed the conflicts and will probably push in two days; let me know 
if you guys have any comments.







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-12-28 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-751833447


   Thanks Duo for your understanding and comments ;) and happy holidays!
   
   @z-york your point on idempotency is good, but at this point would you agree 
that we first add the exception here and discuss/handle the data loss issue in 
a follow-up JIRA? Would you reconsider your -1 vote? (By the way, I can rebase 
once we agree on this change.)







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-12-01 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-736772662


   >  Is clumsy operator deleting the meta location znode by mistake a valid 
failure mode ?
   
   No, this is a special case that we have been supporting, where the HBase 
cluster restarts fresh on top of only flushed HFiles, with no WAL or ZK data. 
We admit that this differs from the community standpoint that both the WAL and 
the ZK data must already exist when the master and/or RSs start on existing 
HFiles, so that they can resume the state left by any procedures.
   
   > What about adding extra step before assign where we wait asking Master a 
question about the cluster state such as if any of the RSs that are checking in 
have Regions on them; i.e. if Regions already assigned, if an already 'up' 
cluster? Would that help?
   
   Having an extra step to check whether any RS has regions assigned may help, 
but I don't know if we can do that before the server manager finds any region 
server online.
   
   > You fellows don't want to have to run a script beforehand? ZK is up and 
just put an empty location up or ask Master or hbck2 to do it for you? 
   
   I think HBCK/HBCK2 performs online repair, and we have a few concerns with 
that:
   1. If the master is not up and running, we cannot proceed.
   2. Even if the master is up, repairing hundreds or thousands of regions 
implies a long scan, which IMO we can avoid by just reloading from the existing 
meta.
   3. Additional steps/scripts to start an HBase cluster in the cloud use case 
mentioned above are a manual/semi-automated step, and we don't see a good place 
to host and maintain them.
   
   Personally, I'm fine with throwing an exception as Duo suggested; on our 
side we need to find a way to continue if we see this exception. We can then 
improve it in the future, when we need to get rid of the extra hbck step 
completely.
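   
   A minimal sketch of the "throw instead of delete" behaviour, to make the 
idea concrete. Everything here is hypothetical: `MetaBootstrapGuard` and 
`checkBeforeBootstrap` are illustrative names, `java.nio.file` on the local 
disk stands in for the cluster filesystem, and `IllegalStateException` is only 
a placeholder for whatever exception type the change ends up using.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical guard, not HBase code: refuse to bootstrap over an existing
// meta directory instead of silently deleting it.
final class MetaBootstrapGuard {
  static void checkBeforeBootstrap(Path metaDir) {
    if (Files.exists(metaDir)) {
      // Fail hard and leave the directory untouched for the operator.
      throw new IllegalStateException(
          "Meta directory already exists, refusing to delete it: " + metaDir);
    }
  }

  public static void main(String[] args) {
    // The JVM temp directory is used only as a path that certainly exists.
    Path existing = Paths.get(System.getProperty("java.io.tmpdir"));
    try {
      checkBeforeBootstrap(existing);
    } catch (IllegalStateException e) {
      System.out.println("bootstrap aborted: " + e.getMessage());
    }
  }
}
```

   With a guard like this, the caller decides how to continue (for example by 
running a repair step) rather than the bootstrap deciding to discard data.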
   
   So, for this PR, if we don't hear any other critical suggestions, maybe I 
will close it as unresolved. Do you guys agree?







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-24 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-679418281


   So, how can we reach consensus on this PR, or on this series of idempotency 
issues (for InitMetaProcedure)? I don't mind breaking them into more tasks, as 
Zach has created the follow-up bugs (HBASE-24922 and HBASE-24923), but I don't 
see clear agreement on whether we should continue the bootstrap or fail hard 
when we find the meta table in InitMetaProcedure.
   
   How do we move forward?







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-14 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-674265891


   I apologize for the dev@ email; I thought about your suggestion differently 
overnight (sorry, I had to reread it many times before I found the gap this 
morning).
   
   > You can see the code in finishActiveMasterInitialization and also the 
code in AssignmentManager.start. In AssignmentManager.start, we will try to 
load the state of meta region from zookeeper, and if there is none, we will 
create a region node in offline state. And in finishActiveMasterInitialization, 
if we find that the state of meta region node is offline, we will schedule 
InitMetaProcedure.
   > So what you need to do here, is to put the meta region znode to zookeeper, 
before you restart the hbase cluster. So we will not schedule InitMetaProcedure 
again.
   
   
   Doesn't the upcoming master region, which stores the meta location (see 
[HBASE-24408](https://issues.apache.org/jira/browse/HBASE-24408) and 
[PR#1746](https://github.com/apache/hbase/pull/1746/commits/976d0c4e5b732a23773bd306f79e8017344b58f3)),
 resolve our disagreement, since we would no longer need to rely on ZK to get 
the server name (old host) of the meta region? Then, even without ZK, we can 
move on without submitting InitMetaProcedure, because the state of the meta 
region is not `OFFLINE`.
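   
   A minimal sketch of the startup flow described in the quote above, as I 
understand it. The types are hypothetical stand-ins, not HBase classes: 
`MetaState`, `loadMetaState` and `shouldScheduleInitMeta` only model the two 
decisions (default to offline when nothing is persisted, schedule the bootstrap 
only when meta is offline), and the `Optional` abstracts over where the state 
is persisted (ZK today, possibly the master region later).

```java
import java.util.Optional;

// Hypothetical stand-ins; these are not HBase classes.
enum MetaState { OFFLINE, OPEN }

final class MetaStartupSketch {
  // Models the described AssignmentManager.start step: use the persisted
  // state if there is one, otherwise treat the meta region as OFFLINE.
  static MetaState loadMetaState(Optional<MetaState> persisted) {
    return persisted.orElse(MetaState.OFFLINE);
  }

  // Models the described finishActiveMasterInitialization step: the
  // bootstrap procedure is scheduled only when meta is OFFLINE.
  static boolean shouldScheduleInitMeta(MetaState state) {
    return state == MetaState.OFFLINE;
  }

  public static void main(String[] args) {
    // Nothing persisted: meta defaults to OFFLINE, so the bootstrap is scheduled.
    System.out.println(shouldScheduleInitMeta(loadMetaState(Optional.empty())));
    // A persisted non-offline state: the bootstrap is skipped.
    System.out.println(
        shouldScheduleInitMeta(loadMetaState(Optional.of(MetaState.OPEN))));
  }
}
```

   The point of the sketch is that only the source of the persisted state 
matters; if it survives a restart without ZK, the bootstrap is never scheduled.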
   
   If you can confirm the above, then bringing up this PR and continuing to 
stress the ZooKeeper discussion was my mistake, and I should have learned about 
the master region before opening this PR. (Then we just need to move to the 
upcoming version, and we can still restart in the cloud use cases.)







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-13 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673880783


   > how we could start a cluster with no data on zookeeper? 
   
   IMO the Google design doc could be about the cloud use cases, in which 
clusters have been restarting on just HFiles, without the WAL and without 
ZooKeeper (all user tables are flushed and disabled before the cluster is 
terminated). I know this may not be covered in the [book 
tutorial](https://hbase.apache.org/book.html), so it may be a good time to 
clarify how these cases actually work; some users have been relying on them in 
HBase 1.4.x and maybe in HBase before 2.1.7. Then we can see what gaps there 
may be now in branch-2.2+ to support them again (basically, that's the 
intention of this PR and [PR#2113](https://github.com/apache/hbase/pull/2113)).
   
   > As you want to start the HMaster and recover from the inconsistency
   
   What does `inconsistency` mean here? I see your point about using 
`InitMetaProcedure#INIT_META_WRITE_FS_LAYOUT` to `indicate` inconsistency, but 
if we don't delete meta and just start the cluster, IMO HBCK `-detail` will 
show a clean result without any inconsistency. We may not hit any 
`inconsistency` when entering `InitMetaProcedure`. On this topic, I may just 
start an email thread on whether `InitMetaProcedure` should delete meta without 
checking for `partial` and `consistency`. Please bear with me; this may be the 
only thing I want a quick discussion on, instead of a long design doc on the 
cloud use cases.







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-13 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673751605


   > should not depend on the data on zookeeper.
   
   I agree with you that we may not be ready to stop relying on the data stored 
in ZooKeeper entirely. That is definitely a broader discussion about what HBase 
currently depends on ZooKeeper for (branch-2 and master), especially if data in 
ZooKeeper could be ephemeral or removed. (I thought we were in the process of 
moving data into the ROOT region, aren't we? E.g. 
[Proc-v2](https://issues.apache.org/jira/browse/HBASE-20610).)
   
   Also, my initial goal is that the meta data/directory should not be deleted 
if possible; we're trying to provide a persisted condition so that meta is not 
always deleted when it's not `partial` (protected by the ZK data).
   
   Sorry, I may be a newbie on proc-v2 and the ZK data. Should we start a 
thread on the dev@ list to discuss the following? (My goal is to reach a 
consensus on whether to complete this PR or close it as not fixed.)
   
   1. Should we delete the meta directory when HM starts?
   2. From 2.2+ onward, should we stop depending on the data in ZooKeeper and 
keep more of the info in proc-v2 in the master region?







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-12 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-673253206


   First of all, thanks again, Duo.
   
   > I think for the scenario here, we just need to write the cluster id and 
other things to zookeeper? Just make sure that the current code in HBase will 
not consider us as a fresh new cluster. We do not need to rebuild meta?
   
   So, let me confirm your suggestion: if we add one more field to the ZNode, 
e.g. a boolean `completedMetaBootstrap`, and we find both `clusterId` and 
`completedMetaBootstrap` in ZK, then we will not delete the meta directory?
   
   A follow-up: if the ZK znode data is used to determine whether this is a 
fresh new cluster, can we skip deleting the meta directory when `clusterId` and 
`completedMetaBootstrap` were never set but we find a meta directory? This is 
the cloud use case, where we have no ZK data to base the decision on, so we 
don't know whether the meta is partial. IMO we should just leave the meta 
directory alone; if anything bad happens, the operator can still run HBCK. (If 
we do it the other way around and always delete the meta, we lose the chance 
for the cluster to heal itself, and we still cannot confirm whether the meta is 
partial, can we?)
   
   > For the InitMetaProcedure, the assumption is that, if we found that the 
meta table directory is there, then it means the procedure itself has crashed 
before finishing the creation of meta table, i.e, the meta table is 'partial'. 
So it is safe to just remove it and create again. I think this is a very common 
trick in distributed system for handling failures?
   
   Do you mean that `idempotent` is the trick? `InitMetaProcedure` may be 
idempotent and can bring `hbase:meta` online (as an empty table), but I don't 
think that makes the cluster/HM itself `idempotent` automatically. And yes, the 
content of the original meta can be rebuilt with the help of HBCK. But if HM 
continues the flow with some existing data, e.g. the namespace table (sorry, in 
branch-2 we still have the namespace table), and restarts with an empty meta, 
then based on the experiment I did, the cluster hangs and HM cannot be 
initialized.
   
   If we step back and think about the definition of a `partial` meta, it would 
be great if the meta table itself could tell us whether it's partial, because 
it's still a table in HBase and HFiles are immutable. E.g. can we tell whether 
a user table is partial by looking at its data? I may be wrong, but it seems we 
cannot tell from the HFiles themselves, and we need ZK and the WAL to define 
it.
   
   So, again, IMO the data in a table is sensitive, especially in the meta 
table, and I'm proposing not to delete meta here if possible (deleting it 
amounts to running hbck to delete and rebuild).
   
   Based on our discussion here, IMO two proposals have been mentioned for 
defining a `partial meta`:
   
   1. Add a boolean to the WAL, as proc-level data.
   2. Write a boolean to the ZNode to tell whether the bootstrap completed.
   *. Whichever of 1) and 2) we choose, there is an additional condition: if we 
don't find any WAL or ZK data about this, we should not delete the meta table.
   
   2) + *) seems to be the simplest solution. What do you guys think?
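   
   To make 2) + *) concrete, here is a minimal sketch of the decision it 
implies. All names are hypothetical and none of this exists in HBase: the 
`Optional<Boolean>` models the proposed bootstrap-completed flag (empty means 
no ZK data was found at all), and treating a present-but-false flag as a 
`partial` meta is my assumption about how the flag would be used.

```java
import java.util.Optional;

// Hypothetical names; none of these types exist in HBase.
enum MetaAction { CREATE_NEW, DELETE_AND_RECREATE, KEEP_EXISTING }

final class PartialMetaDecision {
  static MetaAction decide(boolean metaDirExists,
                           Optional<Boolean> completedMetaBootstrap) {
    if (!metaDirExists) {
      // Nothing on disk: a normal first bootstrap.
      return MetaAction.CREATE_NEW;
    }
    if (!completedMetaBootstrap.isPresent()) {
      // Condition *): no flag found, so there is no evidence that the
      // meta is partial and it must not be deleted.
      return MetaAction.KEEP_EXISTING;
    }
    // Flag present: only an unfinished bootstrap counts as partial.
    return completedMetaBootstrap.get()
        ? MetaAction.KEEP_EXISTING
        : MetaAction.DELETE_AND_RECREATE;
  }

  public static void main(String[] args) {
    // Cloud restart: meta directory exists, no ZK data.
    System.out.println(decide(true, Optional.empty()));
    // Bootstrap crashed half-way: flag recorded but not completed.
    System.out.println(decide(true, Optional.of(false)));
    // Ordinary restart of a healthy cluster.
    System.out.println(decide(true, Optional.of(true)));
    // Brand-new cluster.
    System.out.println(decide(false, Optional.empty()));
  }
}
```

   The only case that deletes meta is the one where we have positive evidence 
of an unfinished bootstrap; every other case with an existing directory keeps 
it.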
   







[GitHub] [hbase] taklwu commented on pull request #2237: HBASE-24833: Bootstrap should not delete the META table directory if …

2020-08-11 Thread GitBox


taklwu commented on pull request #2237:
URL: https://github.com/apache/hbase/pull/2237#issuecomment-672204387


   Thanks @Apache9. I want to agree with you on having an HBCK option, but one 
concern I keep struggling with is making this automated rather than an HBCK 
option. If an HBase cluster has hundreds of tables with thousands of regions, 
how would the operator recover the cluster? Would he/she repair the meta table 
(offline/online) by scanning the storage of each region? (Couldn't we instead 
just load the meta without rebuilding it?)
   
   Tbh, I feel bad about raising this meta table issue, because a normal HBase 
cluster does not assume that ZooKeeper (and the WAL) could be gone after the 
cluster starts and restarts.
   
   For this PR/JIRA, I'm mainly questioning what a `partial meta` should be. 
Any thoughts?


