Thanks for following up, Chunhui. That makes sense. We will need HBCK to be aware of that. A first easy fix might be just to display a warning. A second one would be to handle the situation.
So we only have the meta issue remaining now ;)

JM

2013/12/17 Chunhui Shen <[email protected]>

> About the online merge:
>
> HBCK will report an error now after the online merge,
> because the files of merging regions still remain on HDFS which will be
> cleaned by CatalogJanitor later.
>
> In the merge process, we create file references instead of moving files
> together because the latter will break Table Snapshot.
> Thus, we couldn't remove these files until the merged region completes
> compaction.
>
> Thanks for the feedback.
>
> I will enhance HBCK to handle this case.
>
> At 2013-12-18 03:21:42,"Jean-Marc Spaggiari" <[email protected]>
> wrote:
> >So. Some feedback.
> >
> >0.94.x gives "Status: OK" in HBCK.
> >
> >Did a distcp between the 2 clusters, removed splitlog since I'm not able to
> >change the owner to my HBase user, did the upgrade, started.
> >
> >I can see all my tables correctly, able to scan them.
> >
> >HBCK reports all the tables as okay, even the hbase:meta table, however,
> >I'm getting this:
> >"ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta"
> >
> >Ran hbck with -fixEmptyMetaCells
> >Reran it. All clear now.
> >
> >Now, I played with the online merge, and I'm still getting errors but they
> >seem to just be bad timing.
> >
> >tl;dr: jump to the arrow below.
> >
> >There are initially 4 regions in the table. I merge the first 2
> >together. That creates a 3 region table. I merge again the first 2
> >together. I wait a few minutes, and I run HBCK.
> >
> >ERROR: Region { meta => null, hdfs =>
> >hdfs://hbasetest1:9000/hbase/data/default/dns/c6569a72cc3c2750d14976ab85f02315,
> >deployed => } on HDFS, but not listed in hbase:meta or deployed on any
> >region server
> >ERROR: Region { meta => null, hdfs =>
> >hdfs://hbasetest1:9000/hbase/data/default/dns/efa630782e1d603fbc239a11ab292957,
> >deployed => } on HDFS, but not listed in hbase:meta or deployed on any
> >region server
> >
> >I merged those 4 regions:
> >merge_region 'bb65f685cdefc4f2491d246f376fc1f0', 'd02ce8e3fa1a200c7f034b349acf8cc8'
> >merge_region 'efa630782e1d603fbc239a11ab292957', 'c6569a72cc3c2750d14976ab85f02315'
> >
> >And here is the HDFS content after the merge:
> >drwxr-xr-x - hbase hbase 0 2013-12-17 13:35 /hbase/data/default/dns/c6569a72cc3c2750d14976ab85f02315
> >drwxr-xr-x - hbase hbase 0 2013-12-17 13:35 /hbase/data/default/dns/d5b74aaa2853b00b0ad0f20f60c74398
> >drwxr-xr-x - hbase hbase 0 2013-12-17 13:46 /hbase/data/default/dns/efa630782e1d603fbc239a11ab292957
> >drwxr-xr-x - hbase hbase 0 2013-12-17 13:46 /hbase/data/default/dns/f2e0764d4e9dea8bfc0aeed9da3da5f7
> >
> >And the table in the WebUI:
> >dns,,1387305985379.f2e0764d4e9dea8bfc0aeed9da3da5f7.
> >dns,theafronews.ca,1379202071281.d5b74aaa2853b00b0ad0f20f60c74398.
> >
> >Regions efa630782e1d603fbc239a11ab292957 and
> >c6569a72cc3c2750d14976ab85f02315 should not be there anymore.
> >
> >Waiting even longer, they are now removed and hbck reports everything is
> >correct.
> >
> >I know there are some people who are running hbck -repair as a cron job.
> >If that occurs while the regions just got merged, it might re-create the
> >entries in the meta based on the hdfs content and they will have overlaps
> >and duplicates.
> >
> >===> So to summarize, it seems that the merge happens pretty quickly, but it waits
> >for the CatalogJanitor to remove the directories left over by the process.
> >I think the merge process should remove those files and not rely on the
> >catalog janitor. I did the test multiple times. The first time it took about 30
> >seconds for the janitor to clear the paths. But the 2nd time it took 4
> >minutes for the janitor to run and to clear the files...
> >
> >One last small thing. There is no longer a split button in the WebUI. When
> >you don't want to split based on a specific key, it's not obvious that you
> >have to go into the empty field and press enter.
> >
> >JM
> >
> >2013/12/17 Stack <[email protected]>
> >
> >> On Tue, Dec 17, 2013 at 7:38 AM, Jean-Marc Spaggiari <
> >> [email protected]> wrote:
> >>
> >> > Sorry about that mates, I know I'm late. I was fighting against snappy
> >> > codec for the last few days and was not able to correctly startup my
> >> > 0.96.1 version.
> >> >
> >> > So since it's already over, I have done a reduced test phase.
> >> > Verified the signature, checked the documentation and the CHANGES.txt
> >> > file.
> >> > distcp 2TB from a 0.94.x/hadoop 1.0.3 cluster to 0.96.1/hadoop 2.2.0.
> >> > Ran the migration tool.
> >> > online merged an entire table to a single region.
> >> >
> >>
> >> Thank you JMS.
> >>
> >> > At the end of all of that I have some inconsistencies in the system
> >> > reported by HBCK. (Extra regions, empty regioninfo_qualifier in the
> >> > meta, etc.).
> >> >
> >> > I will redo all the steps I did one by one and run HBCK between each to
> >> > see where it failed and report what I found. Next step will be to enable
> >> > replication between my 0.94 and my 0.96 clusters.
> >> >
> >>
> >> That'd be really helpful. Thanks.
> >>
> >> St.Ack
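
For reference, here is a rough sketch of the "display a warning" idea discussed in this thread. It is not HBCK code, and nothing in it is an HBase API: the function `classify_orphans`, the `Finding` type and the `pending_merge_parents` argument are all hypothetical names made up for illustration. It assumes the checker can somehow be given the set of regions that were recently merged away and are still waiting for the CatalogJanitor; how that set is obtained is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    region: str
    severity: str  # "WARNING" or "ERROR"
    reason: str


def classify_orphans(hdfs_regions, meta_regions, pending_merge_parents):
    """Classify region directories present on HDFS but absent from meta.

    All three arguments are iterables of encoded region names.
    pending_merge_parents is a hypothetical input: regions known to have
    been merged away whose directories have not been cleaned up yet.
    """
    pending = set(pending_merge_parents)
    findings = []
    # Directories on HDFS with no matching entry in meta are the candidates.
    for region in sorted(set(hdfs_regions) - set(meta_regions)):
        if region in pending:
            # Leftover of a merge: expected to disappear after cleanup.
            findings.append(
                Finding(region, "WARNING", "merged region awaiting cleanup"))
        else:
            findings.append(
                Finding(region, "ERROR", "on HDFS, but not listed in meta"))
    return findings


if __name__ == "__main__":
    # Region names taken from the HDFS listing and WebUI output above.
    hdfs = [
        "c6569a72cc3c2750d14976ab85f02315",
        "d5b74aaa2853b00b0ad0f20f60c74398",
        "efa630782e1d603fbc239a11ab292957",
        "f2e0764d4e9dea8bfc0aeed9da3da5f7",
    ]
    meta = [
        "f2e0764d4e9dea8bfc0aeed9da3da5f7",
        "d5b74aaa2853b00b0ad0f20f60c74398",
    ]
    merged_away = [
        "efa630782e1d603fbc239a11ab292957",
        "c6569a72cc3c2750d14976ab85f02315",
    ]
    for finding in classify_orphans(hdfs, meta, merged_away):
        print(finding.severity, finding.region, finding.reason)
```

With a check of this kind, a repair run started from cron could skip anything reported as a warning instead of re-creating meta entries for it, which is the overlap and duplicate risk raised above.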
