Thanks. Well, it is now on the CDH community forum too. Jonathan Hsieh pretty much described what I see in his comment on HBASE-12332: https://issues.apache.org/jira/browse/HBASE-12332?focusedCommentId=14241478&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14241478

On Wed, Oct 12, 2016 at 7:51 PM, Huaxiang Sun <h...@cloudera.com> wrote:
> Hi Tim,
>
> Just read more details, it may not be related with the issue we fixed (mob compaction related).
> I am doing a similar test to see if I can reproduce it.
>
> Thanks,
> Huaxiang
>
> On Oct 12, 2016, at 10:29 AM, Tim Robertson <timrobertson...@gmail.com> wrote:
> >
> > Thanks Ted, Huaxiang
> >
> > I'll move this to a Cloudera forum and comment back here if it appears unrelated.
> >
> > On Wed, Oct 12, 2016 at 7:24 PM, Huaxiang Sun <h...@cloudera.com> wrote:
> >
> >> By the way, I forgot the forum link: http://community.cloudera.com
> >>
> >> Thanks,
> >> Huaxiang
> >>
> >>> On Oct 12, 2016, at 10:10 AM, Huaxiang Sun <h...@cloudera.com> wrote:
> >>>
> >>> Hi Tim,
> >>>
> >>> I believe that it runs into an issue which is specific to cloudera release we fixed recently. For details, could you discuss it in cdh forum?
> >>> Copy me (h...@cloudera.com) in the forum so I can explain more there.
> >>>
> >>> Thanks,
> >>> Huaxiang
> >>>
> >>>> On Oct 12, 2016, at 8:13 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>>
> >>>> Have you looked at HBASE-16578 ?
> >>>>
> >>>> Cheers
> >>>>
> >>>>> On Oct 12, 2016, at 8:02 AM, Tim Robertson <timrobertson...@gmail.com> wrote:
> >>>>>
> >>>>> Hi devs,
> >>>>> [Had a quick chat with Lars G. about this and before opening a Jira I thought I'd raise it here first]
> >>>>>
> >>>>> We have just experienced data loss in HBase 1.0.0-cdh5.4.10.
> >>>>>
> >>>>> Before I dig into this further, I'd like to just ask if anyone has seen this before?
> >>>>>
> >>>>> The initial state was a table (tim_test) built with MOB support and a few 10's million rows and 10's billions of cells.
> >>>>>
> >>>>> I wanted to rename the table to get this into production and did so as follows:
> >>>>>
> >>>>> snapshot 'tim_test', 'tim_test-snapshot'
> >>>>> clone_snapshot 'tim_test-snapshot', 'prod_b_map'
> >>>>>
> >>>>> At this stage the application all looked good, and so I continued with:
> >>>>>
> >>>>> delete_snapshot 'tim_test-snapshot'
> >>>>> disable 'tim_test'
> >>>>> drop 'tim_test'
> >>>>>
> >>>>> Then things went... awry and data just started dropping out in the app.
> >>>>> Before long, all MOB data seemingly is gone.
> >>>>>
> >>>>> The references in the new table MOB folder appear to point to the source table (e.g.
> >>>>> /hbase/mobdir/data/default/prod_b_map/ba42a2e8e9b669d9fc85bdfeed2f5f2a/EPSG_4326/tim_test=14bf5f1737ac65c34615ed97c0b7de06-d41d8cd98f00b204e9800998ecf8427e20161006ff8baa70d21f408caefe8ae6318dfba2).
> >>>>>
> >>>>> The RS logs full of ERROR like:
> >>>>>
> >>>>> 2016-10-12 15:19:14,640 ERROR org.apache.hadoop.hbase.regionserver.HStore:
> >>>>> The mob file d41d8cd98f00b204e9800998ecf8427e20161006b59865f80e604781a79ebfa2ddd66b48
> >>>>> could not be found in the locations
> >>>>> [hdfs://ha-nn/hbase/mobdir/data/default/tim_test/14bf5f1737ac65c34615ed97c0b7de06/EPSG_4326,
> >>>>> hdfs://ha-nn/hbase/archive/data/default/tim_test/14bf5f1737ac65c34615ed97c0b7de06/EPSG_4326]
> >>>>>
> >>>>> What I don't know is:
> >>>>> 1) was this running a background task to copy the MOB data when the snapshot was cloned and I just deleted the source before the copy was complete?
> >>>>> - or
> >>>>> 2) when running "snapshot and clone" it just references the source MOB data until a (?) change?
> >>>>> 3) snapshot and clone just doesn't support MOB?
> >>>>>
> >>>>> Can anyone shed some light on this easily before I dig into it please?
> >>>>>
> >>>>> While this situation exists (at least in 1.0.0) might it be good to get info about data loss for MOB tables into the snapshot clone docs?
> >>>>>
> >>>>> Thanks,
> >>>>> Tim