HBASE-4838 ports HBASE-2856 to 0.92 FYI

On Thu, Feb 2, 2012 at 4:46 PM, Cosmin Lehene <cleh...@adobe.com> wrote:

> (sorry for the damaged subject :))
>
> Hey Jon,
> We have two column families.
> There are no filters and there's a full table scan. We're not skipping
> rows.
> I did see, however, a single time that we had one qualifier "fault" in
> the job counters (it was missing, and it wasn't supposed to be missing).
> However, that was only once and it doesn't happen when we encounter
> missing rows.
>
> We're getting this behavior consistently, although I couldn't figure out
> a way to reproduce it. I'll try running multiple instances of the job in
> parallel to figure out if that would affect the outcome.
> I'll probably have to add more debugging for the affected rows and dig
> deeper.
>
> HBASE-2856 is a pretty large issue - do you think it could be related to
> what I'm seeing? If so it could help me reproduce it.
>
> Thanks,
> Cosmin
>
> On 2/1/12 11:30 PM, "Jonathan Hsieh" <j...@cloudera.com> wrote:
>
> >Cosmin,
> >
> >How many column families do you have in this table? Are you using any
> >filters in your HBase scans? Are you skipping rows that may not have
> >qualifiers present?
> >
> >There are a few known issues with multi-CF atomicity and a recent one
> >about flushes that may be related to this problem. There's HBASE-2856, a
> >fix having to do with flushes which is pretty intricate and only in 0.92.
> >
> >Jon.
> >
> >On Wed, Feb 1, 2012 at 8:46 PM, Cosmin Lehene <cleh...@adobe.com> wrote:
> >
> >> We have a MR job that runs every few minutes on some time series data
> >> which is continuously updated (never deleted).
> >> Every few runs (in the range of tens to hundreds), the map task that
> >> covers the last region will get fewer input records (off by 500-5000
> >> rows) without any splits happening. This lower number of input records
> >> could persist for a few MR runs, but will eventually get back to the
> >> "correct" value.
> >>
> >> This drop can be seen in the "map input records" metric, and it's
> >> correlated with the metrics that get computed by the MR job (so it's
> >> not a MR counter bug).
> >>
> >> There are no exceptions in the MR job or in the region server, and this
> >> doesn't seem to be correlated with any compaction, split or region
> >> movement.
> >> The only "variable" in this scenario is that new data gets injected
> >> continuously (and the actual MR job, which is idempotent).
> >>
> >> This entire puzzle takes place on HBase 0.90.5 ish (12 dec 2011) on
> >> top of Hadoop cdh3u2.
> >>
> >> Cosmin
> >
> >--
> >// Jonathan Hsieh (shay)
> >// Software Engineer, Cloudera
> >// j...@cloudera.com
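
One way to approach the "more debugging for the affected rows" step mentioned in the thread: since the data is never deleted, any row key seen in one run should still be visible in the next. The sketch below is illustrative only and not part of the job discussed above. It assumes each run can dump the row keys its map task read (for example to a file), and the function names `missing_rows` and `contiguous_gaps` are hypothetical. Plain strings stand in for row keys here.

```python
def missing_rows(previous_keys, current_keys):
    """Return keys present in the previous run but absent from the current one.

    With append-only data this list should be empty; anything in it is a
    row that the later scan failed to return.
    """
    current = set(current_keys)
    return [key for key in sorted(set(previous_keys)) if key not in current]


def contiguous_gaps(previous_keys, current_keys):
    """Group missing keys into runs that are adjacent in the previous scan order.

    Returns a list of (first_key, last_key, count) tuples. A few large runs
    point at whole key ranges vanishing; many tiny runs point at scattered rows.
    """
    current = set(current_keys)
    gaps = []
    run = []
    for key in sorted(set(previous_keys)):
        if key not in current:
            run.append(key)
        elif run:
            # A present key ends the current run of missing keys.
            gaps.append((run[0], run[-1], len(run)))
            run = []
    if run:
        gaps.append((run[0], run[-1], len(run)))
    return gaps


if __name__ == "__main__":
    previous = ["r01", "r02", "r03", "r04", "r05", "r06"]
    current = ["r01", "r04", "r06", "r07"]  # r07 is new data, which is expected
    print(missing_rows(previous, current))
    print(contiguous_gaps(previous, current))
```

Comparing a "short" run against the preceding "correct" run this way would show whether the 500-5000 missing rows are clustered in a few key ranges or spread out, which may help narrow down whether a flush-related issue is plausible.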