Thanks Ted! I wonder if it would make more sense to port it to 0.90.X or upgrade to 0.92.

Cosmin

On 2/2/12 5:03 PM, "Ted Yu" <yuzhih...@gmail.com> wrote:

>HBASE-4838 ports HBASE-2856 to 0.92
>
>FYI
>
>On Thu, Feb 2, 2012 at 4:46 PM, Cosmin Lehene <cleh...@adobe.com> wrote:
>
>> (sorry for the damaged subject :))
>>
>> Hey Jon,
>> We have two column families.
>> There are no filters and there's a full table scan. We're not skipping
>> rows.
>> I did see however a single time that we had one qualifier "fault" in the
>> job counters (it was missing, and it wasn't supposed to be missing).
>> However that was only once and it doesn't happen when we encounter
>> missing rows.
>>
>> We're getting this behavior consistently although I couldn't figure out
>> a way to reproduce it. I'll try running multiple instances of the job in
>> parallel to figure out if that would affect the outcome.
>> I'll probably have to add more debugging for the affected rows and dig
>> deeper.
>>
>> HBASE-2856 is a pretty large issue - do you think it could be related to
>> what I'm seeing? If so it could help me reproduce it.
>>
>> Thanks,
>> Cosmin
>>
>> On 2/1/12 11:30 PM, "Jonathan Hsieh" <j...@cloudera.com> wrote:
>>
>> >Cosmin,
>> >
>> >How many column families do you have in this table? Are you using any
>> >filters in your HBase scans? Are you skipping rows that may not have
>> >qualifiers present?
>> >
>> >There are a few known issues with multi-CF atomicity and a recent one
>> >about flushes that may be related to this problem. There is HBASE-2856,
>> >a fix having to do with flushes which is pretty intricate and only in
>> >0.92.
>> >
>> >Jon.
>> >
>> >On Wed, Feb 1, 2012 at 8:46 PM, Cosmin Lehene <cleh...@adobe.com> wrote:
>> >
>> >> We have a MR job that runs every few minutes on some time series data
>> >> which is continuously updated (never deleted).
>> >> Every few (in the range of tens to hundreds) runs the map task that
>> >> covers the last region will get fewer input records (off by 500-5000
>> >> rows) without any splits happening. This lower number of input records
>> >> could persist for a few MR runs, but will eventually get back to the
>> >> "correct" value.
>> >>
>> >> This drop can be seen in the "map input records" metric, and it's
>> >> correlated with the metrics that get computed by the MR job (so it's
>> >> not a MR counter bug).
>> >>
>> >> There are no exceptions in the MR job, or in the region server and this
>> >> doesn't seem to be correlated with any compaction, split or region
>> >> movement.
>> >> The only "variable" in this scenario is that new data gets injected
>> >> continuously (and the actual MR job which is idempotent).
>> >>
>> >> This entire puzzle takes place on HBase 0.90.5 ish (12 dec 2011) on
>> >> top of Hadoop cdh3u2.
>> >>
>> >> Cosmin
>> >
>> >--
>> >// Jonathan Hsieh (shay)
>> >// Software Engineer, Cloudera
>> >// j...@cloudera.com
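
One possible way to add the extra debugging for the affected rows mentioned in the thread: because the data is only ever added or updated and never deleted, any row key that a previous run saw but the current run did not is a dropped row. The following is a minimal sketch in plain Java, assuming each run can dump the set of row keys its map tasks received. The RowDiff class and the use of String for row keys are hypothetical and are not part of HBase or Hadoop.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical debugging helper; not part of HBase or Hadoop. */
final class RowDiff {
    private RowDiff() {}

    /**
     * Returns the row keys seen in the previous run but absent from the
     * current run, in sorted order. Neither input set is modified.
     * Assumption: row keys are represented as Strings here for readability.
     */
    static SortedSet<String> missingRows(Set<String> previousRun, Set<String> currentRun) {
        SortedSet<String> missing = new TreeSet<>(previousRun); // copy, then subtract
        missing.removeAll(currentRun);
        return missing;
    }

    public static void main(String[] args) {
        // Made-up keys for illustration only.
        Set<String> previous = new HashSet<>(Arrays.asList("row-0001", "row-0002", "row-0003"));
        Set<String> current = new HashSet<>(Arrays.asList("row-0001", "row-0003", "row-0004"));
        System.out.println(missingRows(previous, current)); // prints [row-0002]
    }
}
```

Since rows are never deleted, a non-empty result for two consecutive runs points at the exact keys to inspect further (for example, which column families and qualifiers they carry). Rows that are new in the current run are ignored on purpose.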