Re: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.

Ted Yu Tue, 20 Feb 2018 21:16:20 -0800

If you look at
https://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_fixed_in_58.html#fixed_issues585
, you would see the following:


HBASE-15378 - Scanner cannot handle heartbeat message with no results

which fixed what you observed in previous release.

FYI

On Tue, Feb 20, 2018 at 9:07 PM, Andrew Kettmann <
[email protected]> wrote:

> Josh,
>
> We upgraded from CDH 5.8.0 -> 5.8.5 seems to have fixed the issue. 3
> Rowcounts in a row that were not consistent before on a static table are
> now consistent. We are doing some further testing but it looks like you
> called it with:
>
> 'scans on RegionServers stop prematurely before all of the data is read'
>
> Thanks for the pointer in that direction, I was bashing my face against
> this for two weeks trying to figure out this inconsistency. I appreciate
> the clue!
>
> Andrew Kettmann
> Consultant, Platform Services Group
>
> -----Original Message-----
> From: Josh Elser [mailto:[email protected]]
> Sent: Monday, February 12, 2018 11:59 AM
> To: [email protected]
> Subject: Re: Inconsistent rows exported/counted when looking at a set,
> unchanged past time frame.
>
> Hi Andrew,
>
> Yes. The answer is, of course, that you should see consistent results from
> HBase if there are no mutations in flight to that table. Whether you're
> reading "current" or "back-in-time", as long as you're not dealing with raw
> scans (where compactions may persist delete tombstones), this should hold
> just the same.
>
> Are you modifying older cells with newer data when you insert data?
> Remember that MAX_VERSIONS for a table defaults to 1. Consider the
> following:
>
> * Timestamps are of the form "tX", and t1 < t2 < t3 < ..
> * You are querying from the time range: [t1, t5].
> * You have a cell for "row1" with at t3 with value "foo".
> * RowCounter over [t1, t5] would return "1"
> * Your ingest writes a new cell for "row1" of "bar" at t6.
> * RowCounter over [t1, t5] would return "0" normally, or "1" is you use
> RAW scans ***
> * A compaction would run over the region containing "row1"
> * RowCounter over [t1, t5] would return "0" (RAW or normal)
>
> It's also possible that you're hitting some sort of bug around missing
> records at query time. I'm not sure what the CDH versions you're using line
> up to, but there have certainly been issues in the past around query-time
> data loss (e.g. scans on RegionServers stop prematurely before all of the
> data is read).
>
> Good luck!
>
> *** Going off of memory here. I think this is how it works, but you should
> be able to test easily ;)
>
> On 2/9/18 5:30 PM, Andrew Kettmann wrote:
> > A simpler question would be this:
> >
> > Given:
> >
> >
> >    *   a set timeframe in the past (2-3 days roughly a year ago)
> >    *   we are NOT removing records from the table at all
> >    *   We ARE inserting into this table actively
> >
> > Should I expect two consecutive runs of the rowcounter mapreduce job to
> return an identical number?
> >
> >
> > Andrew Kettmann
> > Consultant, Platform Services Group
> >
> > From: Andrew Kettmann
> > Sent: Thursday, February 08, 2018 11:35 AM
> > To: [email protected]
> > Subject: Inconsistent rows exported/counted when looking at a set,
> unchanged past time frame.
> >
> > First the version details:
> >
> > Running HBASE/Yarn/HDFS using Cloudera manager 5.12.1.
> > Hbase: Version 1.2.0-cdh5.8.0
> > HDFS/YARN: Hadoop 2.6.0-cdh5.8.0
> > Hbck and hdfs fsck return healthy
> >
> > 15 nodes, sized down recently from 30 (other service requirements
> > reduced. Solr, etc)
> >
> >
> > The simplest example of the inconsistency is using rowcounter. If I run
> the same mapreduce job twice in a row, I get different counts:
> >
> > hbase org.apache.hadoop.hbase.mapreduce.Driver rowcounter
> > -Dmapreduce.map.speculative=false TABLENAME --starttime=1485907200000
> > --endtime=1486058400000
> >
> > Looking at org.apache.hadoop.hbase.mapreduce.RowCounter$
> RowCounterMapper$Counters:
> > Run 1: 4876683
> > Run 2: 4866351
> >
> > Similarly with exports of the same date/time. Consecutive runs of the
> export get different results:
> > hbase org.apache.hadoop.hbase.mapreduce.Export \
> > -Dmapred.map.tasks.speculative.execution=false \
> > -Dmapred.reduce.tasks.speculative.execution=false \ TABLENAME \
> > HDFSPATH 1 1485907200000 1486058400000
> >
> >  From Map Input/output records:
> > Run 1: 4296778
> > Run 2: 4297307
> >
> > None of the results show anything for spilled records, no failed maps.
> Sometimes the row count increases, sometimes it decreases. We aren’t using
> any row filter queries, we just want to export chunks of the data for a
> specific time range. This table is actively being read/written to, but I am
> asking about a date range in early 2017 in this case, so that should have
> no impact I would have thought. Another point is that the rowcount job and
> the export return ridiculously different numbers. There should be no older
> versions of rows involved as we are set to only keep the newest, and I can
> confirm that there are rows that are consistently missing from the exports.
> Table definition is below.
> >
> > hbase(main):001:0> describe 'TABLENAME'
> > Table TABLENAME is ENABLED
> > TABLENAME
> > COLUMN FAMILIES DESCRIPTION
> > {NAME => 'text', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',
> > REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1',
> > MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'FALSE',
> > BLO CKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
> > 1 row(s) in 0.2800 seconds
> >
> > Any advice/suggestions would be greatly appreciated, are some of my
> assumptions wrong regarding import/export and that it should be consistent
> given consistent date/times?
> >
> >
> > Andrew Kettmann
> > Platform Services Group
> >
>

Re: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.

Reply via email to