Awesome! Thanks for the feedback!

J-D
On Thu, Nov 17, 2011 at 11:07 PM, Stuti Awasthi <stutiawas...@hcl.com> wrote:
> Hi JD,
> I have applied the patch and tested it also, it's working fine now. :) Thanks
>
> -----Original Message-----
> From: Stuti Awasthi
> Sent: Friday, November 18, 2011 11:27 AM
> To: user@hbase.apache.org
> Subject: RE: Facing Issues with RowCounter
>
> Ok.
> Thanks for the update. I'll check the patch, else I can write my own MR for
> row count.
>
> Cheers
> Stuti
>
> -----Original Message-----
> From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of Jean-Daniel
> Cryans
> Sent: Friday, November 18, 2011 3:37 AM
> To: user@hbase.apache.org
> Subject: Re: Facing Issues with RowCounter
>
> Ah! Took me a moment to figure it out, it's:
>
> https://issues.apache.org/jira/browse/HBASE-4295 "rowcounter does not return
> the correct number of rows in certain circumstances"
>
> What made me think about it is that your counters do say that rows were taken
> into input, but none counted because the values are empty.
> That was the problem in 4295.
>
> The patch is currently only in the tip of the 0.90 branch, so unless you
> patch it yourself you'll have to wait for 0.90.5 (which may or may not get
> released, depends if someone wants to do it).
>
> J-D
>
> On Wed, Nov 16, 2011 at 9:27 PM, Stuti Awasthi <stutiawas...@hcl.com> wrote:
>> Hi JD,
>>
>> Table 'Keyword' contains 'Set' column family with 7 rows.
>> Here is the output of scan:
>>
>> hbase(main):001:0> scan 'Keyword',{COLUMNS=>['Set']}
>> ROW       COLUMN+CELL
>>  Apache   column=Set:Fuse, timestamp=1321506922206, value=
>>  Apache   column=Set:Hadoop, timestamp=1321506922206, value=
>>  Apache   column=Set:Hive, timestamp=1321506922206, value=
>>  Apache   column=Set:MySql, timestamp=1321506922206, value=
>>  Apache   column=Set:PHP, timestamp=1321506922206, value=
>>  Fuse     column=Set:Apache, timestamp=1321506922206, value=
>>  Fuse     column=Set:Hdfs, timestamp=1321506922209, value=
>>  Hadoop   column=Set:Apache, timestamp=1321506922209, value=
>>  Hadoop   column=Set:Hive, timestamp=1321506922212, value=
>>  Hdfs     column=Set:Fuse, timestamp=1321506922212, value=
>>  Hive     column=Set:Apache, timestamp=1321506922212, value=
>>  Hive     column=Set:Hadoop, timestamp=1321506922214, value=
>>  MySql    column=Set:Apache, timestamp=1321506922214, value=
>>  MySql    column=Set:PHP, timestamp=1321506922216, value=
>>  PHP      column=Set:Apache, timestamp=1321506922216, value=
>>  PHP      column=Set:MySql, timestamp=1321506922218, value=
>> 7 row(s) in 0.4120 seconds
>>
>> This output is not shown in RowCounter MR job.
>>
>> -----Original Message-----
>> From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
>> Jean-Daniel Cryans
>> Sent: Wednesday, November 16, 2011 11:09 PM
>> To: user@hbase.apache.org
>> Subject: Re: Facing Issues with RowCounter
>>
>> What I can decrypt from those outputs is that you have a total of 7 rows,
>> and none of them have data in the "Set" column family. Is it the case or
>> not? Without more info from you, it's hard to tell.
>>
>> J-D
>>
>> On Tue, Nov 15, 2011 at 11:41 PM, Stuti Awasthi <stutiawas...@hcl.com> wrote:
>>> Hi,
>>> I tried to use MR RowCounter to count the rows of a table with specific
>>> column family. But it is not displaying correct result.
>>>
>>> Command (Only Table Name as argument): Hbase/hbase-0.90.3/bin/hbase
>>> org.apache.hadoop.hbase.mapreduce.RowCounter Keyword
>>>
>>> Output:
>>> 11/11/16 13:04:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>>> 11/11/16 13:04:32 INFO mapred.JobClient:  map 100% reduce 0%
>>> 11/11/16 13:04:32 INFO mapred.JobClient: Job complete: job_local_0001
>>> 11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
>>> 11/11/16 13:04:32 INFO mapred.JobClient:   org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     ROWS=7
>>> 11/11/16 13:04:32 INFO mapred.JobClient:   FileSystemCounters
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_READ=2373099
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411923
>>> 11/11/16 13:04:32 INFO mapred.JobClient:   Map-Reduce Framework
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     Map input records=7
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     Spilled Records=0
>>> 11/11/16 13:04:32 INFO mapred.JobClient:     Map output records=0
>>>
>>> Command (TableName, ColumnFamily): Hbase/hbase-0.90.3/bin/hbase
>>> org.apache.hadoop.hbase.mapreduce.RowCounter Keyword Set
>>>
>>> Output:
>>> 11/11/16 13:05:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
>>> 11/11/16 13:05:34 INFO mapred.JobClient:  map 100% reduce 0%
>>> 11/11/16 13:05:34 INFO mapred.JobClient: Job complete: job_local_0001
>>> 11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
>>> 11/11/16 13:05:34 INFO mapred.JobClient:   FileSystemCounters
>>> 11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_READ=2373107
>>> 11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411939
>>> 11/11/16 13:05:34 INFO mapred.JobClient:   Map-Reduce Framework
>>> 11/11/16 13:05:34 INFO mapred.JobClient:     Map input records=7
>>> 11/11/16 13:05:34 INFO mapred.JobClient:     Spilled Records=0
>>> 11/11/16 13:05:34 INFO mapred.JobClient:     Map output records=0
>>>
>>> Table Describe command output is:
>>> TABLE => {{NAME => 'Keyword', FAMILIES => [{NAME => 'Info',
>>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION =>
>>> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
>>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'Set',
>>> BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION =>
>>> 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536',
>>> IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>>>
>>> Am I executing it the wrong way, or is this a bug?
>>>
>>> Regards,
>>> Stuti Awasthi
>>> HCL Comnet Systems and Services Ltd
>>> F-8/9 Basement, Sec-3,Noida.
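
For readers following the thread, the counting difference J-D describes (rows taken as map input but not counted because their values are empty) can be illustrated with a small stand-alone model. This is only a sketch in plain Python, not HBase code: the function names `count_rows_requiring_value` and `count_rows` are hypothetical, and the data simply mirrors the 'Set' family scan output quoted above, where every cell has an empty value.

```python
# Rows of the 'Set' column family as shown in the scan output above.
# Each row maps qualifier -> value; every value in the scan is empty.
SET_FAMILY = {
    "Apache": {"Fuse": b"", "Hadoop": b"", "Hive": b"", "MySql": b"", "PHP": b""},
    "Fuse": {"Apache": b"", "Hdfs": b""},
    "Hadoop": {"Apache": b"", "Hive": b""},
    "Hdfs": {"Fuse": b""},
    "Hive": {"Apache": b"", "Hadoop": b""},
    "MySql": {"Apache": b"", "PHP": b""},
    "PHP": {"Apache": b"", "MySql": b""},
}


def count_rows_requiring_value(rows):
    """Hypothetical model of the behaviour described in the thread:
    a row is counted only if at least one of its cells has a non-empty value."""
    return sum(
        1
        for cells in rows.values()
        if any(len(value) > 0 for value in cells.values())
    )


def count_rows(rows):
    """Count every row that has at least one cell, whatever the cell values are."""
    return sum(1 for cells in rows.values() if cells)


if __name__ == "__main__":
    # All 7 rows are read, but none has a non-empty value in 'Set'.
    print("rows with a non-empty value:", count_rows_requiring_value(SET_FAMILY))
    print("rows with at least one cell:", count_rows(SET_FAMILY))
```

Under this model the value-dependent count is 0 while the cell-presence count is 7, which matches the second job output above (7 map input records, no ROWS counter) and the 7 rows reported by the shell scan.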