The -limit passed to HBaseStorage is applied per mapper (one mapper per region) reading from HBase, not to the job as a whole — your logs show 51 input splits, which is why -limit 5 produced roughly 50 x 5 = 250 records. If you want to cap the overall number of records, also use the LIMIT operator:
fields = LIMIT fields 5;

On Wed, Mar 13, 2013 at 7:48 AM, kiran chitturi <[email protected]> wrote:

> Hi!
>
> I am using Pig 0.10.0 with Hbase in distributed mode to read the records
> and I have used this command below.
>
> fields = load 'hbase://documents' using
>     org.apache.pig.backend.hadoop.hbase.HBaseStorage('field:fields_j',
>     '-loadKey true -limit 5') as (rowkey, fields:map[]);
>
> I want pig to limit the records to only 5 but it is quite different.
> Please see the logs below.
>
> Input(s):
> Successfully read 250 records (16520 bytes) from: "hbase://documents"
>
> Output(s):
> Successfully stored 250 records (19051 bytes) in:
> "hdfs://LucidN1:50001/tmp/temp1510040776/tmp1443083789"
>
> Counters:
> Total records written : 250
> Total bytes written : 19051
> Spillable Memory Manager spill count : 0
> Total bags proactively spilled: 0
> Total records proactively spilled: 0
>
> Job DAG:
> job_201303121846_0056
>
> 2013-03-13 14:43:10,186 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 250 time(s).
> 2013-03-13 14:43:10,186 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2013-03-13 14:43:10,210 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 51
> 2013-03-13 14:43:10,211 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 51
>
> Am I using the 'limit' keyword the wrong way?
>
> Please let me know your suggestions.
>
> Thanks,
> --
> Kiran Chitturi
> <http://www.linkedin.com/in/kiranchitturi>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at [email protected] going forward.*
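Putting both pieces together, a sketch of the full script might look like this — the table name, column spec, and schema are taken from the question above; the intermediate alias `raw` and the final DUMP are my additions for illustration:

```pig
-- Load from HBase. The -limit 5 option caps rows per region scan,
-- so with ~50 regions this alone still yields ~250 records.
raw = LOAD 'hbase://documents'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
          'field:fields_j', '-loadKey true -limit 5')
      AS (rowkey, fields:map[]);

-- LIMIT caps the overall record count across all mappers.
fields = LIMIT raw 5;

DUMP fields;
```

Keeping -limit in the load is still useful here: it stops each region scan early so the job reads at most ~5 rows per region instead of the whole table before LIMIT trims the result to 5.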
