Please see the following two constants defined in TableInputFormat:

  /** Column Family to Scan */
  public static final String SCAN_COLUMN_FAMILY = "hbase.mapreduce.scan.column.family";

  /** Space delimited list of columns and column families to scan. */
  public static final String SCAN_COLUMNS = "hbase.mapreduce.scan.columns";

CellCounter accepts these parameters, so you can experiment with CellCounter to see how they behave.

FYI

On Mon, Jul 2, 2018 at 4:01 AM, revolutionisme <[email protected]> wrote:

> Hi,
>
> I am using HBase with Spark, and since I have wide rows (> 10000 columns) I
> wanted to use the "setBatch(num)" option to read the columns of a row in
> batches rather than all at once.
>
> I can create a scan and set the batch size I want with
> TableInputFormat.SCAN_BATCHSIZE, but I am a bit confused about how this
> would work with more than one column family.
>
> Any help is appreciated.
>
> PS: Any documentation or pointers on newAPIHadoopRDD would be really
> appreciated as well.
>
> Thanks & Regards,
> Biplob
>
>
>
> --
> Sent from: http://apache-hbase.679495.n3.nabble.com/HBase-User-
> f4020416.html
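As a follow-up, here is a minimal sketch of how the keys above fit together. A plain Map stands in for the Hadoop Configuration you would hand to newAPIHadoopRDD, and the table layout (families cf1/cf2, qualifiers a, b, x) plus the batch size of 100 are invented for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScanConfigSketch {

    // Key strings taken from TableInputFormat.
    static final String SCAN_COLUMNS   = "hbase.mapreduce.scan.columns";
    static final String SCAN_BATCHSIZE = "hbase.mapreduce.scan.batchsize";

    /** Builds the key/value pairs one would set on the job Configuration. */
    public static Map<String, String> buildScanConf() {
        Map<String, String> conf = new LinkedHashMap<>();
        // SCAN_COLUMNS is a space-delimited mix of family:qualifier entries
        // and bare family names; here it spans two column families.
        conf.put(SCAN_COLUMNS, "cf1:a cf1:b cf2:x");
        // SCAN_BATCHSIZE caps the number of cells returned per Result,
        // as Scan.setBatch(num) would on a hand-built Scan.
        conf.put(SCAN_BATCHSIZE, "100");
        return conf;
    }

    public static void main(String[] args) {
        buildScanConf().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The same key/value pairs can be set directly on the org.apache.hadoop.conf.Configuration passed to newAPIHadoopRDD instead of building a Scan object in code.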
