[ https://issues.apache.org/jira/browse/HBASE-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Meil updated HBASE-8571:
-----------------------------
    Description:
Maybe it's just me, but I've been looking on trunk and I don't see where either RowCounter or CopyTable MapReduce can adjust the setCaching setting on the Scan instance.

Example from RowCounter...

{code}
Job job = new Job(conf, NAME + "_" + tableName);
job.setJarByClass(RowCounter.class);
Scan scan = new Scan();
scan.setCacheBlocks(false);
Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
if (startKey != null && !startKey.equals("")) {
  scan.setStartRow(Bytes.toBytes(startKey));
}
if (endKey != null && !endKey.equals("")) {
  scan.setStopRow(Bytes.toBytes(endKey));
}
scan.setFilter(new FirstKeyOnlyFilter());
if (sb.length() > 0) {
  for (String columnName : sb.toString().trim().split(" ")) {
    String [] fields = columnName.split(":");
    if(fields.length == 1) {
      scan.addFamily(Bytes.toBytes(fields[0]));
    } else {
      byte[] qualifier = Bytes.toBytes(fields[1]);
      qualifiers.add(qualifier);
      scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
    }
  }
}
// specified column may or may not be part of first key value for the row.
// Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
// FirstKeyValueMatchingQualifiersFilter.
if (qualifiers.size() == 0) {
  scan.setFilter(new FirstKeyOnlyFilter());
} else {
  scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
}
job.setOutputFormatClass(NullOutputFormat.class);
TableMapReduceUtil.initTableMapperJob(tableName, scan,
  RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job);
job.setNumReduceTasks(0);
return job;
{code}

TableMapReduceUtil only serializes the Scan into the job, it doesn't adjust any of the settings.

Maybe I'm missing something, but this seems like a problem.

  was:
Maybe it's just me, but I've been looking on trunk and I don't see where either RowCounter or CopyTable MapReduce can adjust the setCaching setting.

Example from RowCounter...
{code}
Job job = new Job(conf, NAME + "_" + tableName);
job.setJarByClass(RowCounter.class);
Scan scan = new Scan();
scan.setCacheBlocks(false);
Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
if (startKey != null && !startKey.equals("")) {
  scan.setStartRow(Bytes.toBytes(startKey));
}
if (endKey != null && !endKey.equals("")) {
  scan.setStopRow(Bytes.toBytes(endKey));
}
scan.setFilter(new FirstKeyOnlyFilter());
if (sb.length() > 0) {
  for (String columnName : sb.toString().trim().split(" ")) {
    String [] fields = columnName.split(":");
    if(fields.length == 1) {
      scan.addFamily(Bytes.toBytes(fields[0]));
    } else {
      byte[] qualifier = Bytes.toBytes(fields[1]);
      qualifiers.add(qualifier);
      scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
    }
  }
}
// specified column may or may not be part of first key value for the row.
// Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
// FirstKeyValueMatchingQualifiersFilter.
if (qualifiers.size() == 0) {
  scan.setFilter(new FirstKeyOnlyFilter());
} else {
  scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
}
job.setOutputFormatClass(NullOutputFormat.class);
TableMapReduceUtil.initTableMapperJob(tableName, scan,
  RowCounterMapper.class, ImmutableBytesWritable.class, Result.class, job);
job.setNumReduceTasks(0);
return job;
{code}

TableMapReduceUtil only serializes the Scan into the job, it doesn't adjust any of the settings.

Maybe I'm missing something, but this seems like a problem.


> CopyTable and RowCounter don't seem to use setCaching setting
> -------------------------------------------------------------
>
>                 Key: HBASE-8571
>                 URL: https://issues.apache.org/jira/browse/HBASE-8571
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Doug Meil
>
> Maybe it's just me, but I've been looking on trunk and I don't see where
> either RowCounter or CopyTable MapReduce can adjust the setCaching setting on
> the Scan instance.
> Example from RowCounter...
> {code}
> Job job = new Job(conf, NAME + "_" + tableName);
> job.setJarByClass(RowCounter.class);
> Scan scan = new Scan();
> scan.setCacheBlocks(false);
> Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
> if (startKey != null && !startKey.equals("")) {
>   scan.setStartRow(Bytes.toBytes(startKey));
> }
> if (endKey != null && !endKey.equals("")) {
>   scan.setStopRow(Bytes.toBytes(endKey));
> }
> scan.setFilter(new FirstKeyOnlyFilter());
> if (sb.length() > 0) {
>   for (String columnName : sb.toString().trim().split(" ")) {
>     String [] fields = columnName.split(":");
>     if(fields.length == 1) {
>       scan.addFamily(Bytes.toBytes(fields[0]));
>     } else {
>       byte[] qualifier = Bytes.toBytes(fields[1]);
>       qualifiers.add(qualifier);
>       scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
>     }
>   }
> }
> // specified column may or may not be part of first key value for the row.
> // Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
> // FirstKeyValueMatchingQualifiersFilter.
> if (qualifiers.size() == 0) {
>   scan.setFilter(new FirstKeyOnlyFilter());
> } else {
>   scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
> }
> job.setOutputFormatClass(NullOutputFormat.class);
> TableMapReduceUtil.initTableMapperJob(tableName, scan,
>   RowCounterMapper.class, ImmutableBytesWritable.class, Result.class,
>   job);
> job.setNumReduceTasks(0);
> return job;
> {code}
> TableMapReduceUtil only serializes the Scan into the job, it doesn't adjust
> any of the settings.
> Maybe I'm missing something, but this seems like a problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira