[ https://issues.apache.org/jira/browse/HBASE-26108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huaxiang Sun resolved HBASE-26108.
----------------------------------
    Fix Version/s: 2.4.5
                   3.0.0-alpha-2
                   2.3.6
       Resolution: Fixed

> add option to disable scanMetrics in TableSnapshotInputFormat
> -------------------------------------------------------------
>
>                 Key: HBASE-26108
>                 URL: https://issues.apache.org/jira/browse/HBASE-26108
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.3.5
>            Reporter: Huaxiang Sun
>            Assignee: Huaxiang Sun
>            Priority: Major
>             Fix For: 2.3.6, 3.0.0-alpha-2, 2.4.5
>
>
> When running a Spark job with TableSnapshotInputFormat, we found that scans
> were very slow. scanMetrics is hardcoded as enabled, and Spark's
> newAPIHadoopRDD uses Hadoop's DummyReporter, which causes the following
> exception to be thrown; 80% of CPU time is spent handling this exception.
> We need to provide an option to disable scanMetrics.
> java.base@11.0.5/java.lang.Throwable.fillInStackTrace(Native Method)
> java.base@11.0.5/java.lang.Throwable.fillInStackTrace(Throwable.java:787) => holding Monitor(java.util.MissingResourceException@258206255})
> java.base@11.0.5/java.lang.Throwable.<init>(Throwable.java:292)
> java.base@11.0.5/java.lang.Exception.<init>(Exception.java:84)
> java.base@11.0.5/java.lang.RuntimeException.<init>(RuntimeException.java:80)
> java.base@11.0.5/java.util.MissingResourceException.<init>(MissingResourceException.java:85)
> java.base@11.0.5/java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:2055)
> java.base@11.0.5/java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1689)
> java.base@11.0.5/java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1593)
> java.base@11.0.5/java.util.ResourceBundle.getBundle(ResourceBundle.java:1284)
> app//org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
> app//org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56) => holding Monitor(java.lang.Class@545605549})
> app//org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterGroupName(ResourceBundles.java:77)
> app//org.apache.hadoop.mapreduce.counters.CounterGroupFactory.newGroup(CounterGroupFactory.java:94)
> app//org.apache.hadoop.mapreduce.counters.AbstractCounters.getGroup(AbstractCounters.java:227)
> app//org.apache.hadoop.mapreduce.counters.AbstractCounters.findCounter(AbstractCounters.java:154)
> app//org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl$DummyReporter.getCounter(TaskAttemptContextImpl.java:110)
> app//org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.getCounter(TaskAttemptContextImpl.java:76)
> org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.updateCounters(TableRecordReaderImpl.java:311)
> org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat$TableSnapshotRegionRecordReader.nextKeyValue(TableSnapshotInputFormat.java:167)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
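
The trace in the issue suggests that counter lookups on the scan path repeatedly hit a failed resource-bundle lookup, so the JVM pays for constructing a MissingResourceException (including fillInStackTrace) each time. The following is a minimal, self-contained Java sketch of that pattern and of the kind of opt-out the issue asks for. It does not use HBase or Hadoop classes: ScanMetricsToggleSketch, counterGroupName, scan, and the scanMetricsEnabled flag are hypothetical names chosen for illustration, and the actual configuration key introduced by the fix is not stated in this message.

```java
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Illustrative sketch only; not HBase or Hadoop code.
class ScanMetricsToggleSketch {
    // Counts how many times the costly lookup path was taken.
    static int lookups = 0;

    // Resolves a display name from a resource bundle that is assumed to be
    // absent. ResourceBundle.getBundle throws MissingResourceException, and
    // building that exception fills in a full stack trace on every call.
    static String counterGroupName(String group) {
        lookups++;
        try {
            return ResourceBundle.getBundle(group).getString("CounterGroupName");
        } catch (MissingResourceException e) {
            // Fall back to the raw group name, as a default value.
            return group;
        }
    }

    // Hypothetical scan loop: the flag guards the per-row metrics update,
    // so disabling it skips the failing lookup entirely.
    static int scan(int rows, boolean scanMetricsEnabled) {
        int processed = 0;
        for (int i = 0; i < rows; i++) {
            if (scanMetricsEnabled) {
                counterGroupName("no.such.CounterBundle");
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        scan(1000, true);
        System.out.println("lookups with metrics enabled: " + lookups);
        lookups = 0;
        scan(1000, false);
        System.out.println("lookups with metrics disabled: " + lookups);
    }
}
```

With the flag set to false the loop never reaches the bundle lookup, which is the effect the requested option is meant to have for callers such as Spark that do not consume the counters.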