[ https://issues.apache.org/jira/browse/HBASE-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906030#action_12906030 ]
ryan rawson commented on HBASE-2959:
------------------------------------

well my mobile gmail client clearly triggers something bad with jira...

So the old get code also began at the start of a row, this is to capture the delete family case. If we got rid of delete family perhaps we could skip more...

> Scanning always starts at the beginning of a row
> ------------------------------------------------
>
>                 Key: HBASE-2959
>                 URL: https://issues.apache.org/jira/browse/HBASE-2959
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.20.4, 0.20.5, 0.20.6, 0.89.20100621
>            Reporter: Benoit Sigoure
>            Priority: Blocker
>
> In HBASE-2248, the code in {{HRegion#get}} was changed like so:
> {code}
> -  private void get(final Store store, final Get get,
> -    final NavigableSet<byte []> qualifiers, List<KeyValue> result)
> -  throws IOException {
> -    store.get(get, qualifiers, result);
> +  /*
> +   * Do a get based on the get parameter.
> +   */
> +  private List<KeyValue> get(final Get get) throws IOException {
> +    Scan scan = new Scan(get);
> +
> +    List<KeyValue> results = new ArrayList<KeyValue>();
> +
> +    InternalScanner scanner = null;
> +    try {
> +      scanner = getScanner(scan);
> +      scanner.next(results);
> +    } finally {
> +      if (scanner != null)
> +        scanner.close();
> +    }
> +    return results;
>    }
> {code}
> So instead of doing a {{get}} straight on the {{Store}}, we now open a
> scanner. The problem is that we eventually end up in {{ScanQueryMatcher}}
> where the constructor does: {{this.startKey =
> KeyValue.createFirstOnRow(scan.getStartRow());}}. This entails that if we
> have a very wide row (thousands of columns), the scanner will need to go
> through thousands of {{KeyValue}}'s before finding the right entry, because
> it always starts from the beginning of the row, whereas before it was much
> more straightforward.
> This problem was under the radar for a while because the overhead isn't too
> unreasonable, but later on, {{incrementColumnValue}} was changed to do a
> {{get}} under the hood. At StumbleUpon we do thousands of ICVs per second, so
> thousands of times per second we're scanning some really wide rows. When a
> row is contended, this results in all the IPC threads being stuck on
> acquiring a row lock, while one thread is doing the ICV (albeit slowly due to
> the excessive scanning). When all IPC threads are stuck, the region server
> is unable to serve more requests.
> As a nice side effect, fixing this bug will make {{get}} and
> {{incrementColumnValue}} faster, as well as the first call to {{next}} on a
> scanner.
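
To make the cost described in the issue concrete, here is a rough sketch of the difference between walking a wide row from its first column and seeking straight to the wanted column. It is a toy model only, not HBase code: the row is a plain `TreeMap`, and the class `WideRowSeekSketch`, its methods, and the `colNNNNN` qualifier names are all made up for illustration. It also ignores delete family markers, which the comment above gives as the reason the real scanner starts at the beginning of the row.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of one wide row: sorted qualifier -> value. Hypothetical, not HBase code. */
class WideRowSeekSketch {

    /** Builds a row with n columns named col00000, col00001, ... (made-up names). */
    static NavigableMap<String, Long> buildRow(int n) {
        NavigableMap<String, Long> row = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            row.put(String.format("col%05d", i), (long) i);
        }
        return row;
    }

    /** Starts at the first entry of the row and counts entries visited until the target. */
    static int visitedFromRowStart(NavigableMap<String, Long> row, String target) {
        int visited = 0;
        for (String qualifier : row.keySet()) {
            visited++;
            if (qualifier.equals(target)) {
                break;
            }
        }
        return visited;
    }

    /** Seeks to the target qualifier first, then counts entries visited from there. */
    static int visitedWithSeek(NavigableMap<String, Long> row, String target) {
        int visited = 0;
        // tailMap(target, true) is a view starting at the first key >= target.
        for (String qualifier : row.tailMap(target, true).keySet()) {
            visited++;
            if (qualifier.equals(target)) {
                break;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> row = buildRow(5000);
        String target = "col04999"; // last column of the row, the worst case
        System.out.println("from row start: " + visitedFromRowStart(row, target));
        System.out.println("with seek:      " + visitedWithSeek(row, target));
    }
}
```

In this model, fetching the last of 5000 columns visits 5000 entries when starting from the row start and a single entry after a seek. The real read path is more involved, so the numbers only illustrate the shape of the problem, not actual region server behaviour.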