[ https://issues.apache.org/jira/browse/ACCUMULO-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709810#comment-14709810 ]
ASF GitHub Bot commented on ACCUMULO-3959:
------------------------------------------

Github user dhutchis commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/45#discussion_r37785744

    --- Diff: core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java ---
    @@ -16,19 +16,20 @@
      */
     package org.apache.accumulo.core.client;
    +import org.apache.accumulo.core.data.Range;
    +
     import java.util.Collection;
     import java.util.concurrent.TimeUnit;
    -import org.apache.accumulo.core.data.Range;
    -
     /**
      * Implementations of BatchScanner support efficient lookups of many ranges in accumulo.
    + * BatchScanners are also appropriate for large, single ranges,
    + * as a BatchScanner will break those ranges up into separate RPCs
    + * provided the range spans more than one tablet
    + * and there are sufficiently many scan threads available.
      *
    - * Use this when looking up lots of ranges and you expect each range to contain a small amount of data. Also only use this when you do not care about the
    - * returned data being in sorted order.
    - *
    - * If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead. Also, the Scanner will return data in
    - * sorted order, this will not.
    + * Only use this when you do not care about returned data being in sorted order.
    --- End diff --

    Correct, I see that the <p> tag is necessary from the online javadoc at
    http://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/BatchScanner.html
    Will fix tonight when I return to my laptop.

    I don't think my editor (IntelliJ with the Eclipse code formatter plugin)
    adds the HTML tags automatically.

    On Mon, Aug 24, 2015 at 2:25 PM, Keith Turner <notificati...@github.com>
    wrote:

    > In core/src/main/java/org/apache/accumulo/core/client/BatchScanner.java
    > <https://github.com/apache/accumulo/pull/45#discussion_r37784571>:
    >
    > >   *
    > > - * Use this when looking up lots of ranges and you expect each range to contain a small amount of data. Also only use this when you do not care about the
    > > - * returned data being in sorted order.
    > > - *
    > > - * If you want to lookup a few ranges and expect those ranges to contain a lot of data, then use the Scanner instead. Also, the Scanner will return data in
    > > - * sorted order, this will not.
    > > + * Only use this when you do not care about returned data being in sorted order.
    >
    > This was already broken before your patch, but I think javadoc need <p>
    > markup for paragraphs. Not sure it will render as intended w/o it.
    >
    > Did you format these changes?
    >
    > ---
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/accumulo/pull/45/files#r37784571>.
    >


> Confusing wording on BatchScanner javadoc
> -----------------------------------------
>
>                 Key: ACCUMULO-3959
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3959
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: docs
>    Affects Versions: 1.6.3, 1.7.0
>            Reporter: Dylan Hutchison
>            Assignee: Dylan Hutchison
>            Priority: Minor
>              Labels: docuentation
>             Fix For: 1.6.4, 1.7.1
>
>
> The following sentence in the [BatchScanner
> Javadoc|https://accumulo.apache.org/1.7/apidocs/org/apache/accumulo/core/client/BatchScanner.html]
> has confused my colleagues into using Scanners and wondering why performance
> doesn't scale.
> bq. If you want to lookup a few ranges and expect those ranges to contain a
> lot of data, then use the Scanner instead.
> Also regarding this next sentence, from what I see of the BatchScanner it
> will break up "large Range objects" that span multiple extents (tablets) into
> multiple ranges, possibly one for each tablet.
> bq. Use this when looking up lots of ranges and you expect each range to
> contain a small amount of data.
> If the client is okay with unsorted order and it is okay with using multiple
> threads, then isn't it always a better decision to use a BatchScanner than
> regular Scanner?
> In the worst case, one Range over a single row, the
> BatchScanner will perform the same as a regular Scanner, ya?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)