I modified the apilyzer config in the 1.8 branch to include iterators. It spit out the following problems with types.
I think we could move IteratorUtil out of org.apache.accumulo.core.iterators. Also we could possibly move the iterators in org.apache.accumulo.core.iterators.system somewhere else (or declare it as an exception). Could possibly deprecate IteratorEnvironment. getConfig() and add a replacement method that returns a Map. Problems : CONTEXT TYPE FIELD/METHOD NON-PUBLIC REFERENCE METHOD_RETURN org.apache.accumulo.core.iterators.IteratorEnvironment getConfig(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil parseIterConf(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.data.impl.KeyExtent METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.data.impl.KeyExtent METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.data.impl.KeyExtent METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.data.impl.KeyExtent METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.data.impl.KeyExtent METHOD_PARAM org.apache.accumulo.core.iterators.IteratorUtil loadIterators(...) org.apache.accumulo.core.conf.AccumuloConfiguration CTOR_PARAM org.apache.accumulo.core.iterators.system.MapFileIterator (...) org.apache.accumulo.core.conf.AccumuloConfiguration METHOD_RETURN org.apache.accumulo.core.iterators.system.MapFileIterator getSample(...) org.apache.accumulo.core.file.FileSKVIterator METHOD_PARAM org.apache.accumulo.core.iterators.system.MapFileIterator getSample(...) org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl CTOR_PARAM org.apache.accumulo.core.iterators.system.MultiIterator (...) org.apache.accumulo.core.data.impl.KeyExtent CTOR_PARAM org.apache.accumulo.core.iterators.system.SampleIterator (...) org.apache.accumulo.core.sample.Sampler CTOR_PARAM org.apache.accumulo.core.iterators.system.SequenceFileIterator (...) org.apache.hadoop.io.SequenceFile$Reader METHOD_RETURN org.apache.accumulo.core.iterators.system.SequenceFileIterator getSample(...) org.apache.accumulo.core.file.FileSKVIterator METHOD_PARAM org.apache.accumulo.core.iterators.system.SequenceFileIterator getSample(...) org.apache.accumulo.core.sample.impl.SamplerConfigurationImpl FIELD org.apache.accumulo.core.iterators.system.VisibilityFilter cache org.apache.commons.collections.map.LRUMap FIELD org.apache.accumulo.core.iterators.user.TransformingIterator log org.slf4j.Logger Total : 22 On Mon, May 9, 2016 at 4:54 PM, Christopher <[email protected]> wrote: > Hey Keith. Just wanted to ping you on this to see if you've had a chance to > look. > > On Fri, Mar 25, 2016 at 12:17 PM Keith Turner <[email protected]> wrote: > > > On Thu, Mar 24, 2016 at 8:27 PM, Christopher <[email protected]> > wrote: > > > > > It seems there's a general agreement that we treat it like public API > and > > > we should just call it public API. > > > > > > I just don't know how we're going to actually say this is public API, > > > without addressing the issue of the other iterators in the same > package, > > > unless we're okay with going back to a more complicated API definition > > > which calls out specific classes instead of whole packages. > > > > > > > Keith, you spent the most time thinking about how to convey the public > > API > > > in the README. What do you think about how to actually make this > happen? > > > > > > I will look into it at the end of next week and try to come up with a > > strategy for incorporating it into the public API. Need to analyze what > > types are used by the existing iterators. In 1.7.0 I tried to make API > > types only reference other API types. Things that violated this type > > constraint were deprecated and a replacement that met the type constraint > > was introduced. Maybe we can follow this path with iterators in 1.8.0, > not > > sure. > > > > > > > > > > Also, what would we want to say about the 1.6 and 1.7 branches API? > Maybe > > > if we can guarantee that there aren't any changes in those > > > classes/packages, we can just add them to the README as a new > declaration > > > of API (but not actually an addition in the semver sense), but if there > > are > > > any changes, that's going to complicate things, because semver > guarantees > > > should mean no public API changes between bugfix releases. I guess I > > should > > > actually check to see if there have been any changes... > > > > > > On Thu, Mar 24, 2016 at 7:15 PM Keith Turner <[email protected]> wrote: > > > > > > > On Thu, Mar 24, 2016 at 5:54 PM, Billie Rinaldi < > > > [email protected]> > > > > wrote: > > > > > > > > > On Thu, Mar 24, 2016 at 1:15 PM, Christopher <[email protected]> > > > > wrote: > > > > > > > > > > > We do have the opportunity to move to a new improved API, if > > somebody > > > > > were > > > > > > to put time into it. I guess that's true whether we put this in > the > > > > > public > > > > > > API officially or not. > > > > > > > > > > > > > > > Agreed. Even if we do create a new API, we can't change or drop > the > > > > > existing API without breaking a lot of people's code. I feel like > > SKVI > > > > is > > > > > in a category of things that we treat as though they're in the > public > > > > API, > > > > > so we might as well say it is. > > > > > > > > > > > > > +1 to that > > > > > > > > Ensuring what exists is stable is a separate issue from creating a > new > > > > iterator API. > > > > > > > > > > > > > > > > > > > > > > > > I think maybe the hardest part is that we don't > > > > > > really want to put just the interface in the API... but it exists > > in > > > a > > > > > > package with a bunch of other classes which probably shouldn't be > > > > public > > > > > > API. So, some thought needs to be put into *how* we're going to > do > > > it, > > > > > too. > > > > > > > > > > > > On Thu, Mar 24, 2016 at 3:27 PM William Slacum < > [email protected]> > > > > > wrote: > > > > > > > > > > > > > It should be public API. It's one of the core reasons for > > choosing > > > > > > Accumulo > > > > > > > over a similar project like HBase or Cassandra. Allegedly, Jeff > > > "Mean > > > > > > Gene" > > > > > > > Dean said we got the concept correct as well :) > > > > > > > > > > > > > > Personally I hate the current API from a usability standpoint > > (ie, > > > > the > > > > > > > generic types are useless and already encoded in the name, it > > > > > needlessly > > > > > > > diverges from the standard java Iterator calling standards), > but > > > > it's a > > > > > > > strong, identifying feature we have. > > > > > > > > > > > > > > On Thu, Mar 24, 2016 at 2:50 PM, Christopher < > > [email protected]> > > > > > > wrote: > > > > > > > > > > > > > > > Accumulators, > > > > > > > > > > > > > > > > What are the pros and cons that you can see for moving the > > > > > > > > SortedKeyValueIterator into the public API? > > > > > > > > > > > > > > > > Right now, I think there's still some need for improvement in > > the > > > > > > > Iterator > > > > > > > > API, and many of the iterators may not be stable enough to > > really > > > > > > > recommend > > > > > > > > people use without some serious caveats (because we may not > be > > > able > > > > > to > > > > > > > keep > > > > > > > > their API stable very easily). So, there's a con. > > > > > > > > > > > > > > > > In the pros side, iterators are a core feature of Accumulo, > and > > > > > nearly > > > > > > > all > > > > > > > > of Accumulo's distributed processing capabilities are > dependent > > > > upon > > > > > > > them. > > > > > > > > It is reasonable to expect users to take advantage of them, > and > > > > we've > > > > > > at > > > > > > > > least tried to be cautious about changing the iterators in > > > > > incompatible > > > > > > > > ways, even if they aren't in the public API. > > > > > > > > > > > > > > > > Recently, this came up when we stripped out all the > non-public > > > API > > > > > > > javadocs > > > > > > > > from the website. (reported by Dan Blum on the user list on > > March > > > > > 4th: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/accumulo-user/201603.mbox/%3C066a01d17658%24bc9dc1b0%2435d94510%24%40bbn.com%3E > > > > > > > > ) > > > > > > > > > > > > > > > > What would it take for us to feel comfortable moving them to > > the > > > > > public > > > > > > > > API? Do we need a better interface first, or should we > isolate > > > the > > > > > > other > > > > > > > > iterators into another package (some of that has already been > > > > done), > > > > > or > > > > > > > > should we wait for a proper public API package (2.0?) to > > provide > > > > this > > > > > > > > interface in? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
