On 24/03/2015 16:22, William Burns wrote: > On Tue, Mar 24, 2015 at 11:06 AM, Radim Vansa <rva...@redhat.com> wrote: >> On 03/24/2015 03:38 PM, Pedro Ruivo wrote: >>> Hi, >>> >>> comments inline... >>> >>> On 03/24/2015 01:57 PM, Radim Vansa wrote: >>>> On 03/24/2015 02:22 PM, William Burns wrote: >>>>> I am nearing completion of the new multi get command [1], allowing for >>>>> more efficient retrieval of multiple keys at the same time. Thanks >>>>> again to Radim for a lot of the initial work. >>>>> >>>>> In doing so though I want to make sure I get feedback on how we want >>>>> the API to actually look, since this is much more difficult to change >>>>> down the line. >>>>> >>>>> There are a few things in particular I wanted to discuss. >>>>> >>>>> 1. The actual name of the method. The main suggestions I have seen >>>>> are getAll or getMany. I do like the naming of the former, however it >>>>> seems a bit misleading (remember API is getAll(Set) since we are >>>>> really getting a subset. So I am thinking the possibilities at this >>>>> point are getAllOf, getAllFor or getMany. I am leaning maybe towards >>>>> the last one (getMany). I am open to any suggestions though. >>>> Actually Tristan suggested the name multiGet() - I would prefer this one >>>> in the end, and adding multiPut() that would just do putAll (to have the >>>> API symmetric) and deprecate the putAll() method. multiRemove and others >>>> can follow later, and the naming is straightforward. >>>> I would object against getAll() since this sounds like retrieving all >>>> entries from the cache, not just the specified keys. >>>> >>> Why a Set as parameter? >>> >>> If we have a Collection as parameter, it would allow a user to specify >>> easily the keys he want, like multiGet(Arrays.asList(k1, k2, k3)). >> It's Set as the collection should not contain duplicities. However, it's >> a bit questionable whether the collection should be copied for (async) >> internal processing or whether Infinispan takes ownership of this map >> (first option is more safe, second is faster). > I would say it shouldn't be copied, but I could put a comment on the > Javadoc if the Set is modified concurrently while the operation is > running the results returned will be nondeterministic. The JCache contract for this is:
Map<K, V> getAll(Set<? extends K> keys); Nothing is said about whether "keys" should be "safe", and I think that is absolutely fine for us as well (i.e. no copying, if you change the set, you get what you deserve). > >> Radim >> >>>>> 2. What should the return value be for this method. Currently this >>>>> returns a Map<K, V> which makes sense when we retain these values in >>>>> the local transactional context and is pretty nice and clean for end >>>>> users. >>>>> >>>>> The other idea is to use a streaming API possibly returning an >>>>> Iterable<CacheEntry<K, V>>. The streaming return value doesn't seem >>>>> as intuitive to me, but opens us up for more efficient usage >>>>> especially when there may be a lot of keys passed in (also results can >>>>> be processed concurrently while retrieval is still occurring). >>>>> >>>>> I would lean towards returning the Map<K, V> value, however the next >>>>> point could change that. >>>> I think that Iterable<CacheEntry<K, V>> is too confusing for the end >>>> user, I would stick to the Map. If you want lazy loading, the Map (and >>>> it's EntrySet) could be made lazy by a flag. > The point wasn't that it was lazy, but rather that we could limit how > many values are maintained on the caller side to prevent excess memory > usage. > >>> my vote is to use the Map<K,V> interface. The Iterable will force the >>> user to fetch the data using the iterator order, while the Map allow him >>> to use any order he wants (i.e. fetch particular keys via map.get(()). >>> >>> BTW, it can be good to wrap the Map in a unmodifiableMap(). > I wonder if this really matters. Each invocation would return its own > map, so if they modified it, it shouldn't matter. I would make sure > to document that the Map is not backing map in that updates to the > cache or this map are not reflected in the other. Agree on all points. An iterable variant would probably more closely resemble the Java 8 streams, so I wouldn't overcomplicated this specific API. >>>>> 3. Where this method should live. Right now this method is on the >>>>> BasicCache interface which is a parent of both Cache (embedded) and >>>>> RemoteCache (remote). Unfortunately for remote clients to be as >>>>> efficient as possible we really need a Streaming API so that we don't >>>>> have multiple copies of data in memory at multiple places at the same >>>>> time. For that reason I suggest we use the streaminig API for both or >>>>> have a different API for remote vs embedded. >>>>> >>> IMO, we should have different API for embedded and remote. > +1 That is what I was thinking as well that we make the embedded API > nicer to use. We can try to figure out the specifics of the remote > API later then. > >>> BasicCache does not seems a good place to put it. It can be placed in >>> AdvancedCache. > +1 I was thinking this may be the safest for now and we can always > move it to a higher class later if needed, however we can't move it > down easily. +1 for AdvancedCache for now, for Infinispan 8 Galder and I will discuss interfaces next week. Tristan -- Tristan Tarrant Infinispan Lead JBoss, a division of Red Hat _______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev