On Fri, May 24, 2019 at 5:32 AM Reuven Lax <re...@google.com> wrote: > > On Thu, May 23, 2019 at 1:53 PM Ahmet Altay <al...@google.com> wrote: >> >> >> >> On Thu, May 23, 2019 at 1:38 PM Lukasz Cwik <lc...@google.com> wrote: >>> >>> >>> >>> On Thu, May 23, 2019 at 11:37 AM Rui Wang <ruw...@google.com> wrote: >>>>> >>>>> A few obvious problems with this code: >>>>> 1. Removing the elements already processed from the bag requires >>>>> clearing and rewriting the entire bag. This is O(n^2) in the number of >>>>> input trades. >>>> >>>> why it's not O(2 * n) to clearing and rewriting trade state? >>>> >>>>> >>>>> public interface SortedMultimapState<K, V> extends State { >>>>> // Add a value to the map. >>>>> void put(K key, V value); >>>>> // Get all values for a given key. >>>>> ReadableState<Iterable<V>> get(K key); >>>>> // Return all entries in the map. >>>>> ReadableState<Iterable<KV<K, V>>> allEntries(); >>>>> // Return all entries in the map with keys <= limit. returned elements >>>>> are sorted by the key. >>>>> ReadableState<Iterable<KV<K, V>>> entriesUntil(K limit); >>>>> >>>>> // Remove all values with the given key; >>>>> void remove(K key); >>>>> // Remove all entries in the map with keys <= limit. >>>>> void removeUntil(K limit); >>>> >>>> Will removeUntilExcl(K limit) also useful? It will remove all entries in >>>> the map with keys < limit. >>>> >>>>> >>>>> Runners will sort based on the encoded value of the key. In order to make >>>>> this easier for users, I propose that we introduce a new tag on Coders >>>>> PreservesOrder. A Coder that contains this tag guarantees that the >>>>> encoded value preserves the same ordering as the base Java type. >>>> >>>> >>>> Could you clarify what is "encoded value preserves the same ordering as >>>> the base Java type"? >>> >>> >>> Lets say A and B represent two different instances of the same Java type >>> like a double, then A < B (using the languages comparison operator) iff >>> encode(A) < encode(B) (note the encoded versions are compared >>> lexicographically) >> >> >> Since coders are shared across SDKs, do we expect A < B iff e(A) < e(P) >> property to hold for all languages we support? What happens A, B sort >> differently in different languages? > > > That would have to be the property of the coder (which means that this > property probably needs to be represented in the portability representation > of the coder). I imagine the common use cases will be for simple coders like > int, long, string, etc., which are likely to sort the same in most languages.
The standard coders for both double and integral types do not respect the natural ordering (consider negative values). KV coders violate the "natural" lexicographic ordering on components as well. I think implicitly sorting on encoded value would yield many surprises. (The state, of course, could take a order-preserving, bytes (string?)-producing callable as a parameter of course). (As for naming, I'd probably call this OrderedBagState or something like that...rather than Map which tends to imply random access.)