In reply to Jon Lang,

What I'm proposing here in the general case, is a generic collection type, "Interval" say, that can represent a discontinuous interval of an ordered type. A simple way of defining such a type is that it is a "Set of Pair of Ordered", where each Pair defines a continuous interval in terms of 2 ordered end-points, and the whole set is discontinuous (or replace Pair with 2-element Seq).

So a "key" of an Interval is simply a 2-element sequence of values of the same ordered type. Each continuous sub-interval has no concept of a direction, and the first element of the pair must order before or be equal to the second. Now, with a basic non-normalized representation, it is possible that several set elements/keys may have overlapping ranges, eg 6=>10,9=>11, but if the Interval is normalized then you could insert said 2 elements but afterwards there would just be a single 6=>11 element/key in concept.

The value identity of an Interval is such that two interval are identical only if they consist of the same pair elements when normalized. It is likely in practice that more than half of all Interval values used consist of just a single continuous element.

And so, with this basic structure it is easy to do union, intersection, difference, exclusion, is-overlap, is-subset, is-disjoint, etc with two Interval to derive another. Testing membership in an Interval of a plain value is just seeing if it matches a point within a continuous sub-interval of the Interval.

So, and Interval is set-like, but it is not enumerable (except for being able to enumerate the set of continuous sub-intervals), and it also has a sense of being ordered since its sub-interval elements (especially when normalized) have a concept of being relatively ordered; or rather a set of continuous sub-intervals of the same type are mutually ordered.

Whether something has to be enumerable to .does(Set), I don't know. Maybe we want some role that is wider than Set which says unordered and no duplicates but not enumerable, and Set would compose that and add enumerable.

On a tangent, I propose for Perl's immutable collection types in general (eg, Set, Bag, Seq, Mapping) that they still support analogies of the mutating operators like insert|delete|etc, but these analogies just return a new collection that is the same as the argument collection but for having another or a fewer element. This can also be efficient by making said collection types lazy, so say the new collection value just points to the previous one saying "I'm that but for this change", and only actually eat the other value when it is useful ... like copy on write. A rationale for supporting this rather than just having people use the mutable versions is that often, with a bit of cost for laziness, your code for implementing these types and operations on them can be a lot simpler due to their immutability, and save a lot of copying work by just referencing internals of each other, like copy on write. For that matter, initial construction of set-like types could be lazy, just keeping their argument list initially and hashing out keys from them later when we actually need to use them; this could help performance if hashing or determining element identity is expensive and potentially never needs doing. I plan to demonstrate this as part of my next major Set::Relation module reimplementation, in a few weeks, though I don't expect a demonstration is necessarily required for people to understand it.

-- Darren Duncan

Jon Lang wrote:
I'm not sure that it [i]does[/i] make sense for Range to do Set: in
S02, Range is among the immutable types that are said to do
Positional, while Set is among the ones that are said to do
Associative; and I'm leery about mixing Positional and Associative
without good reason.

More generally, though, both Positional and Associative carry an
unstated assumption that the object in question will be using a
discrete type such as Int or Str as the index or key.  In the
abstract, what we're talking about here is how one would handle an
object that uses a continuous type such as Num as the index or key.
As mentioned before, "loop through the members" (or, in the case of
Junctions, "autothread the members") simply isn't feasible with a
continuous key.

One question is whether Intervals should be Positional (i.e.,
list-like) or Associative (i.e., Set-like).  The former has the
advantage that Ranges are Positional, meaning that Intervals would
conform closely to Ranges, which is the intuitive result of having
them share a constructor.  The latter is what we both jumped to when
we first considered the idea of Intervals in their own right: we want
the Set operations, which Positional objects don't have; and indices
aren't particularly useful for Intervals.

As well, your concept about "disjoint intervals" maps quite nicely to
a Set of Nums (or other continuous types such as Rat or Instant); with
that concept, I'm not sure that we would actually _need_ an Interval
type: we could just have the .. constructor return a Set of Num (or
Rat, or Instant, etc.) when the step is zero, and a Range object
otherwise.  This, of course, assumes that Set is up to the task of
handling the continuous nature of Num, and doesn't merely collect a
bunch of discrete points on the Num line.

So start by addressing the issue of how to handle continuous indices
and/or keys.  Are lists conceptually compatible with the notion of
continuous indices, or is the whole idea too much of a divergence from
the underlying concept?  A basic assumption about lists is that if you
provide custom indices for a list, they can be mapped one-to-one to
non-negative integers; and continuous values, by definition, cannot be
so mapped.  So I'd say that whether or not continuous indices in
general are viable, they are _not_ viable for List.

Keys, OTOH, don't have any such requirement; so continuous keys may
very well be doable.  If they _are_ doable, you have to ask questions
such as "how do I assign values to a continuous interval of keys?"  To
truly be robust, we ought also answer this question in terms of
multidimensional keys, which can be a nightmare.

Reply via email to