On Thu, Oct 26, 2017 at 1:14 PM, Marten van Kerkwijk
<m.h.vankerkw...@gmail.com> wrote:
> Hi Nathaniel,
>
> Thanks for the link. The plans sound great! You'll not be surprised
> to hear I'm particularly interested in the units aspect (and, no, I
> don't mind at all if we can stop subclassing ndarray...). Is the idea
> that there will be a general way to allow a dtype to define how to
> convert an array to one with another dtype? (Just as one now
> implicitly is able to convert between, say, int and float.) And, if
> so, is the idea that one of those conversion possibilities might
> involve checking units? Or were you thinking of implementing units
> more directly? The former would seem most sensible, if only so you
> can initially focus on other things than deciding how to support,
> say, esu vs emu units, or whether or not to treat radians as equal
> to dimensionless (which they formally are, but it is not always
> handy to do so).
Well, to some extent the answers here are going to be "you tell me"
:-). I'm not an expert in unit handling, and these plans are pretty
high-level right now -- there will be lots more discussions to work
out details once we've hired people and they're ramping up, and as we
work out the larger context around how to improve the dtype system.

But, generally, yeah, one of the things that a custom dtype will need
to be able to do is hook into the casting and ufunc dispatch systems.
That means that when you define a dtype, you get to answer questions
like "can you cast yourself into float32 without loss of precision?",
or "can you cast yourself into int64, truncating values if you have
to?". (Or even, "can you cast yourself to <this other unit type>?",
which would presumably trigger unit conversion.) And you'd also get
to override how things like np.add and np.multiply work for your
dtype -- it's already the case that ufuncs have multiple
implementations for different dtypes and there's machinery to pick
the best one; this would just be extending that to these new dtypes
as well.

One possible approach that I think might be particularly nice would
be to implement units as a "wrapper dtype". The idea would be that if
we have a standard interface that dtypes implement, then not only can
you implement those methods yourself to make a new dtype, but you can
also call those methods on an existing dtype. So you could do
something like:

class WithUnits(np.dtype):
    def __init__(self, inner_dtype, unit):
        self.inner_dtype = np.dtype(inner_dtype)
        self.unit = unit

    # Simple operations like bulk data copying are delegated to the
    # inner dtype. (Invoked by arr.copy(), when making temporary
    # buffers for calculations, etc.)
    def copy_data(self, source, dest):
        return self.inner_dtype.copy_data(source, dest)

    # Other operations, like casting, can do some unit-specific work
    # and then delegate.
    def cast_to(self, other_dtype, source, dest):
        if isinstance(other_dtype, WithUnits):
            if other_dtype.unit == self.unit:
                # Something like casting WithUnits(float64, meters)
                # -> WithUnits(float32, meters): no unit trickiness
                # needed, so delegate to the inner dtype to handle
                # the storage conversion (e.g. float64 -> float32).
                self.inner_dtype.cast_to(other_dtype.inner_dtype,
                                         source, dest)
        # ... other cases to handle unit conversion, etc. ...

And then as a user you'd use it like

    np.array([1, 2, 3], dtype=WithUnits(float, meters))

or whatever. (Or some convenience function that ultimately does
this.)

This is obviously a hand-wavey sketch, and I'm sure the actual
details will look very different. But hopefully it gives some sense
of the kind of possibilities here?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
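By the way, the casting questions described above aren't hypothetical
-- NumPy's existing np.can_cast already answers them for the built-in
dtypes, and that's the machinery a new dtype would plug into. A quick
illustration using only current, real NumPy API (no WithUnits
involved):

```python
import numpy as np

# "Can you cast yourself into float32 without loss of precision?"
# casting="safe" permits only casts that preserve all values.
assert np.can_cast(np.float16, np.float32, casting="safe")
assert not np.can_cast(np.float64, np.float32, casting="safe")

# "Can you cast yourself into int64, truncating values if you have to?"
# casting="unsafe" allows value-changing casts like float -> int;
# stricter rules like "same_kind" refuse them.
assert np.can_cast(np.float64, np.int64, casting="unsafe")
assert not np.can_cast(np.float64, np.int64, casting="same_kind")
```

A units-aware dtype would answer these same questions, just with
"same unit?" folded into the decision.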