Re: [Python-Dev] basenumber redux
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Martin v. Löwis
> Sent: Wednesday, January 18, 2006 3:36 PM
> To: Jason Orendorff
> Cc: python-dev@python.org
> Subject: Re: [Python-Dev] basenumber redux
>
> Jason Orendorff wrote:
> > Really this is just further proof that type-checking is a royal pain
> > in Python. Or rather, it's not hard to cover the builtin and stdlib
> > types, but it's hard to support "duck typing" too. Are we going about
> > this the right way?
>
> It's not as bad. There is nothing wrong with restricting the set of
> acceptable types if callers would have no problem converting their
> input into one of the acceptable types.

Somehow my earlier post on this thread didn't seem to take.

There are problems for callers converting their inputs:

* Currently existing number-like objects would need to be retrofitted
  with a new base class and possibly change their behavior from
  old-style to new-style.

* Some useful classes (such as a symbolic type) already inherit from
  str. That prevents them from also being able to inherit from
  basenumber.

I'm -1 on the proposal because the benefits are dubious (at best it
simplifies just a handful of code fragments); it adds yet another API
to learn and remember; it is darned inconvenient for existing code
seeking to emulate number-like behavior; and it precludes number
emulation for classes that already have a built-in base class.

For the most part, functions that enforce type checking are up to no
good and make life harder for their callers. If Python ultimately grows
interfaces, I hope they remain optional; as soon as functions start
insisting on interface checking, then it will spread like cancer. The
basenumber proposal is essentially a step down that slippery slope.
Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] basenumber redux
Jason Orendorff wrote:
> Really this is just further proof that type-checking is a royal pain
> in Python. Or rather, it's not hard to cover the builtin and stdlib
> types, but it's hard to support "duck typing" too. Are we going about
> this the right way?

It's not as bad. There is nothing wrong with restricting the set of
acceptable types if callers would have no problem converting their
input into one of the acceptable types.

In the imaplib example, requesting that a broken-down time be passed
as a tuple or a time.struct_time is not too hard for a caller. It will
be formatted as

    dt = time.strftime("%d-%b-%Y %H:%M:%S", tt)

so it needs to have the right number of fields. Callers having other
kinds of sequence can easily use tuple(L) to convert their data into
what the function accepts.

Regards,
Martin
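As a rough illustration of the tuple(L) conversion mentioned above, here is a minimal sketch in present-day Python 3 syntax (an assumption on the editor's part; the thread itself predates Python 3). The field values are made up for the example, and whether a plain list is rejected is probed at run time rather than asserted.

```python
import time

FORMAT = "%d-%b-%Y %H:%M:%S"

# A broken-down time held in a list rather than a tuple:
# (year, month, day, hour, minute, second, weekday, yearday, dst flag).
# These values are illustrative only.
fields = [2006, 1, 18, 15, 36, 0, 2, 18, -1]

# Probe whether the list is accepted as-is by this interpreter.
try:
    time.strftime(FORMAT, fields)
    list_accepted = True
except TypeError:
    list_accepted = False

# The conversion suggested above is a single call on the caller's side.
formatted = time.strftime(FORMAT, tuple(fields))
print(list_accepted, formatted)
```

The point of the sketch is only that the caller-side cost of meeting the "tuple or struct_time" requirement is one tuple() call.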
Re: [Python-Dev] basenumber redux
On 1/17/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Alex Martelli wrote:
> > But this doesn't apply to the Python Standard Library, for example see
> > line 1348 of imaplib.py: "if isinstance(date_time, (int, float)):".
> [...]
> > Being able to change imaplib to use basenumber instead of (int, float)
> > won't make it SIMPLER, but it will surely make it BETTER -- why should
> > a long be rejected, or a Decimal, for that matter?
>
> Right. I think this function should read
>
>     if isinstance(date_time, str) and \
>        (date_time[0],date_time[-1]) == ('"','"'):
>         return date_time        # Assume in correct format
>
>     if isinstance(date_time, (tuple, time.struct_time)):
>         tt = date_time
>     else:
>         tt = time.localtime(date_time)

So... arbitrary number-like objects should work, but arbitrary
sequence-like objects should fail? Hmmm. Maybe that "if isinstance()"
line should say "if hasattr(date_time, '__getitem__'):". Am I sure?
No. The original author of imaplib apparently got it wrong, and Martin
got it wrong, and they're both smarter than me.

Really this is just further proof that type-checking is a royal pain
in Python. Or rather, it's not hard to cover the builtin and stdlib
types, but it's hard to support "duck typing" too. Are we going about
this the right way?

Options:

1. Redesign the API so each parameter has a clearly defined set of
   operations it must support, thus eliminating the need for
   type-checking. Drawback: An annoying API might be worse than the
   problem we're trying to solve.

2. Write a set of imprecise, general-purpose type-checking functions
   (is_real_number(v), is_sequence(v), ...) and use those. (They are
   imprecise because the requirements are vague and because it's not
   really possible to pin them down.) Drawback: Error-prone,
   compounded by deceptively clean appearance.

3. Write very specific custom type-checking code each time you need it
   (the imaplib approach). Drawbacks: Error-prone (as we've seen),
   precarious, tedious, unreadable.

4. Use the "better-to-ask-forgiveness-than-permission" idiom.
   Drawback: Potential bad behavior on error, again potentially worse
   than the original problem.

Yuck. Does anyone have the answer to this one? Or is the problem not
as bad as it looks?

-j
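To make the imprecision of attribute-based checks concrete, here is a small sketch of the hasattr() test floated above, written in Python 3 syntax (an assumption; the thread is about Python 2). The helper name looks_like_sequence is hypothetical and not part of any library.

```python
def looks_like_sequence(value):
    """Hypothetical helper wrapping the hasattr() test discussed above."""
    return hasattr(value, "__getitem__")

# Objects the check is presumably meant to accept.
print(looks_like_sequence((2006, 1, 18)))    # tuple
print(looks_like_sequence([2006, 1, 18]))    # list

# Objects it also accepts, although they are not broken-down times.
print(looks_like_sequence("18-Jan-2006"))    # str supports indexing
print(looks_like_sequence({"year": 2006}))   # dict supports indexing

# A plain number has no __getitem__, so it falls through.
print(looks_like_sequence(1137598560))
```

Strings and dicts pass the test as readily as tuples do, which is one way such a check can look clean while still being wrong.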
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:
> On 1/17/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
>> Alex, I think you're missing a point here: what you are looking
>> for is an interface, not a base class - simply because the
>
> I expect numbers to support arithmetic operators, &c -- no need for
> basenumber to "spell this out", i.e., "be an interface".

If at all, basenumber would be an abstract class. However, unlike for
basestring, the interface (which methods it supports, including
operator methods) would not be well-defined.

>> If you look at the Python C API, you'll find that "a number"
>> is actually never tested.
>
> There being no way to generically test for "a number", that's unsurprising.

Hmm, I lost you there. If it's unsurprising that there's no check
for "a number", then why would you want a basenumber?

>> The tests always ask for either
>> integers or floats.
>
> But this doesn't apply to the Python Standard Library, for example see
> line 1348 of imaplib.py: "if isinstance(date_time, (int, float)):".

Why not use the functions I added to my previous mail?

>> The addition of a basenumber base class won't make these any
>> simpler.
>
> Being able to change imaplib to use basenumber instead of (int, float)
> won't make it SIMPLER, but it will surely make it BETTER -- why should
> a long be rejected, or a Decimal,
> for that matter? Similarly, on line 1352 it should use the existing
> basestring, though it now uses str (this function IS weird -- if it
> finds date_time to be of an unknown TYPE it raises a *ValueError*
> rather than a *TypeError* -- ah well).

Again, why not use floatnumber() instead, which takes care of all
the details behind finding out whether an object should be considered
a number and even converts it to a float for you?

Why try to introduce a low-level feature when a higher-level solution
is readily available and more usable? You will rarely really care
about the type of an object if all you're interested in is the float
value of an object (or the integer value).

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jan 18 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::
Re: [Python-Dev] basenumber redux
On 1/17/06, Alex Martelli <[EMAIL PROTECTED]> wrote:
> Being able to change imaplib to use basenumber instead of (int, float)
> won't make it SIMPLER, but it will surely make it BETTER -- why should
> a long be rejected, or a Decimal,
> for that matter?

Because there's no guarantee that they'll produce correct results?
All number types are approximations of true numbers, and they all
behave wrong in creative ways.

For example:

    def xrange(start, stop, step):
        i = start
        while i < stop:
            yield i
            i += step

This works fine so long as you only give it int as input, and has no
maximum value.

    >>> for i in xrange(2**53, 2**53+3, 1): print i
    ...
    9007199254740992
    9007199254740993
    9007199254740994

Float inputs also work so long as you don't get large enough to
provoke rounding. However, as soon as you do...

    >>> for i in xrange(2**53, 2**53+3, 1.0): print i
    ...
    9007199254740992
    9.00719925474e+15
    9.00719925474e+15
    9.00719925474e+15
    9.00719925474e+15
    9.00719925474e+15
    9.00719925474e+15
    974e+15
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    KeyboardInterrupt

The function fails. Floating point, despite being a "number" and
supporting the "number interface", does not behave close enough to
what the programmer desires to work for all values. There might be a
way to handle floats specially that a mathematician may understand,
but the only way I know of is to convert to int at the start of the
function.

    def xrange(start, stop, step):
        start, stop, step = int(start), int(stop), int(step)
        i = start
        while i < stop:
            yield i
            i += step

    >>> for i in xrange(2**53, 2**53+3, 1.0): print i
    ...
    9007199254740992
    9007199254740993
    9007199254740994

That works so long as the floats are all integral values.
Unfortunately a non-integral value will get truncated silently. An
explicit check for equality after the conversion would have to be
added, or Guido's __index__ could be used, but __index__ seems
misnamed for this usage.

Another approach would be changing operations involving floats to
return intervals instead. The high end of the interval would continue
to count up when rounding is provoked, and would raise an exception
when the i < stop is executed (due to being ambiguous). Once float
uses intervals you could state that all number types are expected to
use intervals in the face of inexactness (and those who don't behave
as expected would have unexpected results.)

--
Adam Olsen, aka Rhamphoryncus
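The "explicit check for equality after the conversion" mentioned above could look roughly like the sketch below. It is written in Python 3 syntax (an assumption; the session above is Python 2), and the name checked_range is hypothetical. Like the original sketch, it assumes a positive step.

```python
def checked_range(start, stop, step):
    """Hypothetical variant of the xrange sketch above.

    Each argument is converted to int, and the conversion is rejected
    if it changed the value (i.e. the input was not integral).
    Assumes step > 0, as the original sketch does.
    """
    converted = []
    for value in (start, stop, step):
        as_int = int(value)
        if as_int != value:
            raise ValueError("%r is not an integral value" % (value,))
        converted.append(as_int)
    i, stop, step = converted
    while i < stop:
        yield i
        i += step


# The rounding that stalls the float version: adding 1.0 changes nothing.
big = float(2**53)
print(big + 1.0 == big)

# With the conversion, a float step behaves like an int step.
print(list(checked_range(2**53, 2**53 + 3, 1.0)))

# A non-integral step is rejected instead of being truncated silently.
try:
    list(checked_range(0, 3, 0.5))
except ValueError as exc:
    print("rejected:", exc)
```

Because this is a generator, the check only runs once iteration starts; a caller that wants the error at call time would need a small wrapper function around it.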
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:
> But this doesn't apply to the Python Standard Library, for example see
> line 1348 of imaplib.py: "if isinstance(date_time, (int, float)):".
[...]
> Being able to change imaplib to use basenumber instead of (int, float)
> won't make it SIMPLER, but it will surely make it BETTER -- why should
> a long be rejected, or a Decimal, for that matter?

Right. I think this function should read

    if isinstance(date_time, str) and \
       (date_time[0],date_time[-1]) == ('"','"'):
        return date_time        # Assume in correct format

    if isinstance(date_time, (tuple, time.struct_time)):
        tt = date_time
    else:
        tt = time.localtime(date_time)

If this is something that time.localtime can't handle, it will
give a TypeError. This is much better than

    raise ValueError("date_time not of a known type")
    # (why does it raise a ValueError if it says "type"?)

Regards,
Martin
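A self-contained sketch of the dispatch proposed above, in Python 3 syntax (an assumption; the thread targets Python 2). The function name normalize_date_time is hypothetical and this is not the actual imaplib code. Slices are used instead of indexing so that an empty string does not raise IndexError, which is a small deviation from the snippet above.

```python
import time


def normalize_date_time(date_time):
    """Hypothetical stand-in for the type dispatch discussed above."""
    # A string already wrapped in double quotes is passed through.
    if isinstance(date_time, str) and \
       (date_time[:1], date_time[-1:]) == ('"', '"'):
        return date_time

    # Broken-down times are used as they are.
    if isinstance(date_time, (tuple, time.struct_time)):
        return date_time

    # Everything else is handed to time.localtime(), which raises
    # TypeError itself for values it cannot interpret.
    return time.localtime(date_time)


print(normalize_date_time('"18-Jan-2006 15:36:00 +0000"'))
print(normalize_date_time((2006, 1, 18, 15, 36, 0, 2, 18, -1)))
print(type(normalize_date_time(0)).__name__)

try:
    normalize_date_time("not a timestamp")
except TypeError as exc:
    print("TypeError:", exc)
```

The design choice here is to let the library call report the type problem rather than duplicating its rules in a hand-written isinstance() test.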
Re: [Python-Dev] basenumber redux
On 1/17/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Alex, I think you're missing a point here: what you are looking
> for is an interface, not a base class - simply because the

I expect numbers to support arithmetic operators, &c -- no need for
basenumber to "spell this out", i.e., "be an interface".

> If you look at the Python C API, you'll find that "a number"
> is actually never tested.

There being no way to generically test for "a number", that's unsurprising.

> The tests always ask for either
> integers or floats.

But this doesn't apply to the Python Standard Library, for example see
line 1348 of imaplib.py: "if isinstance(date_time, (int, float)):".

> The addition of a basenumber base class won't make these any
> simpler.

Being able to change imaplib to use basenumber instead of (int, float)
won't make it SIMPLER, but it will surely make it BETTER -- why should
a long be rejected, or a Decimal, for that matter? Similarly, on line
1352 it should use the existing basestring, though it now uses str
(this function IS weird -- if it finds date_time to be of an unknown
TYPE it raises a *ValueError* rather than a *TypeError* -- ah well).

Alex
Re: [Python-Dev] basenumber redux
On 1/17/06, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
[snip]
> I don't see a way around creating an integer recognition tool that
> doesn't conflate its terminology with broadly-held, pre-existing math
> knowledge: complex is a superset of reals, reals include rationals and
> irrationals some of which are transcendental, and rationals include
> integers which are an infinite superset of non-negative integers, whole
> numbers, negative numbers, etc.
>
> The decimal class only makes this more complicated. All binary floats
> can be translated exactly to decimal but not vice-versa. I'm not sure
> where they would fit into an inheritance hierarchy.

To repeat a popular suggestion these days, python might borrow a page
from Haskell. Haskell's Prelude_ defines a number (pardon the pun) of
numeric typeclasses, each of which requires certain members. The
inheritance graph shapes up roughly like this:

Num - the ur-base class for all numbers
Real - inherits from Num
Integral - inherits from Real. Integral numbers support integer division
Fractional - inherits from Num. Fractionals support true division
Floating - inherits from Fractional. Floating-class objects support
    trigonometric and hyperbolic functions and related functions.
RealFrac - inherits from Real and Fractional. This is used to operate
    on the components of fractions.
RealFloat - inherits from RealFrac and Floating, providing efficient,
    machine-independent access to the components of a floating-point
    number.

While it may not be necessary to concoct that many base classes for
python, having a solid selection of such classes to subclass would
reduce the need for heuristics like attribute testing. Moreover,
subclassing a (or several) numeric type classes makes your intentions
explicit, rather than relying on testing for an "implicit interface".

Given the impact this would have on legacy code, and the need to refit
the built-in types and standard library, such a change might be better
put off until Python 3k.

_Prelude - http://www.haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html

Thanks,
Collin Winter
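Purely as an illustration of the inheritance graph listed above, the relationships can be written down as empty Python marker classes. This is a sketch in Python 3 syntax, not a proposal made in the thread; the class names simply mirror the Haskell names, and the classes carry no behavior.

```python
class Num:
    """Root marker for anything number-like."""

class Real(Num):
    """Numbers that can be placed on the real line."""

class Integral(Real):
    """Numbers supporting integer division."""

class Fractional(Num):
    """Numbers supporting true division."""

class Floating(Fractional):
    """Numbers supporting trigonometric and related functions."""

class RealFrac(Real, Fractional):
    """Real numbers whose fractional components can be taken apart."""

class RealFloat(RealFrac, Floating):
    """Floating-point style numbers."""


class MyFixedPoint(RealFrac):
    """Hypothetical user type declaring its intent by subclassing."""


# An explicit subclass relationship replaces attribute sniffing.
print(issubclass(MyFixedPoint, Num))
print(issubclass(MyFixedPoint, Integral))
print([cls.__name__ for cls in RealFloat.__mro__])
```

A function that needs true division could then test isinstance(x, Fractional) instead of probing for individual methods, at the cost that every number-like type has to opt in.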
Re: [Python-Dev] basenumber redux
Alex, I think you're missing a point here: what you are looking
for is an interface, not a base class - simply because the
assumptions you make when finding a "KnownNumberTypes" instance
are only related to an interface you expect them to provide.

A common base class won't really help all that much with this,
since the implementations of the different types will vary a
lot (unlike, for example, strings and Unicode, which implement
a very common interface) and not necessarily provide a common
interface.

If you look at the Python C API, you'll find that "a number"
is actually never tested. The tests always ask for either
integers or floats.

The addition of a basenumber base class won't make these any
simpler.

Here's a snippet which probably does what you're looking for
using Python's natural way of hooking up to an implicit
interface:

    import UserString

    STRING_TYPES = (basestring, UserString.UserString)

    def floatnumber(obj):
        if isinstance(obj, STRING_TYPES):
            raise TypeError('strings are not numbers')
        # Convert to a float
        try:
            return float(obj)
        except (AttributeError, TypeError, ValueError):
            raise TypeError('%r is not a float' % obj)

    def intnumber(obj):
        if isinstance(obj, STRING_TYPES):
            raise TypeError('strings are not numbers')
        # Convert to an integer
        try:
            value = int(obj)
        except (AttributeError, TypeError, ValueError):
            raise TypeError('%r is not an integer' % obj)
        # Double check so that we don't lose precision
        try:
            floatvalue = floatnumber(obj)
        except TypeError:
            return value
        if floatvalue != value:
            raise TypeError('%r is not an integer' % obj)
        # The float and integer values agree, so return the integer
        return value

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jan 17 2006)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::
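For readers on a current interpreter, the conversion helpers in the message above can be tried out with the following adaptation to Python 3 syntax. The adaptation is the editor's assumption, not part of the original mail: basestring becomes (str, bytes, bytearray), UserString is imported from collections, and the final "return value" in intnumber is spelled out.

```python
from collections import UserString
from decimal import Decimal

# Python 3 stand-in for (basestring, UserString.UserString).
STRING_TYPES = (str, bytes, bytearray, UserString)


def floatnumber(obj):
    """Return obj as a float, refusing string-like inputs."""
    if isinstance(obj, STRING_TYPES):
        raise TypeError('strings are not numbers')
    try:
        return float(obj)
    except (AttributeError, TypeError, ValueError):
        raise TypeError('%r is not a float' % (obj,))


def intnumber(obj):
    """Return obj as an int, refusing strings and non-integral values."""
    if isinstance(obj, STRING_TYPES):
        raise TypeError('strings are not numbers')
    try:
        value = int(obj)
    except (AttributeError, TypeError, ValueError):
        raise TypeError('%r is not an integer' % (obj,))
    # Double check so that precision is not lost silently.
    try:
        floatvalue = floatnumber(obj)
    except TypeError:
        return value
    if floatvalue != value:
        raise TypeError('%r is not an integer' % (obj,))
    return value


print(floatnumber(3), floatnumber(Decimal("1.5")))
print(intnumber(4.0), intnumber(Decimal("7")))
for bad in ("1.5", 4.5, 1 + 2j):
    try:
        intnumber(bad)
    except TypeError as exc:
        print("rejected:", exc)
```

Note that the caller never names a number type; any object that supports float() or int() conversion is accepted, which is the implicit-interface approach the message argues for.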
Re: [Python-Dev] basenumber redux
Raymond Hettinger wrote:
> Are you sure? The former has two builtin lookups (which also entail two
> failed global lookups), a function call, and a test/jump for the result.
> The latter approach has no lookups (just a load constant), a try-block
> setup, an add operation (optimized for integers, a fast slot lookup
> otherwise), and a block end.
>
> Even if there were a performance edge, I suppose that the type checking
> is the time critical part of most scripts.

My guess is that it depends on the common case. If the common case is
that the type test fails (i.e. element-wise operations are the
exception), then you also have the exception creation and storing in
the exception approach, compared to returning only existing objects in
the type test case.

If the common case is that x+0 succeeds, x+0 may or may not create
new objects.

However, I have long ago learned not to guess about performance, so
I won't do further guessing until I see the actual code.

Regards,
Martin
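In the spirit of measuring rather than guessing, a comparison of the two checks could be set up along these lines. This is a minimal sketch in Python 3 syntax (an assumption); the helper names and the NUMBER_TYPES tuple are hypothetical, and the timings it prints will vary by interpreter and machine, so no figures are claimed here.

```python
import timeit

# Illustrative stand-in for a "known number types" tuple.
NUMBER_TYPES = (int, float)


def check_with_isinstance(x):
    """Type test: is x one of the known number types?"""
    return isinstance(x, NUMBER_TYPES)


def check_with_add(x):
    """Behavior test: does x support adding 0?"""
    try:
        x + 0
    except TypeError:
        return False
    return True


# Time both checks for a case that passes and a case that fails,
# since the failing case also pays for raising the exception.
for label, value in (("int (passes)", 42), ("list (fails)", [1, 2])):
    for func in (check_with_isinstance, check_with_add):
        seconds = timeit.timeit(lambda: func(value), number=100000)
        print("%-14s %-22s %.4f s" % (label, func.__name__, seconds))
```

Both measurements include the overhead of the lambda call, so only the difference between rows is meaningful, not the absolute numbers.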
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:
>> As I suggested in a different message: Why are you doing that
>> in the first place?
>
> Because isinstance is faster and handier than testing with try/except
> around (say) "x+0".

Nit: I think you should not test. Instead, you should start doing
what you mean to do if the test passes, and then expect TypeErrors
from doing so.

> As to why I want to distinguish numbers from non-numbers, let me quote
> from a message I sent in 2003 (one of the few you'll find by searching
> for [basestring site:python.org] as I have repeatedly recommended, but
> apparently there's no way to avoid just massively copying and pasting...):

As I said in that other message, I found this one, and still don't
understand what the use case is, because the example you give still
reads hypothetical (I'm sure there is an actual example behind it, but
you don't say what that is).

> """
> def __mul__(self, other):
>     if isinstance(other, self.KnownNumberTypes):
>         return self.__class__([ x*other for x in self.items ])
>     else:
>         # etc etc, various other multiplication cases

So what *are* the "various other multiplication cases"?

You just shouldn't be doing that: multiplication shouldn't mean
"item multiplication" sometimes, and various other things if it
can't mean item multiplication.

> in Python/bltinmodule.c , function builtin_sum uses C-coded typechecking
> to single out strings as an error case:
>
>     /* reject string values for 'start' parameter */
>     if (PyObject_TypeCheck(result, &PyBaseString_Type)) {
>         PyErr_SetString(PyExc_TypeError,
>             "sum() can't sum strings [use ''.join(seq) instea

This is indeed confusing: why is it again that sum refuses to sum
up strings?

> [etc]. Now, what builtin_sum really "wants" to do is to accept numbers,
> only -- it's _documented_ as being meant for "numbers": it uses +, NOT
> +=, so its performance on sequences, matrix and array-ish things, etc,
> is not going to be good. But -- it can't easily _test_ whether something
> "is a number". If we had a PyBaseNumber_Type to use here, it would
> be smooth, easy, and fast to check for it.

There shouldn't be a check at all. It should just start doing the
summing, and it will "work" if PyNumber_Add works for the individual
items. Of course, there is an educational value in ruling out string
addition, since there is a better way to do that, and there should
be only one obvious way. I see nothing wrong in summing up sequences,
matrices, and arrayish things, using sum.

> A fast rational number type, see http://gmpy.sourceforge.net for
> details (gmpy wraps LGPL'd library GMP, and gets a lot of speed and
> functionality thereby).

Ok, so mpq are rational numbers.

>> if the parameter belongs to some algebraic ring homomorphic
>> with the real numbers, or some such. Are complex numbers also numbers?
>> Is it meaningful to construct gmpy.mpqs out of them? What about
>> Z/nZ?
>
> If I could easily detect "this is a number" about an argument x, I'd
> then ask x to change itself into a float, so complex would be easily
> rejected (while decimals would mostly work fine, although a bit slowly
> without some specialcasing, due to the Stern-Brocot-tree algorithm I
> use to build gmpy.mpq's from floats). I can't JUST ask x to "make
> itself into a float" (without checking for x's "being a number")
> because that would wrongfully succeed for many cases such as strings.

Hmm. If you want to do the generic conversion from numbers to rationals
by going through float, then this is what you should do. Convert to
float, and don't bother with checking whether it will succeed.
Instead, if the type conversion gives an error, just return that to the
caller.

However, it also sounds odd that you are trying to do the
to-rational-with-arbitrary-precision conversion by going through
floats. Instead, if the argument is decimal, you really should
do the division by the appropriate power of 10, no?

Regards,
Martin
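The "division by the appropriate power of 10" for decimals can be sketched as below. This uses Python 3 and the standard fractions.Fraction type as a stand-in for gmpy.mpq; that substitution and the function name decimal_to_fraction are the editor's assumptions, not something from the thread.

```python
from decimal import Decimal
from fractions import Fraction


def decimal_to_fraction(d):
    """Hypothetical exact Decimal -> rational conversion."""
    sign, digits, exponent = d.as_tuple()
    if not isinstance(exponent, int):
        # NaN and Infinity carry a letter code instead of an exponent.
        raise ValueError("%r has no rational value" % (d,))
    # Rebuild the integer coefficient from its decimal digits.
    mantissa = 0
    for digit in digits:
        mantissa = mantissa * 10 + digit
    if sign:
        mantissa = -mantissa
    # Scale by the power of ten, multiplying or dividing as needed.
    if exponent >= 0:
        return Fraction(mantissa * 10 ** exponent)
    return Fraction(mantissa, 10 ** -exponent)


exact = decimal_to_fraction(Decimal("0.1"))
via_float = Fraction(float(Decimal("0.1")))

print(exact)               # the decimal value, exactly
print(via_float)           # the nearest binary float, exactly
print(exact == via_float)  # the detour through float changes the value
```

The comparison at the end shows why the detour through float is questionable for decimals: the float nearest to 0.1 is a different rational number from 1/10.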
Re: [Python-Dev] basenumber redux
[Me]
> Even if there were a performance edge, I suppose that the type checking
> is the time critical part of most scripts.

That should be: RARELY the time critical part of most scripts.
Re: [Python-Dev] basenumber redux
> Because isinstance is faster and handier than testing with try/except
> around (say) "x+0".

Are you sure? The former has two builtin lookups (which also entail two
failed global lookups), a function call, and a test/jump for the result.
The latter approach has no lookups (just a load constant), a try-block
setup, an add operation (optimized for integers, a fast slot lookup
otherwise), and a block end.

Even if there were a performance edge, I suppose that the type checking
is the time critical part of most scripts.

> If an abstract basetype
> 'basenumber' caught many useful cases, I'd put it right at the start of
> the KnownNumberTypes tuple, omit all subclasses thereof from it, get
> better performance, AND be able to document very simply what the user
> must do to ensure his own custom type is known to me as "a number".

So, this would essentially become a required API? It would no longer be
enough to duck-type a number, you would also have to subclass from
basenumber?

Wouldn't this preclude custom number types that also need to derive from
another builtin such as str? For instance, somewhere I have
Gram-Schmidt orthogonalization transformation code for computing
orthonormal bases and the only requirement for the input basis is that
it be an inner-product space -- accordingly, the inputs can be
functions, symbols (a subclass of str), or vectors (pretty-much anything
supporting multiplication and subtraction).

Similar conflicts arise for any pure computation function -- I should be
able to supply either numbers or symbolic inputs (a subclass of str)
that act like numbers.

> in Python/bltinmodule.c , function builtin_sum uses C-coded
> typechecking
> to single out strings as an error case:
>
>     /* reject string values for 'start' parameter */
>     if (PyObject_TypeCheck(result, &PyBaseString_Type)) {
>         PyErr_SetString(PyExc_TypeError,
>             "sum() can't sum strings [use ''.join(seq) instea
>
> [etc]. Now, what builtin_sum really "wants" to do is to accept numbers,
> only -- it's _documented_ as being meant for "numbers": it uses +, NOT
> +=, so its performance on sequences, matrix and array-ish things, etc,
> is not going to be good. But -- it can't easily _test_ whether
> something
> "is a number". If we had a PyBaseNumber_Type to use here, it would
> be smooth, easy, and fast to check for it.
> """

I think this is really a counter-example. We wanted to preclude
sequences so that sum() didn't become a low performance synonym for
''.join(). However, there's no reason to preclude user-defined matrix
or vector classes.

And, if you're suggesting that the type-check should have been written
with isinstance(result, basenumber), one consequence would be that
existing number-like classes would start to fail unless retro-fitted
with a basenumber superclass. Besides being a PITA, retro-fitting
existing, working code could also entail changing from an old to
new-style class (possibly changing the semantics of the class).

> A fast rational number type, see http://gmpy.sourceforge.net for
> details (gmpy wraps LGPL'd library GMP, and gets a lot of speed and
> functionality thereby).
>
> > if the parameter belongs to some algebraic ring homomorphic
> > with the real numbers, or some such. Are complex numbers also numbers?
> > Is it meaningful to construct gmpy.mpqs out of them? What about
> > Z/nZ?
>
> If I could easily detect "this is a number" about an argument x, I'd
> then ask x to change itself into a float, so complex would be easily
> rejected (while decimals would mostly work fine, although a bit
> slowly without some specialcasing, due to the Stern-Brocot-tree
> algorithm I use to build gmpy.mpq's from floats). I can't JUST ask x
> to "make itself into a float" (without checking for x's "being a
> number") because that would wrongfully succeed for many cases such as
> strings.

Part of the rationale for basestring was convenience of writing
isinstance(x,basestring) instead of isinstance(x,(str,unicode)). No
existing code needed to be changed for this to work.

This proposal, on the other hand, is different. To get the purported
benefits, everyone has to play along. All existing number-look-a-like
classes would have to be changed in order to work with functions testing
for basenumber. That would be too bad for existing, working code.
Likewise, it would be impossible for symbolic code which already
subclassed from str (as discussed above).

> >> If I do write the PEP, should it be just about basenumber, or should
> >> it include baseinteger as well?

Introducing a baseinteger type is likely to create confusion because all
integers are real, but baseintegers are not a subclass of floats.

I don't see a way around creating an integer recognition tool that
doesn't conflate its terminology with broadly-held, pre-existing math
knowledge: complex is a superset of reals, reals include rationals and
irrationals s
Re: [Python-Dev] basenumber redux
On 1/16/06, Alex Martelli <[EMAIL PROTECTED]> wrote:
> Nothing was said about "different design intent for basestring", as I
> recall (that discussion comes up among the few hits for [basestring
> site:python.org] if you want to check the details).

Please. How many times do I have to say this. *I added this with the
stated intent.* Maybe I didn't communicate that intent clearly. But my
intent was exactly as it is documented -- basestring == (str, unicode)
and nothing else.

Why this intent? Because I felt the need to be able to distinguish the
*built-in* string types from *anything* else; they have many special
properties and purposes.

Also note that basestring was introduced in 2.3, a whole release
*after* inheritance from str was made possible.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] basenumber redux
On Jan 16, 2006, at 2:01 PM, Martin v. Löwis wrote:

> Alex Martelli wrote:
>> I can't find a PEP describing this restriction of basestring, and I
>> don't see why a coder who needs to implement another kind of
>> character string shouldn't subclass basestring, so that those
>> instances pass an isinstance test on basestring which is quite likely
>> to be found e.g. in the standard library.
>
> People could do that, but they would be on their own. basestring
> could be interpreted as "immutable sequence of characterish things",
> but it was really meant to be "union of str and unicode". There
> is no PEP because it was introduced by BDFL pronouncement.

Unfortunately the lack of a PEP leaves this a bit underdocumented.

> That it is not a generic base class for stringish types can be
> seen by looking at UserString.UserString, which doesn't inherit
> from basestring.

Raymond Hettinger's reason for not wanting to add basestring as a
base for UserString was: UserString is a legacy class, since today
people can inherit from str directly, and it cannot be changed from a
classic class to a new-style one without breaking backwards
compatibility, which for a legacy class would be a big booboo.
Nothing was said about "different design intent for basestring", as I
recall (that discussion comes up among the few hits for [basestring
site:python.org] if you want to check the details).

> For most practical purposes, those two definitions actually
> define the same thing - there simply aren't any stringish data
> types in praxis: you always have Unicode, and if you don't,
> you have bytes.

But not necessarily in one big blob that's consecutive (==compact) in
memory. mmap instances are "almost" strings and could easily be made
into a closer match, at least for the immutable variants, for
example; other implementations such as SGI STL's "Rope" also come to
mind.

In the context of a current struggle (a different and long story)
between Python builds with 2-bytes Unicode and ones with 4-bytes
Unicode, I've sometimes found myself dreaming of a string type that's
GUARANTEED to be 2-bytes, say, and against which extension modules
could be written that don't need recompilation to move among such
different builds, for example. It's not (yet) hurting enough to make
me hunker down and write such an extension (presumably mostly by
copy-paste-edit from Python's own sources;-), but if somebody did it
would sure be nice if they could have that type "assert it's a
string" by inheriting from basestring, no?

>> Implementing different kinds of numbers is more likely than
>> implementing different kinds of strings, of course.
>
> Right. That's why a PEP is needed here, but not for basestring.

OK, I've mailed requesting a number.

>> A third argument against it is asymmetry: why should I use completely
>> different approaches to check if x is "some kind of string", vs
>> checking if x is "some kind of number"?
>
> I guess that's for practicality which beats purity. People often
> support interfaces that either accept both an individual string
> and a list of strings, and they need the test in that case.
> It would be most natural to look for whether it is a sequence;
> unfortunately, strings are also sequences.

Sure, isinstance-tests with basestring are a fast and handy way to
typetest that. But so would similar tests with basenumber be.

>> isinstance with a tuple of number types, where the tuple did not
>> include Decimal (because when I developed and tested that module,
>> Decimal wasn't around yet).
>
> As I suggested in a different message: Why are you doing that
> in the first place?

Because isinstance is faster and handier than testing with try/except
around (say) "x+0".

As to why I want to distinguish numbers from non-numbers, let me quote
from a message I sent in 2003 (one of the few you'll find by searching
for [basestring site:python.org] as I have repeatedly recommended, but
apparently there's no way to avoid just massively copying and
pasting...):

"""
    def __mul__(self, other):
        if isinstance(other, self.KnownNumberTypes):
            return self.__class__([ x*other for x in self.items ])
        else:
            # etc etc, various other multiplication cases

right now, that (class, actually) attribute KnownNumberTypes starts out
"knowing" about int, long, float, gmpy.mpz, etc, and may require user
customization (e.g. by subclassing) if any other "kind of (scalar)
number" needs to be supported; besides, the isinstance check must walk
linearly down the tuple of known number types each time. (I originally
had quite a different test structure:

        try: other + 0
        except TypeError:  # other is not a number
            # various other multiplication cases
        else:
            # other is a number, so...
            return self.__class__([ x*other for x in self.items ])

but the performance for typical benchmarks improved with the isinstance
test, so, reluctantly, that's
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:

> I can't find a PEP describing this restriction of basestring, and I
> don't see why a coder who needs to implement another kind of
> character string shouldn't subclass basestring, so that those
> instances pass an isinstance test on basestring which is quite likely
> to be found e.g. in the standard library.

People could do that, but they would be on their own. basestring
could be interpreted as "immutable sequence of characterish things",
but it was really meant to be "union of str and unicode". There
is no PEP because it was introduced by BDFL pronouncement.

That it is not a generic base class for stringish types can be
seen by looking at UserString.UserString, which doesn't inherit
from basestring.

For most practical purposes, those two definitions actually
define the same thing - there simply aren't any stringish data
types in praxis: you always have Unicode, and if you don't,
you have bytes.

> Implementing different kinds of numbers is more likely than
> implementing different kinds of strings, of course.

Right. That's why a PEP is needed here, but not for basestring.

> A third argument against it is asymmetry: why should I use completely
> different approaches to check if x is "some kind of string", vs
> checking if x is "some kind of number"?

I guess that's for practicality which beats purity. People often
support interfaces that either accept both an individual string
and a list of strings, and they need the test in that case.
It would be most natural to look for whether it is a sequence;
unfortunately, strings are also sequences.

> isinstance with a tuple of number types, where the tuple did not
> include Decimal (because when I developed and tested that module,
> Decimal wasn't around yet).

As I suggested in a different message: Why are you doing that
in the first place?

> I have the same issue in the C-coded extension gmpy: I want (e.g.) a
> gmpy.mpq to be able to be constructed by passing any number as the
> argument, but I have no good way to say "what's a number", so I use
> rather dirty tricks -- in particular, I've had to tweak things in a
> weird direction in the latest gmpy to accommodate Python 2.4
> (specifically Decimal).

Not sure what a gmpy.mpq is, but I would expect that can only work
if the parameter belongs to some algebraic ring homomorphic
with the real numbers, or some such. Are complex numbers also numbers?
Is it meaningful to construct gmpy.mpqs out of them? What about
Z/nZ?

> If I do write the PEP, should it be just about basenumber, or should
> it include baseinteger as well?

I think it should only do the case you care about. If others have
other use cases, they might get integrated, or they might have to
write another PEP.

Regards,
Martin
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:

> > As you already suspected, I think a PEP is needed. The intent of
>
> I'll be happy to write it, if it stands any chance.
>
> > basestring was to *only* be used as the base class for *built-in*
> > string types. Clearly what you're proposing is different (Decimal is
> > not built-in -- not yet anyway).
>
> I can't find a PEP describing this restriction of basestring

that's how it's documented, at least:

    http://effbot.org/lib/builtin.basestring

> Implementing different kinds of numbers is more likely than
> implementing different kinds of strings, of course.

indeed. implementing a new string type is not a trivial task in
today's CPython.
Re: [Python-Dev] basenumber redux
On Jan 16, 2006, at 7:53 AM, Guido van Rossum wrote:

> On 1/15/06, Alex Martelli <[EMAIL PROTECTED]> wrote:
>> Now, today, I have _again_ been bit by the lack of basenumber (by a
>> bug of mine, fixed by adding decimal.Decimal to a long tuple of
>> classes to be passed to an isinstance call -- I hadn't run that
>> particular numeric code of mine since the time of Python 2.3,
>> apparently), so I'm back to pining for it.
>
> As you already suspected, I think a PEP is needed. The intent of

I'll be happy to write it, if it stands any chance.

> basestring was to *only* be used as the base class for *built-in*
> string types. Clearly what you're proposing is different (Decimal is
> not built-in -- not yet anyway).

I can't find a PEP describing this restriction of basestring, and I
don't see why a coder who needs to implement another kind of
character string shouldn't subclass basestring, so that those
instances pass an isinstance test on basestring which is quite likely
to be found e.g. in the standard library.

Implementing different kinds of numbers is more likely than
implementing different kinds of strings, of course.

> Like other posters, I suspect that the best way of detecting numbers
> might be some other kind of test, not necessarily a call to
> isinstance().

I've tended to use a try/except around x+0 to detect if "x is a
number". But that's NOT how the standard library does it -- rather,
it has isinstance tests (often forgetting long in the tuple of types),
as I pointed out in my mails on the subject back in 2003 (google for
[basenumber site:python.org], there aren't many). I will reproduce
those in the PEP, of course, if I do write one.

The x+0 test has been criticized in the past because x COULD be an
instance of a type which defines an __add__ which has side effects,
or a very inclusive __add__ which happens to accept an int argument
even though the type is NOT meant to be a number. These could be
seen as design defects of x's type, of course.

A second argument against the x+0 test is performance.

A third argument against it is asymmetry: why should I use completely
different approaches to check if x is "some kind of string", vs
checking if x is "some kind of number"? Once upon a time I used
x + '' (with try/except around it) to check for stringness, but the
introduction of basestring strongly signaled that this was not the
"one obvious way" any more.

> It would also help to explain the use case more. ("I've been bitten"
> doesn't convey a lot of information. :-)

isinstance with a tuple of number types, where the tuple did not
include Decimal (because when I developed and tested that module,
Decimal wasn't around yet). That's the problem of using isinstance
without a "flag class" like basestring: one is hardwiring a specific
tuple of types as being "singled out" for different treatment. If a
new type comes up (be it for the standard library or some extension)
there's no way to respect the "open-closed principle", leaving the
affected module "closed to changes" yet "open for extension". In this
way, using isinstance with a "hardwired" tuple of types is open to the
same objections as "type-switching": it produces code that is not
extensible to new types beyond those specific ones it had considered
at coding time.

I have the same issue in the C-coded extension gmpy: I want (e.g.) a
gmpy.mpq to be able to be constructed by passing any number as the
argument, but I have no good way to say "what's a number", so I use
rather dirty tricks -- in particular, I've had to tweak things in a
weird direction in the latest gmpy to accommodate Python 2.4
(specifically Decimal).

Since "being a number" is a protocol (albeit, like "being a string",
a rather peculiar one -- "the verb TO BE" is always fraught;-), other
traditional possibilities for supporting it are introducing a special
method or flag attribute such as "__isanumber__" (either callable, or
flagging numberhood just by its presence). But introducing
flag-abstract-class basenumber is more consistent with what was done
for strings and affords a simpler test via isinstance. Of course _if_
PEP 246 was accepted, anybody could add more types to the set of
those which "can be adapted to being numbers", but an object to
denote that protocol should still be there (since the standard
library does need to test for numberhood in a few places).

If I do write the PEP, should it be just about basenumber, or should
it include baseinteger as well? The case for the latter is IMHO still
good but a bit weaker; it would be nice to be able to code 'xy' * Z
without having str.__rmul__ perform a hardcoded test on Z being
specifically an int or a long, making "other kinds of integers" (e.g.
gmpy.mpz instances) smoothly substitutable for ints everywhere
(similarly for somelist[Z], of course). Right now I have to pepper my
code with int(Z) casts when Z is
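
The contrast drawn in the message above, between a hardwired tuple of types and a "flag class", can be sketched roughly as follows. This is only an illustration in present-day Python: `BaseNumber`, `Money` and `KNOWN_NUMBER_TYPES` are hypothetical names made up for the example, and since built-in types cannot be rebased from user code, the flag-based check still lists `int` and `float` explicitly (under the proposal they would derive from the marker themselves).

```python
from decimal import Decimal


class BaseNumber:
    """Hypothetical marker class; it adds no behavior of its own."""


class Money(BaseNumber):
    """Hypothetical third-party numeric type that opts in by inheriting."""

    def __init__(self, amount):
        self.amount = Decimal(amount)


# A tuple frozen when the module was written; types invented later are absent.
KNOWN_NUMBER_TYPES = (int, float)


def is_number_hardwired(x):
    # Closed set: supporting a new type means editing this module.
    return isinstance(x, KNOWN_NUMBER_TYPES)


def is_number_flagged(x):
    # Open set: a new type is accepted as soon as it inherits the marker.
    # int and float are listed only because this sketch cannot rebase them.
    return isinstance(x, (BaseNumber, int, float))
```

With these definitions, `Money("1.50")` fails the hardwired test but passes the flag-based one without any change to the checking module, which is the extensibility argument made above.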
Re: [Python-Dev] basenumber redux
On 1/15/06, Alex Martelli <[EMAIL PROTECTED]> wrote:
> Now, today, I have _again_ been bit by the lack of basenumber (by a
> bug of mine, fixed by adding decimal.Decimal to a long tuple of
> classes to be passed to an isinstance call -- I hadn't run that
> particular numeric code of mine since the time of Python 2.3,
> apparently), so I'm back to pining for it.

As you already suspected, I think a PEP is needed. The intent of
basestring was to *only* be used as the base class for *built-in*
string types. Clearly what you're proposing is different (Decimal is
not built-in -- not yet anyway).

Like other posters, I suspect that the best way of detecting numbers
might be some other kind of test, not necessarily a call to
isinstance().

It would also help to explain the use case more. ("I've been bitten"
doesn't convey a lot of information. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] basenumber redux
On Monday 16 January 2006 20:05, Nick Coghlan wrote:
> For example, what's wrong with "hasattr(x, '__int__')"? That works
> for all the builtin types, and, IMO, anyone defining a direct
> conversion to an integer for a non-numeric type deserves whatever
> happens to them.

What about something that's got something like:

    def __int__(self):
        raise TypeError("This type is not a number!")

I don't see a problem with defining basenumber. For the use cases,
pretty much the same set as basestring.

--
Anthony Baxter <[EMAIL PROTECTED]>
It's never too late to have a happy childhood.
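
The objection above can be made concrete with a small sketch. The class name `NotANumber` and the helper `is_number` are hypothetical, chosen only for this illustration; the point is that an attribute test passes even when the conversion itself is designed to fail.

```python
class NotANumber:
    """Hypothetical type that defines __int__ only to refuse conversion."""

    def __int__(self):
        raise TypeError("This type is not a number!")


def is_number(x):
    # The attribute test quoted above: it looks only for the method's presence.
    return hasattr(x, "__int__")


if __name__ == "__main__":
    obj = NotANumber()
    # The attribute exists, so the presence check reports a "number".
    print(is_number(obj))
    try:
        int(obj)
    except TypeError as exc:
        # The actual conversion still fails.
        print("conversion failed:", exc)
```

In other words, the presence of `__int__` says nothing about whether calling it succeeds.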
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:
> I'll be happy to draft a PEP if needed (and just as happy to
> eventually provide an implementation patch if the PEP's accepted),
> but wanted to doublecheck on the general issue first!

I haven't really followed the earlier basenumber discussions (aside
from the sidetrack into the nature of mappings), but why would we want
this ability as a typecheck and not some form of interface check?

For example, what's wrong with "hasattr(x, '__int__')"? That works for
all the builtin types, and, IMO, anyone defining a direct conversion to
an integer for a non-numeric type deserves whatever happens to them.

Something like:

    def is_number(x):
        return hasattr(x, '__int__')

    def is_integer(x):
        return x == int(x)

Requiring inheritance from "basenumber" in order to make something
behave like a real number seems antithetical to both duck-typing and
the adaptation PEP.

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
http://www.boredomandlaziness.org
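
For readers who want to try the two helpers above, here is a minimal runnable rendering for present-day Python. It is a sketch, not the author's code: the guard in `is_integer` (calling `is_number` first) is an addition made here so that non-numeric arguments return False instead of raising, and the sample values are arbitrary.

```python
from decimal import Decimal


def is_number(x):
    """Interface check: treat anything that offers __int__ as a number."""
    return hasattr(x, "__int__")


def is_integer(x):
    """True when x passes is_number and its value survives int() unchanged."""
    # The is_number guard is an assumption added for this sketch.
    return is_number(x) and x == int(x)


if __name__ == "__main__":
    for value in (3, 2.5, Decimal("7"), "7", [1, 2]):
        print(repr(value), is_number(value), is_integer(value))
```

Note that strings are rejected by this check even though `int("7")` succeeds, because `str` does not define `__int__`.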
Re: [Python-Dev] basenumber redux
Alex Martelli wrote:
> I'll be happy to draft a PEP if needed (and just as happy to
> eventually provide an implementation patch if the PEP's accepted),
> but wanted to doublecheck on the general issue first!

Please do so. I've browsed somewhat through past discussions,
but wasn't able to find a proposed specification of basenumber.
Also, I only found half of a rationale for it: it is meant to be used
along with isinstance, but I couldn't find out why you wanted to do
that. In

http://mail.python.org/pipermail/python-dev/2003-November/039916.html

you explain that you wanted to multiply all items of self with
other if other is a number; why couldn't this be written as

    def __mul__(self, other):
        try:
            return self.__class__([ x*other for x in self.items ])
        except TypeError:
            # various other multiplication cases

You give performance as the rationale; this is unconvincing as
it would rather indicate that performance of exceptions should
be improved (also, I think it is unpythonic to change the language
for performance reasons, except in really common cases).

Also, in that example, I wonder if the use of multiplication is
flawed. If you have so many multiplication cases, perhaps you
abuse the notion of multiplication? Users will need to understand
the different cases, as well, and they will be surprised when it
works in one case, but not in a (seemingly similar) other case.

Regards,
Martin
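
The try/except form suggested above can be fleshed out into something runnable along these lines. This is only a sketch: the class name `Scaled` is hypothetical, and returning `NotImplemented` stands in for the "various other multiplication cases", which the thread does not spell out.

```python
class Scaled:
    """Hypothetical container standing in for the class discussed above."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, other):
        try:
            # Optimistic path: assume `other` can scale every item.
            return self.__class__([x * other for x in self.items])
        except TypeError:
            # Placeholder for the other multiplication cases.
            return NotImplemented

    def __repr__(self):
        return "Scaled(%r)" % (self.items,)


if __name__ == "__main__":
    print(Scaled([1, 2, 3]) * 2)
    print(Scaled([2]) * "ab")
```

One thing the sketch makes visible: the optimistic path also succeeds for an int item times a str, because Python defines that as repetition, so no TypeError is raised and the "other cases" branch is never reached. Whether that matters depends on what the class is meant to accept.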
[Python-Dev] basenumber redux
For the last 2+ years I've been occasionally arguing for the
introduction of a basenumber (and ideally a baseinteger, but that, to
me, is a slightly lesser issue) analogous to basestring. Google
search for [basenumber site:python.org] for several messages on the
subject, by me and others; it will also find the recent thread about
more general abstract baseclasses, which seems to have bogged down on
such issues as whether sets are mappings.

Now, today, I have _again_ been bit by the lack of basenumber (by a
bug of mine, fixed by adding decimal.Decimal to a long tuple of
classes to be passed to an isinstance call -- I hadn't run that
particular numeric code of mine since the time of Python 2.3,
apparently), so I'm back to pining for it. The previous discussion
was short but pretty exhaustive, so I'd ask further discussants to
refer back to it, rather than repeating it; no blocking issue appears
to have emerged at that time, plenty of use cases were pointed out,
etc. Can we PLEASE have basenumber (and maybe baseinteger, so
sequences can typecheck against that for their indices -- that's the
key usecase of baseinteger) rather than have them "hijacked" by wider
consideration of basesequence, basemapping, and so on...? Pretty
please? Let's be pragmatic: basenumber isn't at all complicated nor
controversial, baseinteger hardly at all, so let's accept them while
pondering on other potential base* classes for as long as it takes
for the dust to settle.

I'll be happy to draft a PEP if needed (and just as happy to
eventually provide an implementation patch if the PEP's accepted),
but wanted to doublecheck on the general issue first!

Alex