[Python-Dev] Numerical robustness, IEEE etc.
As I have posted to comp.lang.python, I am not happy with Python's numerical robustness - because it basically propagates the 'features' of IEEE 754 and (worse) C99. Yes, it's better, but I would like to make it a LOT better. I already have a more robust version of 2.4.2, but there are some problems, technical and political. I should appreciate advice.

1) Should I start off by developing a testing version, to give people a chance to scream at me, or write a PEP? Because I am no Python development expert, the former would help to educate me into its conventions, technical and political.

2) Because some people are dearly attached to the current behaviour, warts and all, and there is a genuine quandary of whether the 'right' behaviour is trap-and-diagnose, propagate-NaN or whatever-IEEE-754R-finally-specifies (let's ignore C99 and Java as beyond redemption), there might well need to be options. These can obviously be done by a command-line option, an environment variable or a float method. There are reasons to disfavour the last, but all are possible. Which is the most Pythonesque approach?

3) I am rather puzzled by the source control mechanism. Are commit privileges needed to start a project like this in the main tree? Note that I am thinking of starting a test subtree only.

4) Is there a Python hacking document? Specifically, if I want to add a new method to a built-in type, is there any guide on where to start?

5) I am NOT offering to write a full floating-point emulator, though it would be easy enough and could provide repeatable, robust results. "Easy" does not mean "quick" :-( Maybe when I retire. Incidentally, experience from times of yore is that emulated floating-point would be fast enough that few, if any, Python users would notice.

Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pre-PEP: Allow Empty Subscript List Without Parentheses
Talin <[EMAIL PROTECTED]> wrote:
> Ok, so in order to clear up the confusion here, I am going to take a moment to try and explain Noam's proposal in clearer language.
>
> Now, as to the specifics of Noam's problem: Apparently what he is trying to do is what many other people have done, which is to use Python as a base for some other high-level language, building on top of Python syntax and using the various operator overloads to define the semantics of the language.

No, that's too restrictive. Back in the 1970s, Genstat (a statistical language) and perhaps others introduced the concept of an array type with an indefinite number of dimensions. This is a requirement for implementing such things as contingency tables, analysis of variance etc., and was and is traditionally handled by some ghastly code. It always was easy to handle in LISP and, as far as this goes, Python is a descendant of LISP rather than of Algol, CPL or Fortran.

Now, I thought about how conventional "3rd GL" languages (Algol 68, Fortran, C etc.) could be extended to support those - it is very simple, and is precisely what Noam is proposing. An index becomes a single-dimensional vector of integers, and all is hunky-dory. When you look at it, you realise that you DO want to allow zero-length index vectors, to avoid having to write separate code for the scalar case.

So it is not just a matter of mapping another language, but of meeting a specific requirement that is largely language-independent.

Regards, Nick Maclaren
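The zero-length-index-vector point can be sketched in a few lines of Python. This is my own hypothetical illustration (the NDimTable class and its details are invented for this example, not taken from Noam's proposal): an N-dimensional table keyed by integer index vectors, where the empty tuple () is the scalar case, so scalars need no separate code path.

```python
class NDimTable:
    """A hypothetical table type indexed by integer index vectors
    (tuples), with the zero-length vector () as the scalar case."""

    def __init__(self, default=0):
        self._cells = {}
        self._default = default

    def _key(self, index):
        # Normalise a bare integer subscript t[3] to the 1-tuple (3,),
        # so all dimensionalities go through the same code.
        return index if isinstance(index, tuple) else (index,)

    def __getitem__(self, index):
        return self._cells.get(self._key(index), self._default)

    def __setitem__(self, index, value):
        self._cells[self._key(index)] = value


t = NDimTable()
t[()] = 42          # zero-length index vector: the scalar case
t[1, 2, 3] = 7      # a three-dimensional cell
print(t[()], t[1, 2, 3])   # -> 42 7
```

The scalar case falls out of the general code, which is exactly the argument for allowing a zero-length index vector in the language itself.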
Re: [Python-Dev] Numerical robustness, IEEE etc.
Brett Cannon's and Neal Norwitz's replies appreciated and noted, but responses sent by mail.

Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Python 2.4's decimal module is, in essence, a floating point emulator based on the General Decimal Arithmetic specification.

Grrk. Format and all? Because, in software, encoding, decoding and dealing with the special cases accounts for the vast majority of the time. Using a format and specification designed for implementation in software is a LOT faster (often 5-20 times).

> If you want floating point mathematics that doesn't have insane platform dependent behaviour, the decimal module is the recommended approach. By the time Python 2.6 rolls around, we will hopefully have an optimized version implemented in C (that's being worked on already).

Yes. There is no point in building a wheel if someone else is doing it. Please pass my name on to the people doing the optimisation, as I have a lot of experience in this area and may be able to help. But it is a fairly straightforward (if tricky) task.

> That said, I'm not clear on exactly what changes you'd like to make to the binary floating point type, so I don't know if I think they're a good idea or not :)

Now, here it is worth posting a response :-) The current behaviour follows C99 (sic) with some extra checking (e.g. division by zero raises an exception). However, this means that a LOT of errors will give nonsense answers without comment, and there are a lot of ways to 'lose' NaN values quietly - e.g. int(NaN). That is NOT good software engineering. So:

Mode A: follow IEEE 754R slavishly, if and when it ever gets into print. There is no point in following C99, as it is too ill-defined, even if it were felt desirable. This should not be the default, because of the flaws I mention above (see Kahan on Java).

Mode B: all numerically ambiguous or invalid operations should raise an exception - including pow(0,0), int(NaN) etc. etc.
There is a moot point over whether overflow is such a case in an arithmetic that has infinities, but let's skip over that one for now.

Mode C: all numerically ambiguous or invalid operations should return a NaN (or infinity, if appropriate). Anything that would lose the error indication would raise an exception.

The selection between modes B and C could be done by a method on the class - with mode B being selected if any argument had it set, and mode C otherwise. Now, both modes B and C are traditional approaches to numerical safety, and have the property that error indications can't be lost "by accident", though they make no guarantees that the answers make sense. I am agnostic about which is better, though mode B is a LOT better from the debugging point of view, as you discover an error closer to where it was made.

Heaven help us, there could be a mode D, which would be mode C but with trace buffers. They are another sadly neglected software engineering technique, but let's not add every bell and whistle on the shelf :-)

"tjreedy" <[EMAIL PROTECTED]> wrote:
> > experience from times of yore is that emulated floating-point would be fast enough that few, if any, Python users would notice.
>
> Perhaps you should enquire on the Python numerical and scientific computing lists to see how many feel differently. I don't see how someone crunching numbers hours per day could not notice a slowdown.

Oh, certainly, almost EVERYONE will "feel" differently! But that is not the point. Those few of us remaining (and there are damn few) who know how fast emulated floating-point can be know that the common belief that it is very slow is wrong. I have both used and implemented it :-) The point is, as I mention above, you MUST use a software-friendly format AND specification if you want performance. IEEE 754 and IBM's decimal pantechnicon are both extremely software-hostile.
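For what it's worth, something close to the mode B / mode C choice can already be expressed with the decimal module's context traps. A minimal sketch (the mode names are mine, mapped onto the module's existing trap machinery):

```python
import decimal
from decimal import Decimal

# "Mode B": invalid operations raise an exception at the point of error.
with decimal.localcontext() as ctx:
    ctx.traps[decimal.InvalidOperation] = True
    try:
        Decimal(0) / Decimal(0)
        outcome_b = "no exception"
    except decimal.InvalidOperation:
        outcome_b = "raised InvalidOperation"

# "Mode C": the same operation quietly propagates a NaN instead.
with decimal.localcontext() as ctx:
    ctx.traps[decimal.InvalidOperation] = False
    outcome_c = Decimal(0) / Decimal(0)

print(outcome_b)   # -> raised InvalidOperation
print(outcome_c)   # -> NaN
```

The choice is per-context rather than per-class, but it demonstrates that both disciplines can coexist in one arithmetic.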
Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
Michael Hudson <[EMAIL PROTECTED]> wrote:
> > As I have posted to comp.lang.python, I am not happy with Python's numerical robustness - because it basically propagates the 'features' of IEEE 754 and (worse) C99.
>
> That's not really how I would describe the situation today.

It is certainly the case in 2.4.2, however you would describe it.

> > 2) Because some people are dearly attached to the current behaviour, warts and all, and there is a genuine quandary of whether the 'right' behaviour is trap-and-diagnose, propagate-NaN or whatever-IEEE-754R-finally-specifies (let's ignore C99 and Java as beyond redemption),
>
> Why? Maybe it's clear to you, but it's not totally clear to me, and in any case the discussion would be better informed for not being too dismissive.

Why which? There are several things that you might be puzzled over. And where can I start? Part of the problem is that I have spent a LOT of time in these areas in the past decades, and have been involved with many of the relevant standards, and I don't know what to assume.

> > there might well need to be options. These can obviously be done by a command-line option, an environment variable or a float method. There are reasons to disfavour the last, but all are possible. Which is the most Pythonesque approach?
>
> I have heard Tim say that there are people who would dearly like to be able to choose. Environment variables and command line switches are not Pythonic.

All right, but what is? Firstly, for something that needs to be program-global? Secondly, for things that don't need to be - which brings up my point about adding methods to a built-in class.

> I'm interested in making Python's floating point story better, and have worked on a few things for Python 2.5 -- such as pickling/marshalling of special values -- but I'm not really a numerical programmer and don't like to guess what they need.

Ah. I must get a snapshot, then.
That was one of the lesser things on my list. I have spent a lot of the past few decades in the numerical programming arena, from many aspects.

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
> >> have worked on a few things for Python 2.5 -- such as pickling/marshalling of special values -- but I'm not really a numerical programmer and don't like to guess what they need.
> >
> > Ah. I must get a snapshot, then. That was one of the lesser things on my list.
>
> It was fairly straightforward, and still caused portability problems...

Now, why did I predict that? Did you, by any chance, include System/390 and VAX support in your code :-)

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
Very interesting. I need to investigate in more depth.

> The work-in-progress can be seen in Python's SVN sandbox:
>
> http://svn.python.org/view/sandbox/trunk/decimal-c/

beelzebub$ svn checkout http://svn.python.org/view/sandbox/trunk/decimal-c/
svn: PROPFIND request failed on '/view/sandbox/trunk/decimal-c'
svn: PROPFIND of '/view/sandbox/trunk/decimal-c': Could not read chunk size: connection was closed by server. (http://svn.python.org)

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
> ... check interrupt have to do with anything?

Because a machine check is one of the classes of interrupt that you POSITIVELY want the other cores stopped until you have worked out whether it impacts just the interrupted core or the CPU as a whole. Inter alia, the PowerPC architecture takes one when a core has just gone AWOL - and there is NO WAY that the dead core can handle the interrupt indicating its own demise!

> > Oh, that's the calm, moderate description. The reality is worse.
>
> Yes, but fortunately irrelevant...

Unfortunately, it isn't. I wish that it were :-(

> Now, a more general reply: what are you actually trying to achieve with these posts? I presume it's more than just make wild claims about how much more you know about numerical programming than anyone else...

Sigh. What I am trying to get is floating-point support of the form that, when a programmer makes a numerical error (see above), he gets EITHER an exception value returned OR an exception raised. I do, of course, need to exclude the cases when the code is testing states explicitly, twiddling bits and so on.

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
"Neal Norwitz" <[EMAIL PROTECTED]> wrote:
> Seriously, there seems to be a fair amount of miscommunication in this thread. ...

Actually, this isn't really a reply to you, but you have described the issue pretty well.

> The best design doc that I know of is code. :-)
>
> It would be much easier to communicate using code snippets. I'd suggest pointing out places in the Python code that are lacking and how you would correct them. That will make it easier for everyone to understand each other.

Yes. That is easy. What, however, I have part of (already) and was proposing to do BEFORE going into details was to generate a testing version that shows how I think that it should be done. Then people could experiment with both the existing code and mine, to see the differences. But, in order to do that, I needed to find out the best way of going about it.

It wouldn't help with the red herrings, such as the reasons why it is no longer possible to rely on hardware interrupts as a mechanism. But they are only very indirectly relevant.

The REASON that I wanted to do that was precisely because I knew that very few people would be deeply into arithmetic models, the details of C89 and C99 (ESPECIALLY as the standard is incomplete :-( ), and so having a sandbox before starting the debate would be a GREAT help. It's much easier to believe things when you can try them yourself.

"Facundo Batista" <[EMAIL PROTECTED]> wrote:
> Well, so I'm completely lost... because, if all you want is to be able to choose a returned value or an exception raised, you actually can control that in Decimal.

Yes, but I have so far failed to get hold of a copy of the Decimal code! I will have another go at subverting Subversion. I should VERY much like to get hold of those documents AND build a testing version of the code - then I can go away, experiment, and come back with some more directed comments (not mere generalities).
Aahz <[EMAIL PROTECTED]> wrote:
> You can't expect us to do your legwork for you, and you can't expect that Tim Peters is the only person on the dev team who understands what you're getting at.

Well, see above for the former - I did post my intents in my first message. And, as for the latter, I have tried asking what I can assume that people know - it is offensive and time-consuming (and hence counter-productive) to start off assuming that your audience does not have a basic background. To repeat, it is precisely to address THAT issue that I wanted to build a sandbox BEFORE going into details. If people don't know the theory in depth but are interested, they could experiment with the sandbox and see what happens in practice.

> Incidentally, your posts will go directly to python-dev without moderation if you subscribe to the list, which is a Good Idea if you want to participate in discussion.

Er, you don't receive a mailing list at all if you don't subscribe! If that is the intent, I will see if I can find how to subscribe in the unmoderated fashion. I didn't spot two methods on the Web pages when I subscribed.

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
To the moderator: this is getting ridiculous.

[EMAIL PROTECTED] wrote:
> > > Unfortunately, that doesn't help, because it is not where the issues are. What I don't know is how much you know about numerical models, IEEE 754 in particular, and C99. You weren't active on the SC22WG14 reflector, but there were some lurkers.
>
> Hand wave, hand wave, hand wave. Many of us here aren't stupid and have more than passing experience with numerical issues, even if we haven't been "active on SC22WG14". Let's stop with the high-level pissing contest and lay out a clear technical description of exactly what has your knickers in a twist, how it hurts Python, and how we can all work together to make the pain go away.

SC22WG14 is the ISO committee that handles C standardisation. One of the reasons that the UK voted "no" was because the C99 standard was seriously incomprehensible in many areas to anyone who had not been active on the reflector. If you think that I can summarise a blazing row that went on for over 5 years and produced over a million lines of technical argument alone in a "clear technical description", you have an exaggerated idea of my abilities.

I have a good many documents that I could post, but they would not help. Some of them could be said to be "clear technical descriptions" but most of them were written for other audiences, and assume those audiences' backgrounds. I recommend starting by reading the comments in floatobject.c and mathmodule.c and then looking up the sections of the C89 and C99 standards that are referenced by them.

> A good place to start: You mentioned earlier that there were some nonsensical things in floatobject.c. Can you list some of the most serious of these?

Well, try the following for a start:

Python 2.4.2 (#1, May 2 2006, 08:28:01)
[GCC 4.1.0 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "NaN"
>>> b = float(a)
>>> c = int(b)
>>> d = (b == b)
>>> print a, b, c, d
NaN nan 0 False

Python 2.3.3 (#1, Feb 18 2004, 11:58:04)
[GCC 2.8.1] on sunos5
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "NaN"
>>> b = float(a)
>>> c = int(b)
>>> d = (b == b)
>>> print a, b, c, d
NaN NaN 0 True

That demonstrates that the error state is lost by converting to int, and that NaN testing isn't reliable.

Regards, Nick Maclaren
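For comparison, the int(NaN) hole shown in those transcripts was later closed: in current CPython the conversion raises instead of quietly returning 0, and the self-inequality test detects a NaN reliably. A quick check (current Python 3 behaviour):

```python
import math

b = float("nan")

assert b != b          # a NaN is the only value unequal to itself
assert math.isnan(b)   # the explicit test (math.isnan, added in 2.6)

try:
    int(b)             # no longer silently yields 0
except ValueError as e:
    print(e)           # -> cannot convert float NaN to integer
```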
Re: [Python-Dev] Numerical robustness, IEEE etc.
Michael Hudson <[EMAIL PROTECTED]> wrote:
> But, a floating point exception isn't a machine check interrupt, it's a program interrupt...

For reasons that I could go into, but are irrelevant, almost all modern CPU architectures have only ONE interrupt mechanism, and use it for both of those. It is the job of the interrupt handler (i.e. FLIH, first-level interrupt handler, usually in Assembler) to classify those, get into the appropriate state and call the interrupt handling code. Now, this is a Bad Idea, but separating floating-point exceptions from machine checks at the hardware level died with the mainframes, as far as I know.

The problem with the current approach is that it makes it very hard for the operating system to allow the application to handle the former. And the problem with most modern operating systems is that they don't even do what they could do at all well, because THAT died with the mainframes, too :-(

The impact of all this mess on things like Python is that exception handling is a nightmare area, especially when you bring in threading (i.e. hardware threading with multiple cores, or genuinely parallel threading on a single core). Yes, I have brought a system down by causing too many floating-point exceptions in all threads of a highly parallel program on a large SMP.

> See, that wasn't so hard! We'd have saved a lot of heat and light if you'd said that at the start (and if you think you'd made it clear already: you hadn't).

I thought I had. I accept your statement that I hadn't. Sorry.

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
"Tim Peters" <[EMAIL PROTECTED]> wrote:
> I suspect Nick spends way too much time reading standards ;-)

God help me, YES! And in trying to get them improved. Both of which are very bad for my blood pressure :-(

My real interest is in portable, robust programming - I DON'T abuse the terms to mean bitwise identical, but that is by the way - and I delved in here trying to write a jiffy little bit of just such code as part of a course example. BANG!!! It failed in both respects on the first two systems I tried it on, and it wasn't my code that was wrong.

The killer is that standards are the nearest thing to a roadmap for portability, especially portability and robustness. If you have non-conforming code, and it goes bananas, the compiler vendor will refuse to do anything, no matter how clearly it is a bug in the compiler or library. What is worse is that there is an incentive for the leading vendors (see below) to implement down to the standard, even when it is easier to do better. And this is happening in this area.

> What he said is:
>
> If you look at floatobject.c, you will find it solid with constructions that make limited sense in C99 but next to no sense in C89.
>
> And, in fact, C89 truly defines almost nothing about floating-point semantics or pragmatics. Nevertheless, if a thing "works" under gcc and under MS C, then "it works" for something like 99.9% of Python's users, and competitive pressures are huge for other compiler vendors to play along with those two.

Yup, though you mean gcc on an x86/AMD64/EM64T system, and 99.9% is a rhetorical exaggeration - but one of the failures WAS on one of those!

> I don't know what specifically Nick had in mind, and join the chorus asking for specifics.

That is why I wanted to: a) read the decimal stuff and play around with the module, b) write a sandbox and sort out my obvious errors, and c) write a PEP describing the issue and proposals BEFORE going into details.
The devil is in the details, and I wanted to leave him sleeping until I had lined up my howitzers.

> I _expect_ he's got a keen eye for genuine coding bugs here, but also expect I'd consider many technically dubious bits of FP code to be fine under the "de facto standard" dodge.

Actually, I tried to explain that I don't have many objections to the coding of the relevant files - whoever wrote them and I have a LOT of common attitudes :-) And I have been strongly into de facto standards for over 30 years, so am happy with them. Yes, I have found a couple of bugs, but not ones worth fixing (e.g. there is a use of x != x where PyISNAN should be used, and a redundant test for an already excluded case, but what the hell?)

My main objection is that they invoke C behaviour in many places, and that is (a) mostly unspecified in C, (b) numerically insane in C99 and (c) broken in practice.

> So, sure, everything we do is undefined, but, no, we don't really care :-) If a non-trivial 100%-guaranteed-by-the-standard-to-work C program exists, I don't think I've seen it.

I can prove that none exists, though I would have to trawl over SC22WG14 messages to prove it. I spent a LONG time trying to get "undefined" defined and used consistently (let alone sanely) in C, and failed dismally.

> BTW, Nick, are you aware of Python's fpectl module? That's user-contributed code that attempts to catch overflow, div-by-0, and invalid operation on 754 boxes and transform them into raising a Python-level FloatingPointError exception. Changes were made all over the place to try to support this at the time. Every time you see a PyFPE_START_PROTECT or PyFPE_END_PROTECT macro in Python's C code, that's the system it's trying to play nicely with. "Normally", those macros have empty expansions.

Aware of, yes. Have looked at, no. I have already beaten my head against that area and already knew the issues.
I have even implemented run-time systems that got it right, and that is NOT pretty.

> fpectl is no longer built by default, because repeated attempts failed to locate any but "ya, I played with it once, I think" users, and the masses of platform-specific #ifdef'ery in fpectlmodule.c were suffering fatal bit-rot. No users + no maintainers means I expect it's likely that module will go away in the foreseeable future. You'd probably hate its _approach_ to this anyway ;-)

Oh, yes, I know that problem. You would be AMAZED at how many 'working' programs blow up when I turn it on on systems that I manage - not excluding Python itself (integer overflow) :-) And, no, I don't hate that approach, because it is one of the plausible ones; not good, but what can
Re: [Python-Dev] Numerical robustness, IEEE etc.
"Tim Peters" <[EMAIL PROTECTED]> wrote:
> > SC22WG14? is that some marketing academy? not a very good one, obviously.
>
> That's because it's European ;-)

Er, please don't post ironic satire of that nature - many people will believe it! ISO is NOT European. It is the International Standards Organisation, of which ANSI is a member. And, for reasons and with consequences that are only peripherally relevant, SC22WG14 has always been dominated by ANSI. In fact, C89 was standardised by ANSI (sic), acting as an agent for ISO. C99 was standardised by ISO directly, but for various reasons (only some of which I know) was even more ANSI-dominated than C89.

Note that I am NOT saying "big bad ANSI", as a large proportion of that was and is due to the ghastly support provided by many countries to their national standards bodies. The UK not excepted.

Regards, Nick Maclaren
Re: [Python-Dev] Numerical robustness, IEEE etc.
"Terry Reedy" <[EMAIL PROTECTED]> wrote:
> Of interest among their C-EPs is one for adding the equivalent of our decimal module: http://www.open-std.org/jtc1/sc22/wg14/www/projects#24732

IBM is mounting a major campaign to get its general decimal arithmetic standardised as THE standard form of arithmetic. There is a similar (more advanced) move in C++, and they are working on Fortran. I assume that Cobol is already on board, and there may be others. There is nothing underhand about this - IBM is quite open about it, I believe that they are making all critical technologies freely available, and the design has been thought out and is at least half-sane - which makes it among the best 1-2% of IT technologies :-(

Personally, I think that it is overkill, because it is a MASSIVELY complex solution, and will make sense only where at least two of implementation cost, performance, power usage and CPU/memory size are not constraints. E.g. mainframes, heavyweight commercial codes etc., but definitely NOT massive parallelism, very low power computing, micro-miniaturisation and so on. IEEE 754 was bad (which is why it is so often implemented only in part), but this is MUCH worse. God alone knows whether IBM will manage to move the whole of IT design - they have done it before, and have failed before (often after having got further than this).

Now, whether that makes it a good match for Python is something that is clearly fruitful grounds for debate :-)

Regards, Nick Maclaren
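As a concrete illustration of what the decimal design buys, Python's decimal module (which implements the same General Decimal Arithmetic specification) represents decimal fractions exactly, where binary floating point cannot:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the familiar identity fails:
assert 0.1 + 0.2 != 0.3

# The decimal module, implementing the General Decimal Arithmetic
# specification, gets it exact:
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```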
Re: [Python-Dev] Numerical robustness, IEEE etc.
"Jim Jewett" <[EMAIL PROTECTED]> wrote:
> > The conventional interpretation was that any operation that was not mathematically continuous in an open region including its argument values (in the relevant domain) was an error, and that all such errors should be flagged. That is what I am talking about.
>
> Not a bad goal, but not worth sweating over, since it isn't sufficient. It still allows functions whose continuity does not extend to the next possible floating point approximation, or functions whose value, while continuous, changes "too much" in that region.

Oh, yes, quite. But I wasn't describing something that needed effort; I was merely describing the criterion that was traditionally used (and still is, see below). There is also the Principle of Least Surprise: the behaviour of a language should be at least explicable to mere mortals (a.k.a. ordinary programmers) - one that says "whatever the designing committee thought good at the time" is a software engineering disaster.

> For some uses, it is more important to be consistent with established practice than to be as correct as possible. If the issues are still problems, and can't be solved in languages like java, then ... the people who want "correct" behavior will be a tiny minority, and it makes sense to have them use a 3rd-party extension.

I don't think that you understand the situation. I was and am describing established practice, as used by the numeric programmers who care about getting reliable answers - most of those still use Fortran, for good and sufficient reasons. There are two other established practices:

Floating-point is a figment of your imagination - don't support it.

Yeah. Right. Whatever. It's only approximate, so who gives a damn what it does?

Mine is the approach taken by the Fortran, C and C++ standards and many Fortran implementations, but the established practice in highly optimised Fortran and most C is the last.
Now, Java (to some extent) and C99 introduced something that attempts to eliminate errors by defining what they do (more-or-less arbitrarily); much as if Python said that, if a list or dictionary entry wasn't found, it would create one and return None. But that is definitely NOT established practice, despite the fact that its proponents claim it is. Even IEEE 754 (as specified) has never reached established practice at the language level.

The very first C99 Annex F implementation that I came across appeared in 2005 (Sun One Studio 9 under Solaris 10 - BOTH are needed); I have heard rumours that HP-UX may have one, but neither AIX nor Linux does (even now). I have heard rumours that the latest Intel compiler may be C99 Annex F, but don't use it myself, and I haven't heard anything reliable either way for Microsoft. What is more, many of the tender documents for systems bought for numeric programming in 2005 said explicitly that they wanted C89, not C99 - none that I saw asked for C99 Annex F. No, C99 Annex F is NOT established practice and, God willing, never will be.

> > For example, consider conversion between float and long - which
> > class should control the semantics?
>
> The current python approach with binary fp is to inherit from C
> (consistency with established practice). The current python approach
> for Decimal (or custom classes) is to refuse to guess in the first
> place; people need to make an explicit conversion. How is this a
> problem?

See above re C established practice. The above is not my point. I am talking about the generic problem where class A says that overflow should raise an exception, class B says that it should return infinity and class C says nothing. What should C = A*B do on overflow?

> [ Threading and interrupts ]

No, that is a functionality issue, but the details are too horrible to go into here. Python can do next to nothing about them, except to distrust them - just as it already does.
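The mixed-semantics quandary can be sketched with two hypothetical wrapper classes (the names Strict and Lenient are mine, purely for illustration): each defines its own overflow behaviour, and ordinary operator dispatch simply hands control to the left-hand operand, so the outcome of a mixed product depends on operand order rather than on any principled rule.

```python
# Hypothetical illustration: two float wrappers with different overflow
# semantics.  Which one "wins" in a mixed product is decided by operator
# dispatch (the left operand's __mul__), not by any numerical policy.

INF = float("inf")

class Strict:
    """Overflow raises an exception."""
    def __init__(self, v):
        self.v = float(v)
    def __mul__(self, other):
        r = self.v * (other.v if hasattr(other, "v") else float(other))
        if r in (INF, -INF):
            raise OverflowError("overflow in Strict multiply")
        return Strict(r)

class Lenient:
    """Overflow propagates an infinity, IEEE-style."""
    def __init__(self, v):
        self.v = float(v)
    def __mul__(self, other):
        r = self.v * (other.v if hasattr(other, "v") else float(other))
        return Lenient(r)

big = 1.0e200
try:
    Strict(big) * Lenient(big)          # Strict controls: raises
    strict_first_raised = False
except OverflowError:
    strict_first_raised = True

lenient_first = Lenient(big) * Strict(big)  # Lenient controls: yields inf
```

The point of the sketch is that neither behaviour is "the" answer; the language merely picks whichever class happens to be on the left.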
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Numerical robustness, IEEE etc.
[EMAIL PROTECTED] wrote:
> I'm not asking you to describe SC22WG14 or post detailed technical
> summaries of the long and painful road. I'd like you to post things
> directly relevant to Python with footnotes to necessary references.
> It is then incumbent on those that wish to respond to your post to
> familiarize themselves with the relevant background material.
> However, it is really darn hard to do that when we don't know what
> you're trying to fix in Python. The examples you show below are a
> good start in that direction.

Er, no. Given your response, it has merely started off a hare. The issues you raise are merely ones of DETAIL, and I was and am trying to tackle the PRINCIPLE (a.k.a. design). I originally stated my objective, and asked for information so that I could investigate in depth and produce (in some order) a sandbox and a PEP. That is still my plan.

This example was NOT of problems with the existing implementation, but was to show how even the most basic numeric code that attempts to handle errors cannot avoid tripping over the issues. I shall respond to your points, but shall try to refrain from following up.

> 1) The string representation of NaN is not standardized across
> platforms

Try what I actually used:

    x = 1.0e300
    x = (x*x)/(x*x)

I converted that to float('NaN') to avoid confusing people. There are actually many issues around the representation of NaNs, including whether signalling NaNs should be separated from quiet NaNs and whether they should be allowed to have values. See IEEE 754, IEEE 754R and C99 for more details (but not clarification).

> 2) on a sane platform, int(float('NaN')) should raise a ValueError
> exception for the int() portion.

Well, I agree with you, but Java and many of the C99 people don't.

> 3) float('NaN') == float('NaN') should be false, assuming NaN is not
> a signaling NaN, by default

Why? Why should it not raise ValueError? See table 4 in IEEE 754.
I could go into this one in much more depth, but let's not, at least not now.

> So the open question is how to both define the semantics of Python
> floating point operations and to implement them in a way that
> verifiably works on the vast majority of platforms without turning
> the code into a maze of platform-specific defines, kludges, or
> maintenance problems waiting to happen.

Well, in a sense, but the second is really a non-question - i.e. it answers itself almost trivially once the first is settled. ALL of your above points fall into that category. The first question to answer is what the fundamental model should be, and I need to investigate in more depth before commenting on that - which should tell you roughly what I know and what I don't about the decimal model.

The best way to get a really ghastly specification is to decide on the details before agreeing on the intent. Committees being what they are, that is a recipe for something that nobody else will ever get their heads around.
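The behaviour under debate is easy to observe in Python itself; a minimal sketch using the inf/inf construction quoted above (which avoids any dependence on how the platform parses the string 'NaN'):

```python
import math

# Generate a quiet NaN without relying on float('NaN') string parsing:
x = 1.0e300
nan = (x * x) / (x * x)    # inf/inf yields a NaN under IEEE 754

assert math.isnan(nan)
assert nan != nan          # a quiet NaN compares unequal even to itself
assert not (nan == nan)

# On CPython, converting a NaN to int raises ValueError (the behaviour
# argued for in point 2 above):
try:
    int(nan)
    raised = False
except ValueError:
    raised = True
assert raised
```

Note that `nan != nan` silently returning True, rather than raising, is exactly the IEEE quiet-comparison behaviour questioned in point 3.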
[Python-Dev] Python memory model (low level)
I have been thinking about software floating point, and there are some aspects of Python and Decimal that puzzle me. Basically, they are things that are wanted for this sort of thing and seem to be done in very contorted ways, so I may have missed something.

Firstly, can Python C code assume no COMPACTING garbage collector, or should it allow for things shifting under its feet?

Secondly, is there any documentation on the constraints and necessary ritual when allocating chunks of raw data and/or types of variable size? Decimal avoids the latter.

Thirdly, I can't find an efficient way for object-mangling code to access class data and/or have some raw data attached to a class (as distinct from an instance).

Fourthly, can I assume that no instance of a class will remain active AFTER the class disappears? This would mean that it could use a pointer to class-level raw data.

I can explain why all of those are the 'right' way to approach the problem, at an abstract level, but it is quite possible that Python does not support the abstract model of class implementation that I am thinking of.
Re: [Python-Dev] Python memory model (low level)
Aahz <[EMAIL PROTECTED]> wrote:
> Without answering your specific questions, keep in mind that Python
> and Python-C code are very different things. The current Decimal
> implementation was designed to be *readable* and efficient *Python*
> code. For a look at what the Python-C implementation of Decimal
> might look closer to, take a look at the Python long implementation.

Er, perhaps I should have said explicitly that I was looking at the Decimal-in-C code and not the Python. Most of my questions don't make any sense at the Python level.

But you have a good point. The long code will be both simpler and have had a LOT more work done on it - but it will address only the object-of-variable-size issue, as it doesn't need class-level data in the same way as Decimal and I do.
Re: [Python-Dev] Python memory model (low level)
"Tim Peters" <[EMAIL PROTECTED]> wrote:

[ Many useful answers ]

Thanks very much! That helps. Here are a few points where we are at cross-purposes. I am talking about the C level.

What I am thinking of is the standard method of implementing the complicated housekeeping of a class (e.g. inheritance) in Python, and the basic operations in C (for efficiency). The model that I would like to stick to is that the Python layer never knows about the actual object implementation, and the C never knows about the housekeeping.

The housekeeping would include the class derivation, which would (inter alia) fix the size of a number. The C code would need to allocate some space to store various constants and workspace, shared between all instances of the derived class. This would be accessible from the object it returns. Each instance would be of a length specified by its derivation (i.e. like Decimal), but would be constant for all members of the class (i.e. not like long). So it would be most similar to tuple in that respect.

Operations like addition would copy the pointer to the class data from the arguments, and ones like creation would need to be passed the appropriate class and whatever input data they need. I believe that, using the above approach, it would be possible to achieve good efficiency with very little C - certainly, it has worked in other languages.
[Python-Dev] Another 2.5 bug candidate?
"A.M. Kuchling" <[EMAIL PROTECTED]> wrote:
> http://www.python.org/sf/1488934 argues that Python's use of fwrite()
> has incorrect error checking; this most affects file.write(), but
> there are other uses of fwrite() in the core. It seems fwrite() can
> return N bytes written even if an error occurred, and the code needs
> to also check ferror(f->fp).
>
> At the last sprint I tried to assemble a small test case to exhibit
> the problem but failed. The reporter's test case uses SSH, and I did
> verify that Python does loop infinitely if executed under SSH, but a
> test case would need to work without SSH.
>
> Should this be fixed in 2.5? I'm nervous about such a change to
> error handling without a test case to add; maybe it'll cause problems
> on one of our platforms.

So would assembling a test case. NOTHING will cause ferror to return True that isn't classed as undefined behaviour, and therefore may fail on some platforms.
Re: [Python-Dev] Handling of sys.args (Re: User's complaints)
On systems that are not Unix-derived (which, nowadays, are rare), there is commonly no such thing as a program name in the first place. It is possible to get into that state on some Unices - i.e. ones which have a form of exec that takes a file descriptor, inode number or whatever.

This is another argument for separating off argv[0] and allowing the program name to be None if there IS no program name.
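A minimal sketch of the separation being proposed (the function name split_argv and the sentinel values checked are mine, purely illustrative; sys.argv itself is left untouched):

```python
import sys

def split_argv(argv):
    """Split a raw argv into (program_name, args).

    program_name is None when no meaningful name is available, e.g. on
    systems where a script can be exec'ed from a file descriptor, or
    when CPython substitutes a placeholder such as "-c" or "-".
    """
    if not argv or argv[0] in ("", "-c", "-"):
        return None, (argv[1:] if argv else [])
    return argv[0], argv[1:]

# Normal invocation: the script path is separated from the arguments.
prog, args = split_argv(["myscript.py", "--verbose", "input.txt"])

# No meaningful program name: callers get None instead of a lie.
noname, rest = split_argv(["", "input.txt"])
```

The design point is that callers then test `prog is None` explicitly instead of guessing whether argv[0] is trustworthy.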
Re: [Python-Dev] Handling of sys.args (Re: User's complaints)
Greg Ewing <[EMAIL PROTECTED]> wrote:
> > On systems that are not Unix-derived (which, nowadays, are rare),
> > there is commonly no such thing as a program name in the first
> > place. It is possible to get into that state on some Unices - i.e.
> > ones which have a form of exec that takes a file descriptor, inode
> > number or whatever.
>
> I don't think that applies to the Python args[] though, since its
> args[0] isn't the path of the OS-level executable, it's the path of
> the main Python script.

Oh, yes, it does! The file descriptor or inode number could refer to the script just as well as it could to the interpreter binary.

> But you could still end up without one, if the main script comes
> from somewhere other than a file.

I didn't want to include that, to avoid confusing people who haven't used systems with such features. Several systems have had the ability to exec to a memory segment, for example. But, yes.
Re: [Python-Dev] Handling of sys.args (Re: User's complaints)
"Guido van Rossum" <[EMAIL PROTECTED]> wrote:
> OK, then I propose that we wait to see which things you end up having
> to provide to sandboxed code, rather than trying to analyze it to
> death in abstracto.

However, the ORIGINAL proposal in this thread (to split off argv[0] and/or make that and the arguments read-only) is entirely different. That is purely a matter of convenience, cleanliness of specification or whatever you call it. I can't imagine any good reason for a sandbox to separate argv[0] from argv[1:] (either way).
Re: [Python-Dev] Strategy for converting the decimal module to C
Georg Brandl <[EMAIL PROTECTED]> wrote:
> > Even then, we need to drop the concept of having the flags as
> > counters rather than booleans.
>
> Yes. Given that even Tim couldn't imagine a use case for counting
> the exceptions, I think it's sensible.

Well, I can. There is a traditional, important use - tuning. When such arithmetic is implemented in hardware, it is normal for exceptional cases to be handled by interrupt, and that is VERY expensive - often 100-1,000 times the cost of a single operation, occasionally 10,000 times. It then becomes important to know how many of the things you got, to know whether it is worth putting code in to avoid them or even using a different algorithm.

Now, it is perfectly correct to say that this does not apply to emulated arithmetic and that there is no justification for such ghastly implementations. But, regrettably, almost all exception handling on modern systems IS ghastly - at least by the standards of the 1960s.

Whether you regard Python's use for tuning code that is to be run on hardware, where the arithmetic will be performance-critical, as important is a matter of taste. I don't :-)
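For reference, this is how the boolean-style flags behave in the standard library's decimal module: a signal's flag records that the condition occurred at least once, not how many times (a sketch; Context.flags, the Inexact signal and localcontext are the stdlib decimal API):

```python
from decimal import Decimal, Inexact, localcontext

with localcontext() as ctx:
    ctx.clear_flags()
    # Two inexact operations, but the flag is a simple latch, not a counter:
    Decimal(1) / Decimal(3)
    Decimal(2) / Decimal(3)
    inexact_seen = bool(ctx.flags[Inexact])

assert inexact_seen
# The flag says the condition happened, but there is no way to tell from
# it that it happened twice - which is exactly the tuning information
# (how often does this code trap?) discussed above.
```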
Re: [Python-Dev] Strategy for converting the decimal module to C
Greg Ewing <[EMAIL PROTECTED]> wrote:
> But couldn't you just put in an interrupt handler that counts the
> interrupts, for the purpose of measurement?

No, but the reasons are very arcane. The general reason is that taking an interrupt handler and returning is not transparent, and is often not possible on modern systems. If that problem is at the hardware level (as on the Alpha and 8086/7), you are stuffed. But, more often, it is due to the fact that the architecture means that such handling can only be done at maximally privileged level.

Now, interrupting into that level has to be transparent, in order to support TLB misses, clock interrupts, device interrupts, machine-check interrupts and so on. But the kernels rarely support transparent callbacks from that state into user code (though they used to); it is actually VERY hard to do, and even the mainframes had problems. This very commonly means that such counting breaks other facilities, unless it is done IN the privileged code.

Of course, a GOOD hardware architecture wouldn't leave the process state when it gets a floating-point interrupt, but would just invoke an asynchronous routine call. That used to be done.

As I said, none of this is directly relevant to emulated implementations, such as the current Python ones, but it IS to the design of an arithmetic specification. It could become relevant if Python wants to start to use a hardware implementation, because your proposal would mean that it would have to try to ensure that such callbacks are transparent. As one of the few people still working who has extensive experience with doing that, I can assure you that it is an order of magnitude fouler than you can imagine. A decimal order of magnitude :-(

But note that I wasn't saying that such things should be put into the API, merely that there is a very good reason to do so for hardware implementations and ones used to tune code for such implementations. Personally, I wouldn't bother.
Re: [Python-Dev] new security doc using object-capabilities
"Giovanni Bajo" <[EMAIL PROTECTED]> wrote:
> This recipe for safe_eval:
> http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/496746
> which is otherwise very cute, does not handle this case as well: it
> tries to catch and interrupt long-running operations through a
> secondary thread, but fails on a single long operation because the
> GIL is not released and the alarm thread does not get its chance to
> run.

Grin :-) You have put your finger on the Great Myth of such virtualisations, which applies to the system-level ones and even to the hardware-level ones. In practice, there is always some request that a sandbox can make to the hypervisor that can lock out or otherwise affect other sandboxes. The key is, of course, to admit that and to specify what is and is not properly virtualised, so that the consequences can at least be analysed.
Re: [Python-Dev] Strategy for converting the decimal module to C
Greg Ewing <[EMAIL PROTECTED]> wrote:
> > Now, interrupting into that level has to be transparent, in order
> > to support TLB misses, clock interrupts, device interrupts,
> > machine-check interrupts and so on.
>
> I thought we were just talking about counting the number of floating
> point exceptions that a particular piece of code generates. Surely
> that's deterministic, and isn't

Er, no. Rather fundamentally, on two grounds. Please bear with me, as this IS relevant to Python. See the summary at the end if you like :-)

The first is that such things are NOT deterministic, not even on simple CPUs - take a look at the Alpha architecture for an example, and then follow it up with the IA64 one if you have the stomach for it. But that wasn't my main point.

It is that modern CPUs have a SINGLE interrupt mechanism (a mistake in itself, but they do), so a CPU may be interrupted when it is running a device driver, other kernel thread or within a system call as much as when running an application. In fact, to some extent, interrupt handlers can themselves be interrupted (let's skip the details). Now, in order to allow the application to run its handler, the state has to be saved, sanitised and converted back to application context; and conversely on return. That is hairy, and is why it is not possible to handle interrupts generated within system calls on many systems.

But that is not directly Python's problem. What is, is that the code gets interrupted at an unpredictable place, and the registers and other state may not be consistent as far as the language run-time system and Python are concerned. It is critical (a) that a sane state is restored before calling the handler and (b) that calling the handler neither relies on nor disturbs any of the "in flight" actions in the interrupted code.
To cut a long story short, it is impractical for a language run-time system to call user-defined handlers with any degree of reliability unless the compiled code and run-time interoperate carefully - I have been there and done that many times, but few people still working have. On architectures with out-of-order execution (and interrupts), you have to assume that an interrupt may occur anywhere, even when the code does not use the relevant facility. Floating-point overflow in the middle of a list insertion? That's to be expected.

It becomes considerably easier if the (run-time system) interrupt handler merely needs to flag or count interrupts, as it can use a minimal handler which is defensive and non-intrusive. Even that is a pretty fair nightmare, as many systems temporarily corrupt critical registers when they think that it is safe - and few think of interrupts when deciding that it is.

So, in summary, please DON'T produce a design that relies on trapping floating-point exceptions and passing control to a Python function. This is several times harder than implementing fpectl.
Re: [Python-Dev] Strategy for converting the decimal module to C
James Y Knight <[EMAIL PROTECTED]> wrote:
> > To cut a long story short, it is impractical for a language
> > run-time system to call user-defined handlers with any degree of
> > reliability unless the compiled code and run-time interoperate
> > carefully - I have been there and done that many times, but few
> > people still working have. On architectures with out-of-order
> > execution (and interrupts), you have to assume that an interrupt
> > may occur anywhere, even when the code does not use the relevant
> > facility. Floating-point overflow in the middle of a list
> > insertion? That's to be expected.
>
> While this _is_ a real problem, it is _not_ a general problem as you
> are describing it. Processors are perfectly capable of generating
> precise interrupts, and the inability to do so has nothing to do
> with the out-of-order execution, etc. Almost all interrupts are
> precise.

I am sorry, but this is almost totally wrong, though I agree that you will get that impression upon reading the architecture books unless you are very deeply into that area. Let's skip the hardware issues, as they aren't what I am talking about (though see [*]).

I am referring to the interaction between the compiled code, deep library functions and run-time interrupt handler. It is almost universal for some deep library functions and common for compiled code to leave data structures inconsistent in a short window that "cannot possibly fail" - indeed, most system interfaces do this around system calls. If an interrupt occurs then, the run-time system will receive control with those data structures in a state where they must not be accessed. And it is fairly common for such data structures to include ones critical to the functioning of the run-time system.

Now, it IS possible to write run-time systems that are safe against this, and still allow asynchronous interrupts, but I am one of three people in the world that I know have done it in the past two decades.
There may be as many as six, but I doubt more, and I know of no such implementation on any Unix or Microsoft system. It is even possible to do this for compiled code, but that is where the coordination between the compiler and run-time system comes in.

> The only interesting one which is not, on x86 processors, is the x87
> floating point exception, ...

Er, no. Try a machine-check in a TLB miss handler. But it is all pretty irrelevant, as the problem arises with asynchronous exceptions (e.g. timer interrupts, signals from other processes), anyway.

> Also, looking forward, the "simd" floating point instructions (ie
> mmx/sse/sse2/sse3) _do_ ...

The critical problems with the x87 floating-point exception were resolved in the 80386.

[*] Whether or not it is a fundamental problem, it is very much a general problem at present, and it will become more so as more CPUs implement micro-threading. For why it is tied up with out-of-order execution etc., consider a system with 100 operations in flight, of which 10 are memory accesses, and then consider what happens when you have combinations of floating-point exceptions, TLB misses, machine-checks (e.g. ECC problems on memory) and device/timer interrupts. Once you add user-defined handlers into that mix, you either start exposing that mess to the program or have to implement them by stopping the CPU, unwinding the pipeline, and rerunning in very, very serial mode until the handler is called. Look at IA64.
Re: [Python-Dev] Strategy for converting the decimal module to C
Greg Ewing <[EMAIL PROTECTED]> wrote:
> But we weren't talking about asynchronous exceptions, we were
> talking about floating point exceptions. Unless your TLB miss
> handler uses floating point arithmetic, there's no way it can get
> interrupted by one. (And if it does use floating point arithmetic
> in a way that can cause an exception, you'd better write it to deal
> with that!)

I am really not getting my message across, am I? Yes, that is true - as far as it goes. The trouble is that designing systems based on assuming that IS true as far as it goes means that they don't work when it goes further. And it does. Here are a FEW of the many examples of where the simplistic model is likely to fail in an x86 context:

The compiled code has made a data structure temporarily inconsistent because the operation is safe (say, list insertion), and then gets an asynchronous interrupt (e.g. SIGINT). The SIGINT handler does some operation (e.g. I/O) that implicitly uses floating-point, which then interrupts.

The x86 architecture is extended to include out-of-order floating-point, as it had in the past, as many systems have today, and as is very likely to happen in the future. It is one of the standard ways to get better performance, after all, and is on the increase.

The x86 architecture is extended to support micro-threading. I have not been told by Intel or AMD that either has such plans, but I have very good reason to believe that both have such projects. IBM and Sun certainly do, though I don't know if IBM's is/are relevant.
Re: [Python-Dev] Rounding float to int directly ...
"M.-A. Lemburg" <[EMAIL PROTECTED]> wrote:
> You often have a need for controlled rounding when doing financial
> calculations or in situations where you want to compare two floats
> with a given accuracy, e.g. to work around rounding problems ;-)

The latter is a crude hack, and was traditionally used to save cycles when floating-point division was very slow. There are better ways, and have been for decades.

> Float formatting is an entirely different issue.

Not really. You need controlled rounding to a fixed precision in the other base. But I agree that controlled rounding in binary does not help with controlled rounding in decimal.
Re: [Python-Dev] Rounding float to int directly ...
Aahz <[EMAIL PROTECTED]> wrote:
> On Tue, Aug 01, 2006, M.-A. Lemburg wrote:
> >
> > You often have a need for controlled rounding when doing financial
> > calculations or in situations where you want to compare two floats
> > with a given accuracy, e.g. to work around rounding problems ;-)
> >
> > The usual approach is to use full float accuracy throughout the
> > calculation and then apply rounding at certain key places.
>
> That's what Decimal() is for.

Well, maybe. There are other approaches, too, and Decimal has its problems with that. In particular, when people need precisely defined decimal rounding, they ALWAYS need fixed-point and not floating-point.

> (Note that I don't care all that much about round(), but I do think
> we want to canonicalize Decimal() for all purposes in standard
> Python where people care about accuracy. Additional float features
> can go into NumPy.)

Really? That sounds like dogma, not science. Decimal doesn't even help people who care about accuracy. At most (and with the reservation mentioned above), it means that you can map external decimal formats to internal ones without loss of precision. Not a big deal, as there ARE no requirements for doing that for floating-point, and there are plenty of other solutions for fixed-point. People who care about the accuracy of calculations prefer binary, as it is a more accurate model. That isn't a big deal, either.
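The fixed-point style of decimal rounding being contrasted here can be expressed with the standard decimal module by quantizing results onto a fixed grid of decimal places (a sketch; quantize and ROUND_HALF_EVEN are the stdlib decimal API, the helper name to_money is mine):

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")

def to_money(value):
    """Force a Decimal onto a fixed two-decimal-place grid."""
    return value.quantize(CENT, rounding=ROUND_HALF_EVEN)

# Carry full precision through the calculation, then round at a key place:
net = Decimal("19.99")
vat_rate = Decimal("1.175")
gross = to_money(net * vat_rate)   # 19.99 * 1.175 = 23.48825 exactly
```

This is effectively fixed-point discipline imposed on a floating-point type: the programmer, not the arithmetic, decides where rounding happens.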
Re: [Python-Dev] Rounding float to int directly ...
Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>
> Hogwash. The only issues with decimal are ease-of-use and speed.

I suggest that you get hold of a good 1960s or 1970s book on computer arithmetic, and read up about "wobbling precision". While it is not a big deal, it was regarded as such, and is important enough to cause significant numerical problems to the unwary - which means 99.99% of modern programmers :-(

And, as I am sure that Aahz could point out, there are significant reliability issues concerned with frequent base changes where any loss of precision is unacceptable. Yes, it can always be done, but only a few people are likely to do it correctly in all cases.

Regards, Nick Maclaren
Re: [Python-Dev] Rounding float to int directly ...
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> You should NOT be using binary floats for money
> in the first place.

Or floating-point at all, actually. But binary floating-point is definitely unsuited for such a use.

> Pseudo-rounding to decimal places is not
> the right way to do that. The right way is
> to compare the difference to a tolerance.

Right. Where the tolerance should be a combination of relative and absolute accuracy. 1.0e-300 should usually be 'similar' to 0.0.

Simon Burton <[EMAIL PROTECTED]> wrote:
>
> It's not even clear to me that int(round(x)) is always the
> nearest integer to x.

There is a sense in which this is either true or overflow occurs.

> Is it always true that float(some_int)>=some_int ? (for positive values).
>
> (ie. I am wondering if truncating the float representation
> of an int always gives back the original int).

No. Consider 'standard' Python representations on a 64-bit system. There are only 53 bits in the mantissa, but an integer can have up to 63. Very large integers need to be rounded, and can be rounded up or down.

Please note that I am not arguing against an int_rounded() function. There is as much reason to want one as an int_truncated() one, but there is no very good reason to want more than one of the two. int_expanded() [i.e. ceiling] is much less useful.

For people interested in historical trivia, the dominance of the truncating form of integer conversion over the rounding form seems to be yet another side-effect of the Fortran / IBM 370 dominance over the Algol / other hardware, despite the fact that most modern languages are rooted in CPL rather than Fortran. I am unaware of any technical grounds to prefer one over the other (i.e. the reasons for wanting each are equally balanced). It all comes down to the simple question "Do we regard a single primitive for int(round()) as important enough to provide?"
I abstain :-)

Regards, Nick Maclaren
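The 53-bit point answers Simon Burton's question directly; a small demonstration, assuming IEEE 754 doubles (true on all common platforms):

```python
# A double has a 53-bit significand, so 63-bit integers cannot be exact.
big = 2**63 - 1                     # needs 63 bits
assert float(big) != big            # rounds UP to 2.0**63 here,
assert int(float(big)) == 2**63     # so the round trip *overshoots* big

# 2**53 + 1 is the first integer that fails to round-trip at all:
assert float(2**53) == 2**53
assert float(2**53 + 1) == float(2**53)   # the tie rounds back down
```

So float(some_int) >= some_int does hold for this particular value, but only because it rounded up; other values round down, which is exactly the "can be rounded up or down" point above.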
Re: [Python-Dev] Rounding float to int directly ...
Michael Chermside <[EMAIL PROTECTED]> wrote:
>
> > Decimal doesn't even help people who care about accuracy.
>
> Not true! The float class is incapable of maintaining 700 digits of
> precision, but Decimal handles it just fine. (Why you would WANT
> more than around 53 bits of precision is a different question, but
> Decimal handles it.)

Oh, yes, the CURRENT decimal class is potentially more accurate than the CURRENT floating class, but that has nothing to do with the intrinsic differences in the base.

> > People who care about the accuracy of calculations prefer binary,
> > as it is a more accurate model.
>
> Huh? That doesn't even make sense! A model is not inherently accurate
> or inaccurate, it is only an accurate or inaccurate representation
> of some "real" system. Neither binary nor decimal is a better
> representation of either rational or real numbers, the first
> candidates for "real" system I thought of. Financial accounting rules
> tend to be based on paper-and-pencil calculations for which
> decimal is usually a far better representation.
>
> If you had said that binary floats squeeze out more digits of
> precision per bit of storage than decimal floats, or that binary
> floats are faster because they are supported by specialized hardware,
> then I'd go along, but they're not a "better model".

No, that isn't true. The "wobbling precision" effect may be overstated, but is real, and gets worse the larger the base is. To the best of my knowledge, that is almost the only way in which binary is more accurate than decimal, in absolute terms, and it is a marginal difference. Note that I said "prefer", not "require" :-)

For example, calculating the relative difference between two close numbers is sensitive to whether you are using the numbers in their normal or inverse forms (by a factor of N in base N), and this is a common cause of incorrect answers. A factor of 2 is better than one of 10.
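The "wobbling precision" effect can be made concrete in today's Python (math.ulp needs 3.9+; the 4-digit decimal is a toy format chosen for illustration): the relative gap between adjacent representable numbers varies by up to a factor of the base across each decade or octave.

```python
import math
from decimal import Decimal

# Binary doubles: the absolute gap is constant across [1, 2), so the
# *relative* gap shrinks by (almost) a factor of 2 across the range:
top = 1.9999999999999998        # largest double below 2.0
wobble_bin = (math.ulp(1.0) / 1.0) / (math.ulp(top) / top)

# A toy 4-significant-digit decimal: the gap is 0.001 across [1, 10),
# so the relative gap varies by (almost) a factor of 10:
ulp4 = Decimal("0.001")
wobble_dec = (ulp4 / Decimal("1.000")) / (ulp4 / Decimal("9.999"))

print(wobble_bin, wobble_dec)   # just under 2 versus just under 10
```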
Regards, Nick Maclaren
Re: [Python-Dev] Rounding float to int directly ...
James Y Knight <[EMAIL PROTECTED]> wrote:
>
> I'd be happy to see floats lose their __int__ method entirely,
> replaced by an explicit truncate function.

Come back Algol - all is forgiven :-) Yes, indeed. I have favoured that view for 35 years - anything that can lose information quietly should be explicit.

[EMAIL PROTECTED] (Christian Tanzer) wrote:
> Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> > What's the feeling about this? If, e.g. int()
> > were changed in Py3k to round instead of truncate,
> > would it cause anyone substantial pain?
>
> Gratuitous breakage!
>
> I shudder at the thought of checking hundreds of int-calls to see if
> they'd still be correct under such a change.

My experience of doing that when compilers sometimes did one and sometimes the other is that such breakages are rarer than the conversions to integer that are broken with both rules! And both are rarer than the code that works with either rule. However, a 5% breakage rate is still enough to be of concern.

Regards, Nick Maclaren
Re: [Python-Dev] Rounding float to int directly ...
Ka-Ping Yee <[EMAIL PROTECTED]> wrote:
>
> That's my experience as well. In my opinion, the purpose of round()
> is most commonly described as "to make an integer". So it should
> yield an integer.

Grrk. No, that logic is flawed. There are algorithms where the operation of rounding (or truncation) is needed, but where the value may be larger than can be held in an integer, and that is not an error. If the only rounding or truncation primitive converts to an integer, those algorithms are unimplementable. You need at least one primitive that converts a float to an integer, held as a float.

Which is independent of whether THIS particular facility should yield an integer or float!

Regards, Nick Maclaren
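A hedged sketch of such a primitive ("an integer, held as a float"); `trunc_to_float` is a hypothetical name, and float floor division is used here only because it happens to keep the result a float:

```python
import math

def trunc_to_float(x):
    # Hypothetical primitive: truncate toward zero, but keep the result
    # a float -- no detour through an arbitrary-precision integer.
    return math.copysign(abs(x) // 1.0, x)

assert trunc_to_float(2.7) == 2.0
assert trunc_to_float(-2.7) == -2.0

big = 1e300                         # far beyond any C long
assert trunc_to_float(big) == big   # already integral: the ulp exceeds 1
assert isinstance(trunc_to_float(big), float)
```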
Re: [Python-Dev] Rounding float to int directly ...
Ronald Oussoren <[EMAIL PROTECTED]> wrote:
>
> > There are algorithms where the operation of rounding (or truncation)
> > is needed, but where the value may be larger than can be held in an
> > integer, and that is not an error.
>
> Is that really true for python? Python integers are unbounded in
> magnitude, they are not the same as int or long in C, therefore any
> float except exceptional values like NaN can be converted to an
> integer value. The converse is not true, python integers can contain
> values that are larger than any float (aka C's double).

It depends a great deal on what you mean by a Python integer! Yes, I was assuming the (old) Python model, where it is a C long, but so were many (most?) of the other postings. If you are assuming the (future?) model, where there is a single integer type of unlimited size, then that is true. There is still an efficiency point, in that such algorithms positively don't want a float value like 1.0e300 (or, worse, 1.0e4000) expanded to its full decimal representation as an intermediate step.

Whatever. There should still be at least one operation that rounds or truncates a float value, returning a float of the same type, on either functionality or efficiency grounds. I and most programmers of such algorithms don't give a damn which it does, provided that it is clearly documented, at least half-sane and doesn't change with versions of Python.

Regards, Nick Maclaren
Re: [Python-Dev] Dicts are broken ...
Michael Hudson <[EMAIL PROTECTED]> wrote:
>
> I'd say it's more to do with __eq__. It's a strange __eq__ method
> that raises an Exception, IMHO.

Not entirely. Any type that supports invalid values (e.g. IEEE 754) and is safe against losing the invalid state by accident needs to raise an exception on A == B. IEEE 754 is not safe.

> Please do realize that the motivation for this change was hours and
> hours of tortuous debugging caused by a buggy __eq__ method making keys
> "impossibly" seem to not be in dictionaries.

Quite. Been there - been caught by that. It is a catastrophic (but very common) misdesign to conflate failure and the answer "no". There is a fundamental flaw of that nature in card-based banking, that I pointed out was insoluble to the Grid people, and then got caught by just a month later!

Regards, Nick Maclaren
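A small illustration of both halves of this exchange (the `Guarded` class is invented for the example): an __eq__ that raises on cross-type comparison, and a dict lookup that trips over it. Under the behaviour being discussed, the exception propagates instead of being conflated with "key not found".

```python
class Guarded:
    # An __eq__ that refuses cross-type comparison, as a type guarding
    # against silently losing an "invalid value" state might.
    def __init__(self, key):
        self.key = key
    def __hash__(self):
        return hash(self.key)
    def __eq__(self, other):
        if not isinstance(other, Guarded):
            raise TypeError("refusing to compare Guarded with %r" % (other,))
        return self.key == other.key

d = {Guarded(1): "a"}
assert d[Guarded(1)] == "a"     # same-type lookup works

try:
    1 in d                      # hashes collide, so __eq__ is consulted
    outcome = "swallowed"       # failure conflated with the answer "no"
except TypeError:
    outcome = "propagated"      # the behaviour the change argues for
```

On current CPython, `outcome` ends up as "propagated": the dict no longer hides the exception.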
Re: [Python-Dev] gcc 4.2 exposes signed integer overflows
"Tim Peters" <[EMAIL PROTECTED]> wrote: > > This is a wrong time in the release process to take on chance on > discovering a flaky LONG_MIN on some box, so I want to keep the code > as much as possible like what's already there (which worked fine for > > 10 years on all known boxes) for now. No, it didn't. I reported a bug a couple of years back. A blanket rule not to use symbols is clearly wrong, but there are good reasons not to want to rely on LONG_MIN (or INT_MIN for that matter). Because of some incredibly complex issues (which I only know some of), it hasn't been consistently -(1+LONG_MAX) on twos' complement machines. There are good reasons for making it -LONG_MAX, but they aren't the ones that actually cause it to be so. There are, however, very good reasons for using BOTH tests. I.e. if I have a C system which defines LONG_MIN to be -LONG_MAX because it uses -(1+LONG_MAX) for an integer NaN indicator in some contexts, you really DON'T want to create such a value. I don't know if there are any such C systems, but there have been other languages that did. I hope that Guido wasn't saying that Python should EVER rely on signed integer overflow wrapping in twos' complement. Despite the common usage, Java and all that, it is perhaps the worst systematic architectural change to have happened in 30 years, and accounts for a good 30% of all hard bugs in many classes of program. Simple buffer overflow is fairly easy to avoid by good programming style; integer overflow causing trashing of unpredictable data isn't. Any decent programming language (like Python!) regards integer overflow as an error, and the moves to make C copy Java semantics are yet another step away from software engineering in the direction of who-gives-a-damn hacking. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. 
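The asymmetry behind the LONG_MIN caveats can be shown from Python, where the arithmetic cannot overflow (the helper function is purely illustrative):

```python
def twos_complement_bounds(bits):
    # The usual two's-complement convention: one more negative value
    # than positive ones, so LONG_MIN == -(1 + LONG_MAX).
    long_max = 2 ** (bits - 1) - 1
    long_min = -(1 + long_max)
    return long_min, long_max

lmin, lmax = twos_complement_bounds(64)
assert lmax == 9223372036854775807
assert lmin == -9223372036854775808
assert -lmin == lmax + 1    # harmless in Python; undefined overflow in C
```

The last line is the whole problem in miniature: in C, negating LONG_MIN on such a machine is undefined behaviour, which is why code that tests only against -LONG_MAX, or only against LONG_MIN, can be wrong on some platform.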
Re: [Python-Dev] Signals, threads, blocking C functions
"Gustavo Carneiro" <[EMAIL PROTECTED]> wrote: > > We have to resort to timeouts in pygtk in order to catch unix signals > in threaded mode. A common defect of modern designs - TCP/IP is particularly objectionable in this respect, but that battle was lost and won over two decades ago :-( > The reason is this. We call gtk_main() (mainloop function) which > blocks forever. Suppose there are threads in the program; then any > thread can receive a signal (e.g. SIGINT). Python catches the signal, > but doesn't do anything; it simply sets a flag in a global structure > and calls Py_AddPendingCall(), and I guess it expects someone to call > Py_MakePendingCalls(). However, the main thread is blocked calling a > C function and has no way of being notified it needs to give control > back to python to handle the signal. Hence, we use a 100ms timeout > for polling. Unfortunately, timeouts needlessly consume CPU time and > drain laptop batteries. Yup. > According to [1], all python needs to do to avoid this problem is > block all signals in all but the main thread; then we can guarantee > signal handlers are always called from the main thread, and pygtk > doesn't need a timeout. 1) That page is password protected, so I can't see what it says, and am disinclined to register myself to yet another such site. 2) No way, Jose, anyway. The POSIX signal handling model was broken beyond redemption, even before threading was added, and the combination is evil almost beyond belief. That procedure is good practice, yes, but that is NOT all that you have to do - it may be all that you CAN do, but that is not the same. Come back MVS (or even VMS) - all is forgiven! That is only partly a joke. > Another alternative would be to add a new API like > Py_AddPendingCallNotification, which would let python notify > extensions that new pending calls exist and need to be processed. Nope. Sorry, but you can't solve a broken design by adding interfaces. 
> But I would really prefer the first alternative, as it could be
> fixed within python 2.5; no need to wait for 2.6.

It clearly should be done, assuming that Python's model is that it doesn't want to get involved with subthread signalling (and I really, but REALLY, recommend not doing so). The best that can be done is to say that all signal handling is the business of the main thread and that, when the system bypasses that, all bets are off.

> Please, let's make Python ready for the enterprise! [2]

Given that no Unix variant or Microsoft system is, isn't that rather an unreasonable demand? I am probably one of the last half-dozen people still employed in a technical capacity who has implemented run-time systems that supported user-level signal handling with threads/asynchronicity and allowing for signals received while in system calls. It would be possible to modify/extend POSIX or Microsoft designs to support this, but currently they don't make it possible. There is NOTHING that Python can do but to minimise the chaos.

Regards, Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
"Gustavo Carneiro" <[EMAIL PROTECTED]> wrote: > > Oh, sorry, here's the comment: > > (coment by Arjan van de Ven): > | afaik the kernel only sends signals to threads that don't have them blocked. > | If python doesn't want anyone but the main thread to get signals, it > should just > | block signals on all but the main thread and then by nature, all > signals will go > | to the main thread Well, THAT'S wrong, I am afraid! Things ain't that simple :-( Yes, POSIX implies that things work that way, but there are so many get-out clauses and problems with trying to implement that specification that such behaviour can't be relied on. > Well, Python has a broken design too; it postpones tasks and expects > to magically regain control in order to finish the job. That often > doesn't happen! Very true. And that is another problem with POSIX :-( > Python is halfway there; it assumes signals are to be handled in the > main thread. However, it _catches_ them in any thread, sets a flag, > and just waits for the next opportunity when it runs again in the main > thread. It is precisely this "split handling" of signals that is > failing now. I agree that is not how to do it, but that code should not be removed. Despite best attempts, there may well be circumstances under which signals are received in a subthread, despite all attempts of the program to ensure that the main thread gets them. > Anyway, attached a patch that should fix the problem in posix > threads systems, in case anyone wants to review. Not "fix" - "improve" :-) I haven't looked at it, but I agree that what you have said is the way to proceed. The best solution is to enable the main thread for all relevant signals, disable all subthreads, but to not rely on any of that working in all cases. It won't help with the problem where merely receiving a signal causes chaos, or where blocking them does so, but there is nothing that Python can do about that, in general. 
Regards, Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
"Gustavo Carneiro" <[EMAIL PROTECTED]> wrote: > > That's a very good point; I wasn't aware that child processes > inherited the signals mask from their parent processes. That's one of the few places where POSIX does describe what happens. Well, usually. You really don't want to know what happens when you call something revolting, like csh or a setuid program. This particular mess is why I had to write my own nohup - the new POSIX interfaces broke the existing one, and it remains broken today on almost all systems. > I am now thinking of something along these lines: > typedef void (*PyPendingCallNotify)(void *user_data); > PyAPI_FUNC(void) Py_AddPendingCallNotify(PyPendingCallNotify callback, > void *user_data); > PyAPI_FUNC(void) Py_RemovePendingCallNotify(PyPendingCallNotify > callback, void *user_data); Why would that help? The problems are semantic, not syntactic. Anthony Baxter isn't exaggerating the problem, despite what you may think from his posting. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Signals, threads, blocking C functions
Chris McDonough <[EMAIL PROTECTED]> wrote:
>
> Would adding an API for sigprocmask help here?

No. sigprocmask is a large part of the problem.

Regards, Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
"Gustavo Carneiro" <[EMAIL PROTECTED]> wrote: > > You guys are tough customers to please. I am just trying to solve a > problem here, not create a new one; you have to believe me. Oh, I believe you. Look at it this way. You are trying to resolve the problem that your farm is littered with cluster bombs, and your cows keep blowing their legs off. Your solution is effectively saying "well, let's travel around and pick them all up then". > We want to get rid of timeouts. Now my idea: add a Python API to say: > "dear Python, please call me when you start having pending calls, > even if from a signal handler context, ok?" Yes, I know. I have been there and done that, both academically and (observing, as a consultant) to the vendor. And that was on a system that was a damn sight better engineered than any of the main ones that Python runs on today. I have attempted to do much EASIER tasks under both Unix and (earlier) versions of Microsoft Windows, and failed dismally because the system wasn't up to it. > From that point on, signals will get handled by Python, python calls > PyGTK, PyGTK calls a special API to safely wake up the main loop even > from a thread or signal handler, then main loop checks for signal by > calling PyErr_CheckSignals(), it is handled by Python, and the process > lives happily ever after, or die trying. The first thing that will happen to that beautiful theory when it goes out into Unix County or Microsoft City is that a gang of ugly facts will find it and beat it into a pulp. > I sincerely hope my explanation was satisfactory this time. Oh, it was last time. It isn't that that is the problem. > Are signal handlers guaranteed to not be interrupted by another > signal, at least? What about threads? No and no. In theory, what POSIX says about blocking threads should be reliable; in my experience, it almost is, except under precisely the circumstances that you most want it to work. Look, I am agreeing that your basic design is right. 
What I am saying is that (a) you cannot make delivery reliable and abolish timeouts and (b) that it is such a revoltingly system-dependent mess that I would much rather Python didn't fiddle with it.

Do you know how signalling is misimplemented at the hardware level? And that it is possible for a handler to be called with any of its critical pointers (INCLUDING the global code and data pointers) in undefined states? Do you know how to program round that sort of thing? I can answer "yes" to all three - for my sins, which must be many and grievous, for that to be the case :-(

Regards, Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> On Mon, 04 Sep 2006 17:24:56 +0100, David Hopwood <[EMAIL PROTECTED]der.co.uk> wrote:
> >Jean-Paul Calderone wrote:
> >> PyGTK would presumably implement its pending call callback by writing a
> >> byte to a pipe which it is also passing to poll().
> >
> >But doing that in a signal handler context invokes undefined behaviour
> >according to POSIX.
>
> write(2) is explicitly listed as async-signal safe in IEEE Std 1003.1, 2004.
> Was this changed in a later edition? Otherwise, I don't understand what you
> mean by this.

Try looking at the C90 or C99 standard, for a start :-( NOTHING may safely be done in a real signal handler, except possibly setting a value of type static volatile sig_atomic_t. And even that can be problematic. And note that POSIX defers to C on what the C language defines. So, even if the function is async-signal-safe, the code that calls it can't be!

POSIX's lists are complete fantasy, anyway. Look at the one that defines thread-safety, and then try to get your mind around what exit being thread-safe actually implies (especially with regard to atexit functions).

Regards, Nick Maclaren
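The byte-to-a-pipe scheme being debated is the classic "self-pipe trick". A hedged Python sketch of its structure follows; note that CPython runs Python-level handlers synchronously in the interpreter's main thread, so this only models the shape of the C-level mechanism under discussion (modern Python exposes the real thing as signal.set_wakeup_fd):

```python
import os
import select
import signal

# Event-loop wakeup via the self-pipe trick: the handler writes one
# byte; the loop polls the read end instead of using a periodic timeout.
rfd, wfd = os.pipe()
os.set_blocking(wfd, False)     # a full pipe must not block the handler

def wakeup(signum, frame):
    try:
        os.write(wfd, b"\0")
    except BlockingIOError:
        pass                    # a wakeup byte is already pending

signal.signal(signal.SIGUSR1, wakeup)

os.kill(os.getpid(), signal.SIGUSR1)    # simulate a signal arriving
ready, _, _ = select.select([rfd], [], [], 5.0)
assert ready == [rfd] and os.read(rfd, 1) == b"\0"
```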
Re: [Python-Dev] Signals, threads, blocking C functions
Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
>
> Thanks for expounding. Given that it is basically impossible to do
> anything useful in a signal handler according to the relevant standards
> (does Python's current signal handler even avoid relying on undefined
> behavior?), how would you suggest addressing this issue?

Much as you are doing, and I described, but the first step would be to find out what 'most' Python people need for signal handling in threaded programs. This is because there is an unavoidable conflict between portability/reliability and functionality.

I would definitely block all signals in threads, except for those that are likely to be generated ON the thread (SIGFPE etc.) It is a very good idea not to touch the handling of several of those, because doing so can cause chaos.

I would have at least two 'standard' handlers, one of which would simply set a flag and return, and the other of which would abort. Now, NEITHER is a very useful specification, but providing ANY information is risky, which is why it is critical to know what people need.

I would not TRUST the blocking of signals, so would set up handlers even when I blocked them, and would do the minimum fiddling in the main thread compatible with decent functionality. I would provide a call to test if the signal flag was set, and another to test and clear it. This would be callable ONLY from the main thread, and that would be checked. It is possible to do better, but that starts needing serious research.

> It seems to me that it is actually possible to do useful things in a
> signal handler, so long as one accepts that doing so is relying on
> platform specific behavior.

Unfortunately, that is wrong. That was true under MVS and VMS, but in Unix and Microsoft systems, the problem is that the behaviour is both platform and circumstance-dependent. What you can do reliably depends mostly on what is going on at the time.
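The set-a-flag handler plus main-thread-only test-and-clear call proposed above can be sketched as follows (all names here are invented for illustration, not a proposed API):

```python
import os
import signal
import threading
import time

_pending = False

def _note_signal(signum, frame):
    # The first 'standard' handler: set a flag and return, nothing more.
    global _pending
    _pending = True

def test_and_clear_signal():
    # Callable ONLY from the main thread, and that is checked.
    if threading.current_thread() is not threading.main_thread():
        raise RuntimeError("signal flag is owned by the main thread")
    global _pending
    was, _pending = _pending, False
    return was

signal.signal(signal.SIGUSR1, _note_signal)

os.kill(os.getpid(), signal.SIGUSR1)        # POSIX only: signal ourselves
time.sleep(0.05)                            # let the handler run
assert test_and_clear_signal() is True      # one signal arrived
assert test_and_clear_signal() is False     # flag was cleared
```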
For example, on many Unix and Microsoft platforms, signals received while you are in the middle of certain functions or system calls, or certain particular signals (often SIGFPE), call the C handler with a bad set of global pointers or similar. I believe that this is one of the reasons (perhaps the main one) that some such failures so often cause debuggers to be unable to find the stack pointer. I have tracked a few of those down, and have occasionally identified the cause (and even got it fixed!), but it is a murderous task, and I know of few other people who have ever succeeded.

> How hard would it be to implement this for the platforms Python supports,
> rather than for a hypothetical standards-exact platform?

I have seen this effect on OSF/1, IRIX, Solaris, Linux and versions of Microsoft Windows. I have never used a modern BSD, haven't used HP-UX since release 9, and haven't used Microsoft systems seriously in years (though I did hang my new laptop in its GUI fairly easily). As I say, this isn't so much a platform issue as a circumstance one.

Regards, Nick Maclaren
[Python-Dev] Cross-platform math functions?
Andreas Raab <[EMAIL PROTECTED]> wrote:
>
> I'm curious if there is any interest in the Python community to achieve
> better cross-platform math behavior. A quick test[1] shows a
> non-surprising difference between the platform implementations.
> Question: Is there any interest in changing the behavior to produce
> identical results across platforms (for example by utilizing fdlibm
> [2])? Since I have need for a set of cross-platform math functions I'll
> probably start with a math-compatible fdlibm module (unless somebody has
> done that already ;-)
>
> [1] Using Python 2.4:
> >>> import math
> >>> math.cos(1.0e32)
>
> WinXP: -0.39929634612021897
> LinuxX86: -0.49093671143542561

Well, I hope not, but I am afraid that there is :-( The word "better" is emotive and inaccurate. Such calculations are numerically meaningless, and merely encourage the confusion between consistency and correctness. There is a strong sense in which giving random results between -1 and 1 would be better.

Now, I am not saying that you don't have a requirement for consistency, but I am saying that confusing it with correctness (as has been fostered by IEEE 754, Java etc.) is harmful. One of the great advantages of the wide variety of arithmetics available in the 1970s is that numerical testing was easier and more reliable - if you got wildly different results on two platforms, you got a strong pointer to numerical problems. That viewpoint is regarded as heresy nowadays, but used not to be!

Regards, Nick Maclaren
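Why cos(1.0e32) is "numerically meaningless" can be quantified in a few lines (math.ulp needs Python 3.9+): the gap to the next representable double at that magnitude spans quadrillions of full periods of the cosine, so the two platform answers quoted above are equally defensible.

```python
import math

x = 1.0e32
gap = math.ulp(x)                       # roughly 1.8e16 at this magnitude
periods_per_ulp = gap / (2 * math.pi)   # full cosine periods per ulp

# Any value in [-1, 1] is an equally "correct" answer for cos(x):
assert periods_per_ulp > 1e15
```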
Re: [Python-Dev] Signals, threads, blocking C functions
"Adam Olsen" <[EMAIL PROTECTED]> wrote: > On 9/4/06, Gustavo Carneiro <[EMAIL PROTECTED]> wrote: > > > Now, we've had this API for a long time already (at least 2.5 > > years). I'm pretty sure it works well enough on most *nix systems. > > Event if it works 99% of the times, it's way better than *failing* > > *100%* of the times, which is what happens now with Python. > > Failing 99% of the time is as bad as failing 100% of the time, if your > goal is to eliminate the short timeout on poll(). 1% is quite a lot, > and it would probably have an annoying tendency to trigger repeatedly > when the user does certain things (not reproducible by you of course). That can make it a lot WORSE that repeated failure. At least with hard failures, you have some hope of tracking them down in a reasonable time. The problem with exception handling code that goes off very rarely, under non-reproducible circumstances, is that it is almost untestable and that bugs in it are positive nightmares. I have been inflicted with quite a large number in my time, and have a fairly good success rate, but the number of people who know the tricks is decreasing. Consider the (real) case where an unpredictable process on a large server (64 CPUs) was failing about twice a week (detectably), with no indication of how many failures were giving wrong answers. We replaced dozens of DIMMs, took days of down time and got nowhere; it then went hard (i.e. one failure a day). After a week's total down time, with me spending 100% of my time on it and the vendor allocating an expert at high priority, we cracked it. We were very lucky to find it so fast. I could give you other examples that were/are there years and decades later, because the pain threshhold never got high enough to dedicate the time (and the VERY few people with experience). I know of at least one such problem in generic TCP/IP (i.e. 
on Linux, IRIX, AIX and possibly Solaris) that has been there for decades and causes occasional failure in most networked applications/protocols. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Signals, threads, blocking C functions
"Gustavo Carneiro" <[EMAIL PROTECTED]> wrote: > > Anyway, I was speaking hypothetically. I'm pretty sure writing to a > pipe is async signal safe. It is the oldest trick in the book, > everyone uses it. I don't have to see a written signed contract to > know that it works. Ah. Well, I can assure you that it's not the oldest trick in the book, and not everyone uses it. > This is all the evidence that I need. And again I reiterate that > whether or not async safety can be achieved in practice for all > platforms is not Python's problem. I wish you the joy of trying to report a case where it doesn't work to a large vendor and get them to accept that it is a bug. > Although I believe writing to a > pipe is 100% reliable for most platforms. Even if it is not, any > mission critical application relying on signals for correct behaviour > should be rewritten to use unix sockets instead; end of argument. Er, no. There are lots of circumstances where that isn't feasible, such as wanting to close down an application cleanly when the scheduler sends it a SIGXCPU. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Signals, threads, blocking C functions
Johan Dahlin <[EMAIL PROTECTED]> wrote:
>
> Are you saying that we should let less commonly used platforms dictate
> features and functionality for the popular ones?
> I mean, who uses HP/UX, SCO and [insert your favorite flavor] as a modern
> desktop system where this particular bug makes a difference?

You haven't been following the thread. As I posted, this problem occurs to a greater or lesser degree on all platforms. This will be my last posting on the topic, but I shall try to explain.

The first problem is in the hardware and operating system. A signal interrupts the thread, and passes control to a handler with a very partial environment and (usually) information on the environment when it was interrupted. If it interrupted the thread in the middle of a system call or other library routine that uses non-Python conventions, the registers and other state may be weird. There ARE solutions to this, but they are unbelievably foul, and even Linux on x86 has had trouble with this. And, on return, everything has to be reversed entirely transparently! It is VERY common for there to be bugs in the C run-time system and not rare for there to be ones in the kernel (that area of Linux has been rewritten MANY times, for this reason). In many cases, the run-time system simply doesn't pretend to handle interrupts in arbitrary code (which is where the C undefined behaviour is used by vendors).

The second problem is that what you can do depends both on what you were doing and how your 'primitive' is implemented. For example, if you call something that takes out even a very short term lock or uses a spin loop to emulate an atomic operation, you had better not use it if you interrupted code that was doing the same. Your thread may hang, crash or otherwise go bananas. Can you guarantee that even write is free of such things? No, and certainly not if you are using a debugger, a profiling library or even tracing system calls. I have often used programs that crashed as soon as I did one of those :-(

Related to this is that it is EXTREMELY hard to write synchronisation primitives (mutexes etc.) that are interrupt-safe - MUCH harder than to write thread-safe ones - and few people are even aware of the issues. There was a thread on some Linux kernel mailing list about this, and even the kernel developers were having headaches thinking about the issues.

Even if write is atomic, there are gotchas. What if the interrupted code is doing something to that file at the time? Are you SURE that an unexpected operation on it (in the same thread) won't cause the library function or program to get confused? And can you be sure that the write will terminate fast enough to not cause time-critical code to fail? And have you studied the exact semantics of blocking on pipes? They are truly horrible.

So this is NOT a matter of platform X is safe and platform Y isn't. Even Linux x86 isn't entirely safe - or wasn't, the last time I heard.

Regards,
Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
I was hoping to have stopped, but here are a few comments. I agree with Jan Kanis. That is the way to tackle this one.

"Adam Olsen" <[EMAIL PROTECTED]> wrote:
>
> I don't think we should let this die, at least not yet. Nick seems to
> be arguing that ANY signal handler is prone to random crashes or
> corruption (due to bugs). However, we already have a signal handler,
> so we should already be exposed to the random crashes/corruption.

No. I am afraid that is a common myth and often catastrophic mistake. In this sort of area, NEVER assume that even apparently unrelated changes won't cause 'working' code to misbehave. Yes, Python is already exposed, but it would be easy to turn a very rare failure into a more common one. What I was actually arguing for was defensive programming.

> If we're going to rely on signal handling being correct then I think
> we should also rely on write() being correct. Note that I'm not
> suggesting an API that allows arbitrary signal handlers, but rather
> one that calls write() on an array of prepared file descriptors
> (ignoring errors).

For your interpretation of 'correct'. The cause of this chaos is that the C and POSIX standards are inconsistent, even internally, and they are wildly incompatible. So, even if things 'work' today, don't bet on the next release of your favourite system behaving the same way. It wouldn't matter if there was a de facto standard (i.e. a consensus), but there isn't.

> Ensuring modifications to that array are atomic would be tricky, but I
> think it would be doable if we use a read-copy-update approach (with
> two alternating signal handler functions). Not sure how to ensure
> there's no currently running signal handlers in another thread though.
> Maybe have to rip the atomic read/write stuff out of the Linux
> sources to ensure it's *always* defined behavior.

Yes. But even that wouldn't solve the problem, as that code is very gcc-specific.

> Looking into the existing signalmodule.c, I see no attempts to ensure
> atomic access to the Handlers data structure. Is the current code
> broken, at least on non-x86 platforms?

Well, at a quick glance at the actual handler (the riskiest bit):

1) It doesn't check the signal range - bad practice, as systems do sometimes generate wayward numbers.

2) Handlers[sig_num].tripped = 1; is formally undefined, but actually pretty safe. If that breaks, nothing much will work. It would be better to make the int sig_atomic_t, as you say.

3) is_tripped++; and Py_AddPendingCall(checksignals_witharg, NULL); will work only because the handler ignores all signals in subthreads (which is definitely NOT right, as the comments say). Despite the implication, the code of Py_AddPendingCall is NOT safe against simultaneous registration. It is just plain broken, I am afraid. The note starting "Darn" should be a LOT stronger :-) [ For example, think of two threads calling the function at exactly the same time, in almost perfect step. Oops. ]

I can't honestly promise to put any time into this in the foreseeable future, but will try (sometime). If anyone wants to tackle this, please ask me for comments/help/etc.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
"Jason Orendorff" <[EMAIL PROTECTED]> wrote: > > Anyway, this kind of static analysis is probably more entertaining > than relevant. ... Well, yes. One can tell that by the piffling little counts being bandied about! More seriously, yes, it is Well Known that 0.0 is the Most Common Floating-Point Number is most numerical codes; a lot of older (and perhaps modern) sparse matrix algorithms use that to save space. In the software floating-point that I have started to draft some example code but have had to shelve (no, I haven't forgotten) the values I predefine are Invalid, Missing, True Zero and Approximate Zero. The infinities and infinitesimals (a.k.a. signed zeroes) could also be included, but are less common and more complicated. And so could common integers and fractions. It is generally NOT worth doing a cache lookup for genuinely numerical code, as the common cases that are not the above rarely account for enough of the numbers to be worth it. I did a fair amount of investigation looking for compressibility at one time, and that conclusion jumped out at me. The exact best choice depends entirely on what you are doing. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Caching float(0.0)
"Terry Reedy" <[EMAIL PROTECTED]> wrote: > > For true floating point measurements (of temperature, for instance), > 'integral' measurements (which are an artifact of the scale used (degrees F > versus C versus K)) should generally be no more common than other realized > measurements. Not quite, but close enough. A lot of algorithms use a conversion to integer, or some of the values are actually counts (e.g. in statistics), which makes them a bit more likely. Not enough to get excited about, in general. > Thirty years ago, a major stat package written in Fortran (BMDP) required > that all data be stored as (Fortran 4-byte) floats for analysis. So a > column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps > 1.0/2.0. That skewed the distribution of floats. But Python and, I hope, > Python apps, are more modern than that. And SPSS and Genstat and others - now even Excel > Float caching strikes me a a good subject for cookbook recipies, but not, > without real data and a willingness to slightly screw some users, for the > default core code. Yes. It is trivial (if tedious) to add analysis code - the problem is finding suitable representative applications. That was always my difficulty when I was analysing this sort of thing - and still is when I need to do it! > Nick Craig-Wood <[EMAIL PROTECTED]> wrote: > > For my application caching 0.0 is by far the most important. 0.0 has > ~200,000 references - the next highest reference count is only about ~200. Yes. All the experience I have ever seen over the past 4 decades confirms that is the normal case, with the exception of floating-point representations that have a missing value indicator. Even in IEEE 754, infinities and NaN are rare unless the application is up the spout. There are claims that a lot of important ones have a lot of NaNs and use them as missing values but, despite repeated requests, none of the people claiming that have ever provided an example. 
There are some pretty solid grounds for believing that those claims are not based in fact, but are polemic. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Caching float(0.0)
=?iso-8859-1?Q?Kristj=E1n_V=2E_J=F3nsson?= <[EMAIL PROTECTED]> wrote:
>
> The total count of floating point numbers allocated at this point is 985794.
> Without the reuse, they would be 1317145, so this is a saving of 25%, and
> of 5Mb.

And, if you optimised just 0.0, you would get 60% of that saving at a small fraction of the cost and considerably greater generality. It isn't clear whether the effort justifies doing more.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
[EMAIL PROTECTED] wrote:
>
> Doesn't that presume that optimizing just 0.0 could be done easily? Suppose
> 0.0 is generated all over the place in EVE?

Yes, and it isn't, respectively! The changes in floatobject.c would be trivial (if tedious), and my recollection of my scan is that floating values are not generated elsewhere. It would be equally easy to add a general caching algorithm, but that would be a LOT slower than a simple floating-point comparison. The problem (in Python) isn't hooking the checks into place, though it could be if Python were implemented differently.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> >> The total count of floating point numbers allocated at this point is
> >> 985794.
> >> Without the reuse, they would be 1317145, so this is a saving of 25%, and
> >> of 5Mb.
> >
> > And, if you optimised just 0.0, you would get 60% of that saving at
> > a small fraction of the cost and considerably greater generality.
>
> As Michael Hudson observed, this is difficult to implement, though:
> You can't distinguish between -0.0 and +0.0 easily, yet you should.

That was the point of a previous posting of mine in this thread :-( You shouldn't, despite what IEEE 754 says, at least if you are allowing for either portability or numeric validation.

There are a huge number of good reasons why IEEE 754 signed zeroes fit extremely badly into any normal programming language and are seriously incompatible with numeric validation, but Python adds more. Is there any other type where there are two values that are required to be different, but where the hash of both is required to be zero and both are required to evaluate to False in truth value context?

Regards,
Nick Maclaren
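[Editorial aside: the oddity complained about is easy to exhibit on any IEEE 754 platform, which in practice means every mainstream CPython build:]

```python
import math

x, y = -0.0, 0.0

# The two values compare equal, hash identically (to zero) and are both
# falsy in truth-value context...
same_by_comparison = (x == y)
same_hash = (hash(x) == hash(y) == 0)
both_falsy = (not x) and (not y)

# ...and yet they are distinguishable values, e.g. via the sign bit:
signs_differ = math.copysign(1.0, x) != math.copysign(1.0, y)
```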
Re: [Python-Dev] Caching float(0.0)
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> Ah, you are proposing a semantic change, then: -0.0 will become
> unrepresentable, right?

Well, it is and it isn't. Python currently supports only some of IEEE 754, and that is more by accident than design - because that is exactly what C90 implementations do! There is code in floatobject.c that assumes IEEE 754, but Python does NOT attempt to support it in toto (it is not clear if it could), not least because it uses C90.

And, as far as I know, none of that is in the specification, because Python is at least in theory portable to systems that use other arithmetics, and there is no current way to distinguish -0.0 from 0.0 except by comparing their representations! And even THAT depends entirely on whether the C library distinguishes the cases, as far as I can see.

So distinguishing -0.0 from 0.0 isn't really in Python's current semantics at all. And, for reasons that we could go into, I assert that it should not be - which is NOT the same as not supporting branch cuts in cmath.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> py> x=-0.0
> py> y=0.0
> py> x,y

Nobody is denying that SOME C90 implementations distinguish them, but it is no part of the standard - indeed, a C90 implementation is permitted to use ANY criterion for deciding when to display -0.0 and 0.0. C99 is ambiguous to the point of internal inconsistency, except when __STDC_IEC_559__ is set to 1, though the intent is clear. And my reading of Python's code is that it relies on C's handling of such values.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
Alastair Houghton <[EMAIL PROTECTED]> wrote:
>
> AFAIK few systems have floating point traps enabled by default (in
> fact, isn't that what IEEE 754 specifies?), because they often aren't
> very useful.

The first two statements are true; the last isn't. They are extremely useful, not least because they are the only practical way to locate numeric errors in most 3GL programs (including C, Fortran etc.)

> And in the specific case of the Python interpreter, why
> would you ever want them turned on? Surely in order to get
> consistent floating point semantics, they need to be *off* and Python
> needs to handle any exceptional cases itself; even if they're on, by
> your argument Python must do that to avoid being terminated.

Grrk. Why are you assuming that turning them off means that the result is what you expect? That isn't always so - sometimes it merely means that you get wrong answers but no indication of that.

> or see if it can't turn them off using the C99 APIs.

That is a REALLY bad idea. You have no idea how broken that is, and what the impact on Python would be.

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
James Y Knight <[EMAIL PROTECTED]> wrote:
>
> This is a really poor argument. Python should be moving *towards*
> proper '754 fp support, not away from it. On the platforms that are
> most important, the C implementations distinguish positive and
> negative 0. That the current python implementation may be defective
> when the underlying C implementation is defective doesn't excuse a
> change to intentionally break python on the common platforms.

Perhaps you might like to think why only IBM POWERx (and NOT the Cell or most embedded POWERs) is the ONLY mainstream system to have implemented all of IEEE 754 in hardware after 22 years? Or why NO programming language has provided support in those 22 years, and only Java and C have even claimed to? See Kahan's "How Java's Floating-Point Hurts Everyone Everywhere", note that C99 is much WORSE, and then note that Java and C99 are the only languages that have even attempted to include IEEE 754.

You have also misunderstood the issue. The fact that a C implementation doesn't support it does NOT mean that the implementation is defective; quite the contrary. The issue always has been that IEEE 754's basic model is incompatible with the basic models of all programming languages that I am familiar with (which is a lot). And the specific problems with C99 are in the STANDARD, not the IMPLEMENTATIONS.

> IEEE 754 is so widely implemented that IMO it would make sense to
> make Python's floating point specify it, and simply declare floating
> point operations on non-IEEE 754 machines as "use at own risk, may
> not conform to python language standard". (or if someone wants to use
> a software fp library for such machines, that's fine too).

Firstly, see the above. Secondly, Python would need MAJOR semantic changes to conform to IEEE 754R. Thirdly, what would you say to the people who want reliable error detection on floating-point of the form that Python currently provides?

Regards,
Nick Maclaren
Re: [Python-Dev] Caching float(0.0)
On Wed, Oct 04, 2006 at 12:42:04AM -0400, Tim Peters wrote:
>
> > > If C90 doesn't distinguish -0.0 and +0.0, how can Python?
>
> > Can you give a simple example where the difference between the two
> > is apparent to the Python programmer?
>
> Perhaps surprisingly, many (well, comparatively many, compared to none
> ) people have noticed that the platform atan2 cares a lot:

Once upon a time, floating-point was used as an approximation to mathematical real numbers, and anything which was mathematically undefined in real arithmetic was regarded as an error in floating-point. This allowed a reasonable amount of numeric validation, because the main remaining discrepancy was that floating-point has only limited precision and range. Most of the numerical experts that I know of still favour that approach, and it is the one standardised by the ISO LIA-1, LIA-2 and LIA-3 standards for floating-point arithmetic. atan2(0.0,0.0) should be an error.

But C99 differs. While words do not fail me, they are inappropriate for this mailing list :-(

Regards,
Nick Maclaren
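[Editorial aside: the atan2 behaviour Tim alludes to is the standard place where the sign of zero becomes visible to pure Python. Under IEEE 754 / C99 Annex F semantics, which CPython inherits from the platform libm on mainstream systems, atan2 inspects the sign bit of both arguments:]

```python
import math

# atan2(y, x) picks its branch from the SIGNS of y and x, so the four
# pairwise-"equal" zero arguments land at four different angles:
a = math.atan2(0.0, 0.0)     #  0.0
b = math.atan2(0.0, -0.0)    #  pi
c = math.atan2(-0.0, 0.0)    # -0.0
d = math.atan2(-0.0, -0.0)   # -pi

# Under the LIA-style view argued for above, all four calls would instead
# be errors, since atan2(0, 0) is mathematically undefined.
```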
Re: [Python-Dev] Cloning threading.py using proccesses
"M.-A. Lemburg" <[EMAIL PROTECTED]> wrote: > > This is hard to believe. I've been in that business for a few > years and so far have not found an OS/hardware/network combination > with the mentioned features. Surely you must have - unless there is another M.-A. Lemburg in IT! Some of the specialist systems, especially those used for communication, were like that, and it is very likely that many still are. But they aren't currently in Python's domain. I have never used any, but have colleagues who have. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Cloning threading.py using proccesses
Josiah Carlson <[EMAIL PROTECTED]> wrote:
>
> It would be convenient, yes, but the question isn't always 'threads or
> processes?' In my experience (not to say that it is more or better than
> anyone else's), when going multi-process, the expense on some platforms
> is significant enough to want to persist the process (this is counter to
> my previous forking statement, but its all relative). And sometimes one
> *wants* multiple threads running in a single process handling multiple
> requests.

Yes, indeed. This is all confused by the way that POSIX (and Microsoft) threads have become essentially just processes with shared resources. If one had a system with real, lightweight threads, the same might well not be so.

Regards,
Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
Sorry. I was on holiday, and then buried this when sorting out my thousands of Emails on my return, partly because I had to look up the information!

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> >> | afaik the kernel only sends signals to threads that don't have them
> >> | blocked.
> >> | If python doesn't want anyone but the main thread to get signals, it
> >> | should just block signals on all but the main thread and then by
> >> | nature, all signals will go to the main thread
> >
> > Well, THAT'S wrong, I am afraid! Things ain't that simple :-(
> > Yes, POSIX implies that things work that way, but there are so many
> > get-out clauses and problems with trying to implement that specification
> > that such behaviour can't be relied on.
>
> Can you please give one example for each (one get-out clause, and
> one problem with trying to implement that).

http://www.opengroup.org/onlinepubs/009695399/toc.htm
2.4.1 Signal Generation and Delivery

It is extremely unclear what that means, but it talks about the generation and delivery of signals to both threads and processes. I can tell you (from speaking to system developers) that they understand that to mean that they are allowed to send signals to specific threads when that is appropriate. But they are as confused by POSIX's verbiage as I am!

> I fail to see why it isn't desirable to make all signals occur
> in the main thread, on systems where this is possible.

Oh, THAT's easy. Consider a threaded application running on a multi-CPU machine and consider hardware generated signals (e.g. SIGFPE, SIGSEGV etc.) Sending them to the master thread involves either moving them between CPUs or moving the master thread; both are inefficient and neither may be possible.

[ I have brought systems down with signals that did have to be handled on a particular CPU, by flooding that with signals from dozens of others (yes, big SMPs) and blocking out high-priority interrupts. The efficiency point can be serious. ]

That also applies to many of the signals that do not reach programs, such as TLB misses, ECC failure etc. But, in those cases, what does Python or even POSIX need to know about them?

Regards,
Nick Maclaren
Re: [Python-Dev] Signals, threads, blocking C functions
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:

Michael Hudson schrieb:
>
> >> According to [1], all python needs to do to avoid this problem is
> >> block all signals in all but the main thread;
> >
> > Argh, no: then people who call system() from non-main threads end up
> > running subprocesses with all signals masked, which breaks other
> > things in very mysterious ways. Been there...
>
> Python should register a pthread_atfork handler then, which clears
> the signal mask. Would that not work?

No. It's not the only such problem.

Personally, I think that anyone who calls system(), fork(), spawn() or whatever from threads is cuckoo. It is precisely the sort of thing that is asking for trouble, because there are so many ways of doing it 'right' that you can't be sure exactly what mental model the system developers will have.

Regards,
Nick Maclaren
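[Editorial aside: the masking mechanism being debated is exposed in modern Python as signal.pthread_sigmask (3.3+, POSIX only; the thread predates it). A minimal sketch of per-thread blocking - and the root of the system() complaint, since a child process inherits the signal mask of the thread that forked it:]

```python
import signal
import threading

seen = {}

def worker():
    # Block SIGUSR1 in this thread only; affects just the calling thread.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    # Blocking the empty set is a portable way to read back the current mask.
    seen['mask'] = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    # Anything fork()ed/exec()ed from HERE - e.g. via system() - starts
    # with SIGUSR1 blocked, which is the "mysterious breakage" above.

t = threading.Thread(target=worker)
t.start()
t.join()
```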
Re: [Python-Dev] Floor division
"Guido van Rossum" <[EMAIL PROTECTED]> wrote: > > That really sucks, especially since the whole point of making int > division return a float was to make the integers embedded in the > floats... I think the best solution would be to remove the definition > of % (and then also for divmod()) for floats altogether, deferring to > math.fmod() instead. Please, NO!!! At least not without changing the specification of the math module. The problem with it is that it is specified to be a mapping of the underlying C library, complete with its error handling. fmod isn't bad, as goes, BUT: God alone knows what happens with fmod(x,0.0), let alone fmod(x,-0.0). C99 says that it is implementation-defined whether a domain error occurs or the function returns zero, but domain errors are defined behaviour only in C90 (and not in C99!) It is properly defined only if Annex F is in effect (with all the consequences that implies). Note that I am not saying that syntactic support is needed, because Fortran gets along perfectly well with this as a function. All I am saying is that we want a defined function with decent error handling! Returning a NaN is fine on systems with proper NaN support, which is why C99 Annex F fmod is OK. > For ints and floats, real could just return self, and imag could > return a 0 of the same type as self. I guess the conjugate() function > could also just return self (although I see that conjugate() for a > complex with a zero imaginary part returns something whose imaginary > part is -0; is that intentional? I'd rather not have to do that when > the input is an int or float, what do you think?) I don't see the problem in doing that - WHEN implicit conversion to a smaller domain, losing information, raises an exception. The errors caused by needing a 'cast' (including Fortran INT, DBLE and (ugh) COMPLEX, here) causing not just conversion but information loss have caused major trouble for as long as I have been around. 
Re: [Python-Dev] Floor division
"Tim Peters" <[EMAIL PROTECTED]> wrote:
>
> > I guess the conjugate() function could also just return self (although I see
> > that conjugate() for a complex with a zero imaginary part returns
> > something whose imaginary part is -0; is that intentional?
>
> That's wrong, if true: it should return something with the opposite
> sign on the imaginary part, whether or not that equals 0 (+0. and -0.
> both "equal 0").

Grrk. Why? Seriously. IEEE 754 signed zeroes are deceptive enough for float, but are a gibbering nightmare for complex; Kahan may be able to handle them, but mere mortals can't. Inter alia, the only sane forms of infinity for complex numbers are a SINGLE one (the compactified model) and to map infinity into NaN (which I prefer, as it leads to less nonsense).

And, returning to 'floor' - if one is truncating towards -infinity, should floor(-0.0) deliver -1.0, 0.0 or -0.0?

> math.fmod is 15 years old -- whether or not someone likes it has
> nothing to do with whether Python should stop trying to use the
> current integer-derived meaning of % for floats.

Eh? No, it isn't. Because of the indirection to the C library, it is changing specification as we speak! THAT is all I am getting at; not that the answer might not be a math.fmod with defined behaviour.

> On occasion we've added additional error checking around functions
> inherited from C. But adding code to return a NaN has never been
> done. If you want special error checking added to the math.fmod
> wrapper, it would be easiest to "sell" by far to request that it raise
> ZeroDivisionError (as integer mod does) for a modulus of 0, or
> ValueError (Python's historic mapping of libm EDOM, and what Python's
> fmod(1, 0) already does on some platforms). The `decimal` module
> raises InvalidOperation in this case, but that exception is specific
> to the `decimal` module for now.

I never said that it should; I said that it is reasonable behaviour on systems that support them.
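For the record, CPython's complex conjugate does flip the sign of a zero imaginary part, exactly as Tim describes (a small demonstration; copysign is needed because -0.0 and +0.0 compare equal):

```python
import math

z = (1 + 0j).conjugate()
print(z.imag)                        # -0.0
print(z.imag == 0.0)                 # True: -0. and +0. compare equal
print(math.copysign(1.0, z.imag))    # -1.0: but the sign bit is set
```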
I personally much prefer an exception in this case. What I was trying to point out is that the current behaviour is UNDEFINED (and may give total nonsense). That is not good.

> >> For ints and floats, real could just return self, and imag could
> >> return a 0 of the same type as self. I guess the conjugate() function
> >> could also just return self (although I see that conjugate() for a
> >> complex with a zero imaginary part returns something whose imaginary
> >> part is -0; is that intentional? I'd rather not have to do that when
> >> the input is an int or float, what do you think?)
>
> > I don't see the problem in doing that - WHEN implicit conversion
> > to a smaller domain, losing information, raises an exception.
>
> Take it as a pragmatic fact that it wouldn't. Besides, e.g., the
> conjugate of 10**5 is exactly 10**5 mathematically. Why raise
> an exception just because it can't be represented as a float? The
> exact result is easily supplied with a few lines of "obviously
> correct" implementation code (incref `self` and return it).

Eh? I don't understand. Are you referring to float("1.0e5"), pow(10,5), pow(10.0,5), or a conjugate (and, if so, of what?) float(conjg(1.23)) obviously need not raise an exception, except possibly "Sanity failure" :-)
Re: [Python-Dev] Floor division
A generic comment. Many of my postings seem to be being misunderstood. I hold no brief for ANY particular floating-point religion, sect or heresy, except insofar as it affects robustness and portability (i.e. "software engineering"). I can work with and teach almost any model, and have done so with some pretty weird ones. My objection to some proposals is that they sacrifice those in favour of some ill-defined objectives.

"Tim Peters" <[EMAIL PROTECTED]> wrote:
> [Tim Peters]
> >> That's wrong, if true: it should return something with the opposite
> >> sign on the imaginary part, whether or not that equals 0 (+0. and -0.
> >> both "equal 0").
>
> [Nick Maclaren]
> > Grrk. Why? Seriously.
>
> Seriously: because there's some reason to do so and no good reason
> not to.

Hmm. That doesn't fully support the practice, except for IEEE 754(R) numbers. To require a floating-point format to have signed zeroes is a religious matter. But I agree that specifying something different if the numbers are an IEEE 754(R) format makes no sense.

> > And, returning to 'floor' - if one is truncating towards -infinity,
> > should floor(-0.0) deliver -1.0, 0.0 or -0.0?
>
> I'd leave a zero argument alone (for ceiling too), and am quite sure
> that's "the right" 754-ish behavior.

It's not clear, and there was a debate about it! But it is what IEEE 754R ended up specifying.

> Couldn't quite parse that, but nearly all of Python's math-module
> functions inherit most behavior from the platform libm. This is often
> considered to be a feature: the functions called from Python
> generally act much like they do when called from C or Fortran on the
> same platform, easing cross-language development on a single platform.

And making it impossible to write robust, portable code :-( Note that most platforms have several libms, and the behaviour even for a single libm can be wildly variable.
It can also ENHANCE cross-language problems, where a user needs to use a library that expects a different libm or libm option.

> Do note the flip side: to the extent that different platform
> religions refuse to standardize libm endcase behavior, Python plays
> along with whatever libm gods the platform it's running on worships.
> That's of value to some too.

Actually, no, it doesn't. Because Python doesn't support any libm behaviour other than the one that it was compiled with, and that is often NOT what is wanted.

> So which one would you prefer? As explained, there are 3 plausible
> candidates.
>
> You seem to be having some trouble taking "yes" for an answer here ;-)

Actually, there are a lot more candidates, but let that pass. All I am saying is that there should be SOME defined AND SANE behaviour. While I would prefer an exception, I am not dogmatic about it. What I can't stand is completely undefined behaviour, as was introduced into Python by C99.

> > What I was trying to point out is that the current behaviour is
> > UNDEFINED (and may give total nonsense). That is not
> > good.
>
> Eh -- I can't get excited about it. AFAIK, in 15 years nobody has
> complained about passing a 0 modulus to math.fmod (possibly because
> most newbies use the Windows distro, and it does raise ValueError
> there).

Some people write Python that is intended to be robust and portable; it is those people who suffer.

> What Guido would rather do, which I agreed with, was to have
> x.conjugate() simply return x when x is float/int/long. No change in
> value, no change in type, and the obvious implementation would even
> make ...

Fine. I am happy with that. What I was pointing out is that forcible changes of type aren't harmful if you remove the "gotchas" of loss of information with coercions that aren't intended to do so.
[Python-Dev] Floor division
"Jim Jewett" <[EMAIL PROTECTED]> wrote:
>
> > ... I can work with and teach almost any model,
> > and have done so with some pretty weird ones.
>
> I think python's model is "Whatever your other tools use. Ask them."
> And I think that is a reasonable choice.

Answer: It's undefined. Just because you have tested your code today doesn't mean it will work tomorrow, or on a different set of values (however similar), or that it will give the same answer every time you do the same operation on the same input, or that the effects will be limited to wrong answers and stray exceptions. Still think that it is reasonable?

> > Some people write Python that is intended to be robust and portable;
> > it is those people who suffer.
>
> If your users stick to sensible inputs, then it doesn't matter which
> model you used.

Sigh. Let's step back a step. Who decides when inputs are sensible? And where is it documented? Answers: God alone knows, and nowhere.

One of Python's general principles is that its operations should either do roughly what a reasonable user would expect, or raise an exception. It doesn't always get there, but it isn't bad. What you are saying is that that is undesirable.

The old Fortran and C model of saying that any user error can cause any effect (including nasal demons) is tolerable only if there is agreement on what IS an error, and there is some way for a user to find that out. In the case of C, neither is true.

> If not, there is no way to get robust and portable; it is just a
> matter of which users you annoy.

Well, actually, there is. Though I agree that the techniques have rather been forgotten in the past 30 years. Python implements more of them than most languages.
[Python-Dev] complex numbers (was Floor Division)
"Jim Jewett" <[EMAIL PROTECTED]> wrote:
> Tim Peters wrote:
>
> > complex_new() ends with:
> >
> >     cr.real -= ci.imag;
> >     cr.imag += ci.real;
> >
> > and I have no idea what that thinks it's doing. Surely this isn't
> > intended?!
> :
> I think it is. python.org/sf/1642844 adds comments to make it less unclear.

Agreed.

> If "real" and "imag" are themselves complex numbers, then normalizing
> the result will move the imaginary portion of the "real" vector into
> the imaginary part and vice versa.

Not really. What it does is to make complex(a,b) exactly equivalent to a+1j*b. For example:

>>> a = 1+2j
>>> b = 3+4j
>>> complex(a)
(1+2j)
>>> b*1j
(-4+3j)
>>> complex(a,b)
(-3+5j)

> Note that changing this (to discard the imaginary parts) would break
> passing complex numbers to their own constructor.

Eh? Now I am baffled. There are several ways of changing it, all of which would turn one bizarre behaviour into another - or would raise an exception. Personally, I would do the following:

    complex(a) would permit a to be complex.
    complex(a,b) would raise an exception if either a or b were complex.

But chacun a son gout (accents omitted).
Re: [Python-Dev] Problem with signals in a single threaded application
On Tue, Jan 23, 2007, Ulisses Furquim wrote:
>
> I've read some threads about signals in the archives and I was under
> the impression signals should work reliably on single-threaded
> applications. Am I right? I've thought about a way to fix this, but I
> don't know what is the current plan for signals support in python, so
> what can be done?

This one looks like an oversight in Python code, and so is a bug, but it is important to note that signals do NOT work reliably under any Unix or Microsoft system. Inter alia, all of the following are likely to lead to lost signals:

    Two related signals received between two 'checkpoints' (i.e. when
    the signal is tested and cleared). You may only get one of them,
    and 'related' does not mean 'the same'.

    A second signal received while the first is being 'handled' by the
    operating system or language run-time system.

    A signal sent while the operating system is doing certain things to
    the application (including, sometimes, when it is swapped out or
    deep in I/O).

And there is more, some of which can cause program misbehaviour or crashes. You are also right that threading makes the situation a lot worse.

Obviously, Unix and Microsoft systems depend on signals, so you can't simply regard them as hopelessly broken, but you can't assume that they are RELIABLE. All code should be designed to cope with the case of signals getting lost, if at all possible. Defending yourself against the other failures is an almost hopeless task, but luckily they are extremely rare except on specialist systems.
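To make the advice concrete: the usual defensive idiom in Python is to have the handler do nothing but bump a counter or set a flag, and to treat that count as a lower bound rather than an exact tally, since coalesced signals are indistinguishable from one. A minimal, Unix-only sketch:

```python
import os
import signal

received = 0

def handler(signum, frame):
    # Do the bare minimum inside the handler; real work belongs in the
    # main loop.  Note that two signals arriving close together may be
    # coalesced, so 'received' is only a lower bound on deliveries.
    global received
    received += 1

signal.signal(signal.SIGUSR1, handler)

# Deliver a signal to ourselves (Unix only).
os.kill(os.getpid(), signal.SIGUSR1)

print("handled at least", received, "signal(s)")
```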
[Python-Dev] Complex constructors [was Re: Floor division]
Gareth McCaughan <[EMAIL PROTECTED]> wrote:
>
> ... The question is whether
> it makes sense to define complex(a,b) = a+ib for all a,b
> or whether the two-argument form is always in practice going
> to be used with real numbers[1]. If it is, which seems pretty
> plausible to me, then changing complex() to complain when
> passed two complex numbers would (1) notify users sooner
> when they have errors in their programs, (2) simplify the
> code, and (3) avoid the arguably broken behaviour Tim was
> remarking on, where complex(-0.0).real is +0 instead of -0.
>
> [1] For the avoidance of ambiguity: "real" is not
> synonymous with "double-precision floating-point".

Precisely. On this matter, does anyone know of an application where making that change would harm anything? I cannot think of a circumstance under which the current behaviour adds any useful function over the one that raises an exception if there are two arguments and either is complex. Yes, of course, SOME people will find it cool to write complex(a,b) when they really mean a+1j*b, but ...
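The stricter constructor being argued for is easy to sketch at the Python level (checked_complex is a hypothetical name for illustration, not a proposal for the actual C implementation):

```python
def checked_complex(a, b=None):
    """Hypothetical stricter complex(): the one-argument form accepts
    anything complex() accepts, but the two-argument form rejects
    complex arguments instead of silently computing a + 1j*b."""
    if b is None:
        return complex(a)
    if isinstance(a, complex) or isinstance(b, complex):
        raise TypeError("checked_complex(a, b) requires real a and b")
    return complex(a, b)

print(checked_complex(1 + 2j))   # one-argument form is unchanged: (1+2j)
print(checked_complex(3, 4))     # (3+4j)
# checked_complex(1+2j, 3) would raise TypeError instead of returning (1+5j)
```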
Re: [Python-Dev] Floor division
Armin Rigo <[EMAIL PROTECTED]> wrote:
>
> Thanks for the clarification. Yes, it makes sense that __mod__,
> __divmod__ and __floordiv__ on float and decimal would eventually follow
> the same path as for complex (where they make even less sense and
> already raise a DeprecationWarning).

Yes. Though them not doing so would also make sense. The difference is that they make no mathematical sense for complex, but the problems with float are caused by floating-point (and do not occur for the mathematical reals).

There is an argument for saying that divmod should return a long quotient and a float remainder, which is what C99 has specified for remquo (except that it requires only the last 3 bits of the quotient, for reasons that completely baffle me). Linux misimplemented that the last time I looked. Personally, I think that it is bonkers, as it is fiendishly expensive compared to its usefulness - especially with Decimal! But it isn't obviously WRONG.
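Today's behaviour, for reference: divmod on floats returns a float quotient (the floor) and a float remainder whose sign follows the divisor, and it is that pair which would go the way of the complex version. A quick sketch of the three variants under discussion:

```python
import math

# Python's divmod on floats: float quotient, remainder sign follows divisor.
print(divmod(7.5, 2.0))      # (3.0, 1.5)
print(divmod(-7.5, 2.0))     # (-4.0, 0.5)

# The C-style remainder, by contrast, keeps the sign of the dividend.
print(math.fmod(-7.5, 2.0))  # -1.5
```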
Re: [Python-Dev] Floor division
"Guido van Rossum" <[EMAIL PROTECTED]> wrote:
>
> "(int)float_or_double" truncates in C (even in K&R C) /provided that/
> the true result is representable as an int. Else behavior is
> undefined (may return -1, may cause a HW fault, ...).

Actually, I have used Cs that didn't, but haven't seen any in over 10 years. C90 is unclear about its intent, but C99 is specific that truncation is towards zero. This is safe, at least for now.

> So Python uses C's modf() for float->int now, which is always defined
> for finite floats, and also truncates.

Yes. And that is clearly documented and not currently likely to change, as far as I know.
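Python's float-to-int conversion follows the C99 rule being discussed, truncating towards zero for both signs (a quick check, contrasted with truncation towards -infinity):

```python
import math

print(int(2.7), int(-2.7))   # 2 -2  : truncation towards zero
print(math.trunc(-2.7))      # -2    : the same rule, explicitly named
print(math.floor(-2.7))      # -3    : truncation towards -infinity, by contrast
```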
Re: [Python-Dev] Floor division
"Tim Peters" <[EMAIL PROTECTED]> wrote:
>
> It could, but who would have a (sane) use for a possibly 2000-bit quotient?

Well, the 'exact rounding' camp in IEEE 754 seem to think that there is one :-) As you can gather, I can't think of one. Floating-point is an inherently inaccurate representation for anything other than small integers.

> This is a bit peculiar to me, because there are ways to compute
> "remainder" using a number of operations proportional to the log of
> the exponent difference. It could be that people who spend their life
> doing floating point forget how to work with integers ;-)

Aargh! That is indeed the key! Given that I claim to know something about integer arithmetic, too, how can I have been so STUPID? Yes, you are right, and that is the only plausible way to calculate the remainder precisely. You don't get the quotient precisely, which is what my (insane) specification would have provided.

I would nitpick with your example, because you don't want to reduce modulo 3.14 but modulo pi, and therefore the modular arithmetic is rather more expensive (given Decimal). However, it STILL doesn't help to make remquo useful! The reason is that pi is input only to the floating-point precision, and so the result of remquo for very large arguments will depend more on the inaccuracy of pi as input than on the mathematical result. That makes remquo totally useless for the example you quote. Yes, I have implemented 'precise' range reduction, and there is no substitute for using an arbitrary-precision pi value :-(

> > But it isn't obviously WRONG.
>
> For floats, fmod(x, y) is exactly congruent to x modulo y -- I don't
> think it's possible to get more right than exactly right ;-)

But, as a previous example of yours pointed out, it's NOT exactly right. It is also supposed to be in the range [0,y) and it isn't. -1%1e100 is mathematically wrong on two counts.
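The example is easy to reproduce (modern Python still behaves this way): fmod's result is exact, while % must round when it adjusts the sign to match the divisor:

```python
import math

# fmod keeps the sign of x and is exact: -1 is congruent to -1 mod 1e100.
print(math.fmod(-1.0, 1e100))   # -1.0

# Python's % keeps the sign of y, so the exact answer would be 1e100 - 1,
# which is not representable as a float; it rounds to 1e100 itself,
# which is congruent to 0 modulo 1e100, not to -1.
print(-1.0 % 1e100)             # 1e+100
```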
Re: [Python-Dev] Floor division
"Tim Peters" <[EMAIL PROTECTED]> wrote:
>
> [Tim (misattributed to Guido)]

Apologies to both!

> > C90 is unclear about its intent,
>
> But am skeptical of that. I don't have a copy of C90 here, but before
> I wrote that I checked Kernighan & Ritchie's seminal C book, Harbison
> & Steele's generally excellent "C: A Reference Manual" (2nd ed), and a
> web version of Plauger & Brodie's "Standard C":
>
> http://www-ccs.ucsd.edu/c/
>
> They all agree that the Cs they describe (all of which predate C99)
> convert floating to integral types via truncation, when possible.

I do have one. Kernighan & Ritchie's seminal C book describes the Unix style of "K&R" C - one of the reasons that ANSI/ISO had to make incompatible changes was that many important PC and embedded Cs differed. Harbison and Steele is generally reliable, but not always; I haven't looked at the last, but I would regard it suspiciously. What C90 says is:

    When a value of floating type is converted to integer type, the
    fractional part is discarded.

There is other wording, but none relevant to this issue. Now, given the history of floating-point remainder, that is seriously ambiguous.

> > but C99 is specific that truncation is towards zero.
>
> As opposed to what? Truncation away from zero? I read "truncation"
> as implying toward 0, although the Plauger & Brodie source is explicit
> about "the integer part of X, truncated toward zero" for the sake of
> logic choppers ;-)

Towards -infinity, of course. That was as common as truncation towards zero up until the 1980s. It was near-universal on twos-complement floating-point systems, and not rare on signed-magnitude ones. During the standardisation of C90, the BSI tried to explain to ANSI that this needed spelling out, but were ignored. C99 did not add the normative text "(i.e., the value is truncated toward zero)" because there was no ambiguity, after all!
Re: [Python-Dev] Floor division
"Tim Peters" <[EMAIL PROTECTED]> wrote:
>
> OTOH, I am a fan of analyzing FP operations as if the inputs were in
> fact exactly what they claim to be, which 754 went a long way toward
> popularizing. That largely replaced mountains of idiosyncratic
> "probabilistic arguments" (and where it seemed no two debaters ever
> agreed on the "proper" approach) with a common approach that
> sometimes allows surprisingly sharp analysis. Since I spent a good
> part of my early career as a professional apologist for Seymour Cray's
> "creative" floating point, I'm probably much more grateful to leave
> sloppy arithmetic behind than most.

Well, I spent some of it working with code (and writing code) that was expected to work, unchanged, on an ICL 1900, CDC 6600/7600, IBM 370 and others. I have seen the harm caused by the 'exact arithmetic' mindset and so don't like it, but I agree with your objections to the "probabilistic arguments" (which were and are mostly twaddle). But that is seriously off-topic.

> [remquo] It's really off-topic for Python-Dev, so
> I didn't/don't want to belabor it.

Agreed, except in one respect. I stand by my opinion that the C99 specification has no known PRACTICAL use (your example is correct, but I know of no such use in a real application), and so PLEASE don't copy it as a model for Python divmod/remainder.

> No, /Python's/ definition of mod is inexact for that example. fmod
> (which is not Python's definition) is always exact: fmod(-1, 1e100) =
> -1, and -1 is trivially exactly congruent to -1 modulo anything
> (including modulo 1e100). The result of fmod(x, y) has the same sign
> as x; Python's x.__mod__(y) has the same sign as y; and that makes all
> the difference in the world as to whether the exact result is always
> exactly representable as a float.

Oops. You're right, of course.
[Python-Dev] Python's C interface for types
I have a fair amount of my binary floating-point model written, though even of what I have done only some is debugged (and none has been rigorously tested). But I have hit some things that I can't work out, and one query reduced comp.lang.python to a stunned silence :-)

Note that I am not intending to do all the following, at least for now, but I have had to restructure half a dozen times to match my implementation requirements to the C interface (as I have learnt more about Python!), and designing to avoid that is always good. Any pointers appreciated.

I can't find any detailed description of the methods that I need to provide. Specifically:

Does Python use classic division (nb_divide) and inversion (nb_invert) or are they entirely historical? Note that I can very easily provide the latter.

Is there any documentation on the coercion function (nb_coerce)? It seems to have unusual properties.

How critical is the 'numeric' property of the nb_hash function? I can certainly honour it, but is it worth it?

I assume that Python will call nb_richcompare if defined and nb_compare if not. Is that right?

Are the inplace methods used and, if so, what is their specification?

I assume that I can ignore all of the allocation, deallocation and attribute handling functions, as the default for a VAR object is fine. That seems to work. Except for one thing! My base type is static, but I create some space for every derivation (and it can ONLY be used in derived form). The space creation is done in C but the derivation in Python. I assume that I need a class (not instance) destructor, but what should it do to free the space? Call C to Py_DECREF it?

I assume that a class structure will never go away until after all instances have gone away (unless I use Py_DECREF), so a C pointer from an instance to something owned by the class is OK.

Is there any documentation on how to support marshalling/pickling and the converse from C types?

I would quite like to provide some attributes.
They are 'simple' but need code executing to return them. I assume that means that they aren't simple enough, and have to be provided as methods (like conjugate). That's what I have done, anyway.

Is there any obvious place for a reduction method to be hooked in? That is, a method that takes a sequence, all members of which must be convertible to a single class, and returns a member of that class. Note that it specifically does NOT make sense on a single value of that class.

Sorry about the length of this!
Re: [Python-Dev] Python's C interface for types
Thanks very much! That answers most things. Yes, I had got many of my answers from searching the source, but there is clearly some history there, and it isn't always clear what is current. Here are a few responses to the areas of confusion:

> nb_invert is used for bitwise inversion (~) and PyNumber_Invert(). It's not
> historical, it's actual.

Ah! So it's NOT 1/x! Not relevant to floating-point, then.

> I don't recall ever seeing useful documentation on coerce() and nb_coerce.
> I suggest not to use it; it's gone in Python 3.0 anyway.

Excellent! Task completed :-)

> Which numeric property? the fact that it returns a C long? Or that, for
> natural numbers, it *seems* to return self?

The latter. hash(123) == hash(123.0), for example. It is a real pain for advanced formats. Making it the same for things that compare equal isn't a problem.

> [inplace] I assume your floating-point type is
> immutable, so you won't have to implement them.

I haven't done anything special to flag it as such, but it is.

> Where do you allocate this space, and how do you allocate it? If it's space
> you malloc() and store somewhere in the type struct, yecchh. You should not
> just allocate stuff at the end of the type struct, as the type struct's
> layout is not under your control (we actually extend the type struct as
> needed, which is why newer features end up in less logical places at the end
> of the struct ;) I would suggest using attributes of the type instead, with
> the normal Python refcounting. That means the 'extra space' has to be an
> actual Python object, though.

PyMem_Malloc. I can certainly make it an attribute, as the overhead isn't large for a per-class object. It is just a block of mutable memory, opaque to the Python layer, and NOT containing any pointers!

> I don't think you can make your own type marshallable. For pickle it's more or
> less the same as for Python types.
> The pickle docs (and maybe
> http://www.python.org/dev/peps/pep-0307/) probably cover what you want to
> know. You can also look at one of the complexer builtin types that support
> pickling, like the datetime types.

The only documentation I have found is how to do it in Python. Is that what you mean? I will look at the datetime types.

> You can use PyGetSetDef to get 'easy' attributes with getters and setters.
> http://docs.python.org/api/type-structs.html#l2h-1020

I was put off by some of the warnings. I will revisit it.

> There's nothing I can think of that is a natural match for that in standard
> Python methods. I would suggest just making it a classmethod.
> (dict.fromkeys is a good example of a classmethod in C.)

Thanks. That is a useful reference. Reductions are a problem in many languages.
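The dict.fromkeys pattern, translated to the reduction described earlier, is easy to sketch at the Python level (all names here are hypothetical illustrations; the real thing would be a classmethod slot in the C type):

```python
class Interval:
    """Toy stand-in for the floating-point type under discussion."""

    def __init__(self, value):
        self.value = float(value)

    @classmethod
    def sum(cls, values):
        # A reduction: takes a sequence, converts every element to the
        # class, and returns a single member of the class.  It makes no
        # sense as a method on one instance, hence the classmethod.
        total = 0.0
        for v in values:
            total += cls(v).value
        return cls(total)

print(Interval.sum([1, 2.5, "3.5"]).value)   # 7.0
```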
Re: [Python-Dev] Python's C interface for types
Oops. Something else fairly major I forgot to ask: Python long. I can't find any clean way of converting to or from this, and would much rather not build a knowledge of long's internals into my code. Going via text is, of course, possible - but is not very efficient, even using hex/octal.
Re: [Python-Dev] Python's C interface for types
Giovanni Bajo <[EMAIL PROTECTED]> wrote:
>
> I personally consider *very* important that hash(5.0) == hash(5) (and
> that 5.0 == 5, of course).

It gets a bit problematic with floating-point, when you can have different values "exactly 5.0" and "approximately 5.0". IEEE 754 has signed zeroes. And so it goes.
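Both halves of the tension are visible from Python itself: values that compare equal must hash equal, and the two IEEE zeroes do compare equal, so they share a hash even though they are distinguishable:

```python
import math

print(hash(5) == hash(5.0), 5 == 5.0)        # True True
print(0.0 == -0.0, hash(0.0) == hash(-0.0))  # True True: equal, so equal hashes
print(math.copysign(1.0, -0.0))              # -1.0: yet the sign bit survives
```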
Re: [Python-Dev] Python's C interface for types
Josiah Carlson <[EMAIL PROTECTED]> wrote:
>
> See _PyLong_FromByteArray and _PyLong_AsByteArray .

Oops! Thanks very much.

Regards,
Nick Maclaren
Re: [Python-Dev] Python's C interface for types
Having looked into the answers a bit more deeply, I am afraid that I
am still a bit puzzled.

1) As I understand it, PyMem_Malloc won't cause trouble, but won't be
automatically freed, either, as it doesn't return a new reference. I
don't think that immediately following it by PyCObject_FromVoidPtr
(which is what I do) helps with that. What I need is some standard
type that allows me to allocate an anonymous block of memory; yes, I
can define such a type, but that seems excessive. Is there one?

2) _PyLong_FromByteArray and _PyLong_AsByteArray aren't in the API and
have no comments. Does that mean that they are unstable, in the sense
that they may change behaviour in new versions of Python? And will
they be there in 3.0?

Thanks for any help, again.

Regards,
Nick Maclaren
Re: [Python-Dev] Problem with signals in a single threaded application
I apologise for going off-topic, but this is an explanation of why I
said that signal handling is not reliable. The only relevance to
Python is that Python should avoid relying on signals if possible, and
try to be a little defensive if not. Signals will USUALLY do what is
expected, but not always :-( Anything further by Email, please.

Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> > This one looks like an oversight in Python code, and so is a bug,
> > but it is important to note that signals do NOT work reliably under
> > any Unix or Microsoft system.
>
> That's a rather pessimistic way of putting it. In my experience,
> signals in Unix mostly do what they're meant to do quite reliably --
> it's just a matter of understanding what they're meant to do.

Yes, it is pessimistic, but I am afraid that my experience is that it
is so :-( That doesn't deny your point that they MOSTLY do 'work', but
car drivers MOSTLY don't need to wear seat belts, either. I am talking
about high-RAS objectives, and ones where very rare failure modes can
become common (e.g. HPC and other specialist uses).

More commonly, there are plain bugs in the implementations which are
sanctioned by the standards (Linux is relatively disdainful of such
legalistic games). Because they say that everything is undefined
behaviour, many vendors' support mechanisms will refuse to accept bug
reports unless you push like hell. And, as some are DIABOLICALLY
difficult to explain, let alone demonstrate, they can remain lurking
for years or decades.

> There may be bugs in certain systems that cause signals to get lost
> under obscure circumstances, but that's no reason for Python to make
> the situation worse by introducing bugs of its own.

100% agreed.

> > Two related signals received between two 'checkpoints' (i.e. when
> > the signal is tested and cleared). You may only get one of them,
> > and 'related' does not mean 'the same'.
>
> I wasn't aware that this could happen between different signals.
> If it can, there must be some rationale as to why the second signal
> is considered redundant. Otherwise there's a bug in either the design
> or the implementation.

Nope. There is often a clash between POSIX and the hardware, or a case
where a 'superior' signal overrides an 'inferior' one. I have seen
SIGKILL flush some other signals, for example. And, on some systems,
SIGFPE may be divided into the basic hardware exceptions. If you catch
SIGFPE as such, all of those may be cleared. I don't think that many
(any?) current systems do that. And it is actually specified to occur
for the SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU, SIGCONT group.

> > A second signal received while the first is being 'handled' by the
> > operating system or language run-time system.
>
> That one sounds odd to me. I would expect a signal received during
> the execution of a handler to be flagged and cause the handler to be
> called again after it returns. But then I'm used to the BSD signal
> model, which is relatively sane.

It's nothing to do with the BSD model, which may be saner but still
isn't 100% reliable, but occurs at a lower layer. At the VERY lowest
level, when a genuine hardware event causes an interrupt, the FLIH
(first-level interrupt handler) runs in God mode (EVERYTHING disabled)
until it classifies what is going on. This is a ubiquitous misdesign
of modern hardware, but that is off-topic. Hardware 'signals' from
other CPUs/devices may well get lost if they occur in that window.
And there are other, but less extreme, causes at higher levels in the
operating system.

Unix and Microsoft do NOT have a reliable signal delivery model, where
the sender of a signal checks if the recipient has got it and retries
if not. Some operating systems do - but I don't think that BSD does.

> > A signal sent while the operating system is doing certain things to
> > the application (including, sometimes, when it is swapped out or
> > deep in I/O.)
>
> That sounds like an outright bug. I can't think of any earthly reason
> why the handler shouldn't be called eventually, if it remains
> installed and the process lives long enough.

See above. It gets lost at a low level. That is why you can cause
serious time drift on an "IBM PC" (most modern ones) by hammering the
video card or generating streams of floating-point fixups. Most
people don't notice, because xntp or equivalent fixes it up.

And there are worse problems. I could start on cross-CPU TLB and ECC
handling on large shared memory systems. I managed to get an Origin
in a state where it wouldn't even power down from the power-off
button, and I had to flip bre
Re: [Python-Dev] Python's C interface for types
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> [not sure what "And so it goes" means in English]

I apologise. I try to restrain myself from using excessive idiom, but
sometimes I forget. It means "That is how things are, and there is and
will be more of the same."

> It may be a bit problematic to implement, but I think a clean
> specification is possible. If a and b are numbers, and a==b,
> then hash(a)==hash(b). I'm not sure whether "approximately 5.0"
> equals 5 or not: if it does, it should hash the same as 5,
> if it doesn't, it may or may not hash the same (whatever is
> easier to implement).
> For 0: hash(+0.0)==hash(-0.0)==hash(0)==hash(0L)==0

Unfortunately, that assumes that equality is transitive. With the
advanced floating-point models, it may not be. For example, if you
want to avoid the loss of error information, exact infinity and
approximate infinity (the result of overflow) have different
semantics. Similarly with infinitesimals.

Even at present, Python's float (Decimal probably more so) doesn't
allow you to do some things that are quite reasonable. For example,
let us say that I am implementing a special function and want to
distinguish -0.0 and +0.0. Why can't I use a dictionary?

>>> a = float("+0.0")
>>> b = float("-0.0")
>>> print a, b
0.0 -0.0
>>> c = {a: "+0.0", b: "-0.0"}
>>> print c[a], c[b]
-0.0 -0.0

Well, we all know why. But it is not what some quite reasonable
programmers will expect. And Decimal (with its cohorts and variant
precisions) has this problem quite badly - as do I.

No, I don't have an answer. You are damned if you do, and damned if
you don't. It is an insoluble problem, and CURRENTLY doesn't justify
two hashing mechanisms (i.e. ANY difference and EQUALITY difference).

Regards,
Nick Maclaren
[Python-Dev] Python's C interface for types
"Jim Jewett" <[EMAIL PROTECTED]> wrote:
>
> >> For 0: hash(+0.0)==hash(-0.0)==hash(0)==hash(0L)==0
>
> > Unfortunately, that assumes that equality is transitive.
>
> No, but the (transitively closed set of equivalent objects) must have
> the same hash. ...

Er, how do you have a transitive closure for a non-transitive
operation?

I really do mean that quite a lot of floating-point bells and whistles
are non-transitive. The only one most people will have come across is
IEEE NaNs, where 'a is b' does not imply 'a == b', but there are a lot
of others (and have been since time immemorial). I don't THINK that
IEEE 754R decimal introduces any, though I am not prepared to bet on
it.

> > let us say that I am implementing a special function and want to
> > distinguish -0.0 and +0.0. Why can't I use a dictionary?
>
> Because they are equal. They aren't identical, but they are equal.

You have missed my point, which is that extended floating-points
effectively downgrade the status of the purely numeric comparisons,
and therefore introduce a reasonable requirement for using a tighter
match. Note that I am merely commenting that this needs bearing in
mind, and NOT that anything should be changed.

> >>>> a = float("+0.0")
> >>>> b = float("-0.0")
> >>>> print a, b
> > 0.0 -0.0
>
> With the standard windows distribution, I get just
>
> 0.0 0.0

Watch that space :-) Expect it to change.

Regards,
Nick Maclaren
Re: [Python-Dev] Python's C interface for types
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> > I really do mean that quite a lot of floating-point bells and whistles
> > are non-transitive.
>
> If so, they just shouldn't use the equal operator (==). == ought to
> be transitive. It should be consistent with hash().

Fine. A very valid viewpoint. Would you like to explain that to the
IEEE 754 people?

Strictly, it is only the reflexive property that IEEE 754 and the
Decimal module lack. Yes, A == A is False, if A is a NaN. But the
definition of 'transitive' often requires 'reflexive'.

>>> from decimal import *
>>> x = Decimal("NaN")
>>> x == x
False

I don't know any CURRENT systems where basic floating-point doesn't
have the strict transitive relation, but I wouldn't bet that there
aren't any. You don't need to extend floating-point to have trouble;
even the basic forms often had it. I sincerely hope that one is dead,
but people keep reinventing old mistakes :-(

The most common form was where comparison was equivalent to
subtraction, and there were numbers such that A-B == 0, B-C == 0 but
A-C != 0. That could occur even for integers on some systems. I don't
THINK that the Decimal specification has reintroduced this, but am not
quite sure.

> > You have missed my point, which is that extended floating-points
> > effectively downgrade the status of the purely numeric comparisons,
> > and therefore introduce a reasonable requirement for using a tighter
> > match. Note that I am merely commenting that this needs bearing in
> > mind, and NOT that anything should be changed.
>
> If introducing extended floating-points would cause trouble to existing
> operations, I think extended floating-points should not be introduced
> to Python. If all three of you really need them, come up with method
> names to express "almost equal" or "equal only after sunset".

Fine. Again, a very valid viewpoint. Would you like to explain it to
the IEEE 754, Decimal and C99 people, and the Python people who think
that tracking C is a good idea?

We already have the situation where A == B == 0, but where
'C op A' != 'C op B' != 'C op 0'. Both where op is a built-in
operator and where 'C op' is a standard library function. This one is
NOT going to go away, and is going to get more serious, especially if
extended floating-point formats like Decimal take off. Note that it is
not a fault in Decimal, but a feature of almost all extended
floating-points. As I said, I have no answer to it.

Regards,
Nick Maclaren
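[Archive note: the "A == B == 0 but 'C op A' != 'C op B'" situation is easy to reproduce today with signed zeroes; the post names no specific op, so math.atan2 and math.copysign here are my choice of concrete instance:]

```python
import math

a, b = 0.0, -0.0
assert a == b == 0                    # the two zeroes compare equal...

# ...but a standard library function distinguishes them via the sign bit:
assert math.atan2(0.0, a) == 0.0
assert math.atan2(0.0, b) == math.pi  # equal inputs, unequal results
assert math.copysign(1.0, a) != math.copysign(1.0, b)
```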
Re: [Python-Dev] Python's C interface for types
=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= <[EMAIL PROTECTED]> wrote:
>
> >> If so, they just shouldn't use the equal operator (==). == ought to
> >> be transitive. It should be consistent with hash().
> >
> > Fine. A very valid viewpoint. Would you like to explain that to
> > the IEEE 754 people?
>
> Why should I? I don't talk about IEEE 754, I talk about Python.

The problem is that Python is increasingly assuming IEEE 754 by
implication, and you were stating something as a requirement that
isn't true in IEEE 754.

> > Strictly, it is only the reflexive property that IEEE 754 and the
> > Decimal module lack. Yes, A == A is False, if A is a NaN. But
> > the definition of 'transitive' often requires 'reflexive'.
>
> I deliberately stated 'transitive', not 'reflexive'. The standard
> definition of 'transitive' is "if a==b and b==c then a==c".

When I was taught mathematics, the lecturer said that a transitive
relation is a reflexive one that has that extra property. It was then
(and may still be) a fairly common usage. I apologise for being
confusing!

> > The most common form was where comparison was equivalent to
> > subtraction, and there were numbers such that A-B == 0, B-C == 0
> > but A-C != 0. That could occur even for integers on some systems.
> > I don't THINK that the Decimal specification has reintroduced this,
> > but am not quite sure.
>
> I'm not talking about subtraction, either. I'm talking about == and
> hash.

Grrk. Look again. So am I. But let this one pass, as I don't think
that mistake will return - and I sincerely hope not!

> > Fine. Again, a very valid viewpoint. Would you like to explain it
> > to the IEEE 754, Decimal and C99 people, and the Python people who
> > think that tracking C is a good idea?
>
> I'm not explaining anything. I'm stating an opinion.

You are, however, stating an opinion that conflicts with the direction
that Python is currently taking.

> It doesn't look like you *need* to give an answer now. I thought
> you were proposing some change to Python (although I'm uncertain
> what that change could have been). If you are merely explaining
> things (to whom?), just keep going.

Thanks. I hope the above clarifies things a bit. My purpose in posting
is to point out that some changes are already happening, by inclusion
from other standards, that are introducing problems to Python. And to
many other languages, incidentally, including Fortran and C.

Regards,
Nick Maclaren
[Python-Dev] Python's C interface for types
"Jim Jewett" <[EMAIL PROTECTED]> wrote:
>
> > Fine. A very valid viewpoint. Would you like to explain that to
> > the IEEE 754 people?
>
> When Decimal was being argued, Tim pointed out that the standard
> requires certain operations, but doesn't require specific spelling
> shortcuts. If you managed to do (and document) it right, people would
> be grateful for methods like
>
> a.exactly(b)
> a.close_enough(b)
> a.same_expected_value(b)
>
> but that doesn't mean any of them should be used when testing a==b

Hmm. That is misleading, as you state it. IEEE 754R doesn't include
specific spellings, but IEEE 754 assuredly does. For example, it
states that the equality operator that delivers False for NaN = NaN
is spelled .EQ. in Fortran. There was no C standard at the time, but
the 'ad hoc' spellings are clearly intended for C-like languages, and
C99 is very clear that the above equality operator is spelled '=='.

However, there is no requirement that Python uses those names. What
IS important is (a) that the comparisons are consistent, (b) that
IEEE 754 (and IEEE 754R) define no reflexivity-preserving equality
operator and (c) that the current float type derives its comparisons
from C.

> (In Lisp, you typically can specify which equality predicate a
> hashtable should use on pairs of keys; in python, you only specify
> which it should use on objects of your class, and if the other object
> in the comparison disagrees, you're out of luck.)

Yup.

> > Strictly, it is only the reflexive property that IEEE 754 and the
> > Decimal module lack. Yes, A == A is False, if A is a NaN.
>
> Therefore NaNs should never be used (in python) as dictionary keys.
> Therefore, they should be unhashable.

Again, a very valid point. Are you suggesting a change? :-)
Currently, on my Linux system, Decimal raises an exception when trying
to hash a NaN value but float doesn't. Is that a bug?

> Also note that PyObject_RichCompareBool (from Objects/object.c)
> assumes the reflexive property, and if you try to violate it, you
> will get occasional surprises.

Oh, yes, indeed!

> > We already have the situation where A == B == 0, but where
> > 'C op A' != 'C op B' != 'C op 0'. Both where op is a built-in
> > operator and where 'C op' is a standard library function.
>
> That's fine; it just means that numeric equality may not be the
> strongest possible equivalence. hash in particular just happens to be
> defined in terms of ==, however == is determined.

NO!!! What it means is that the equality operator may not be the
strongest numeric equivalence! A much stronger statement. As I said,
I am not grinding an axe, and have no answers.

Regards,
Nick Maclaren
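[Archive note: the "occasional surprises" from PyObject_RichCompareBool's identity shortcut are easy to demonstrate with a float NaN in current CPython:]

```python
nan = float("nan")

# == is not reflexive for NaN...
assert nan != nan

# ...but container operations go through PyObject_RichCompareBool, which
# short-circuits on identity, so the "same" NaN object is found anyway:
assert nan in [nan]
assert [nan].index(nan) == 0

# A NaN produced separately is a different object, so == is consulted
# and fails - the two answers disagree:
assert float("nan") not in [nan]
```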
Re: [Python-Dev] generic async io (was: microthreading vs. async io)
[EMAIL PROTECTED] wrote:
>
> I think this discussion would be facilitated by teasing the first
> bullet-point from the latter two: the first deals with async IO, while
> the latter two deal with cooperative multitasking.
>
> It's easy to write a single package that does both, but it's much harder
> to write *two* fairly generic packages with a clean API between them,
> given the varied platform support for async IO and the varied syntax and
> structures (continuations vs. microthreads, in my terminology) for
> multitasking. Yet I think that division is exactly what's needed.

Hmm. Now, please, people, don't take offence, but I don't know how to
phrase this tactfully :-(

The 'threading' approach to asynchronous I/O was found to be a BAD
IDEA back in the 1970s, was abandoned in favour of separating
asynchronous I/O from threading, and God alone knows why it was
reinvented - except that most of the people with prior experience had
died or retired :-(

Let's go back to the days when asynchronous I/O was the norm, and I/O
performance critical applications drove the devices directly. In
those days, yes, that approach did make sense. But it rapidly ceased
to do so with the advent of 'semi-intelligent' devices and the
virtualisation of I/O by the operating system. That was in the
mid-1970s. Nowadays, ALL devices are semi-intelligent and no system
since Unix has allowed applications direct access to devices, except
for specialised HPC and graphics.

We used to get 90% of theoretical peak performance on mainframes
using asynchronous I/O from clean, portable applications, but it was
NOT done by treating the I/O as threads and controlling their
synchronisation by hand. In fact, quite the converse! It was done by
realising that asynchronous I/O and explicit threading are best
separated ENTIRELY. There were two main models:

Streaming, as in most languages (Fortran, C, Python, but NOT in
POSIX). The key properties here are that the transfer boundaries have
no significance, only heavyweight synchronisation primitives (open,
close etc.) provide any constraints on when data are actually
transferred and (for very high performance) buffers are unavailable
from when a transfer is started to when it is checked. If copying is
acceptable, the last constraint can be dropped. In the simple case,
this allows the library/system to reblock and perform transfers
asynchronously. In the more advanced case, the application has to use
multiple buffering (at least double), but can get full performance
without any form of threading. IBM MVT applications used to get up to
90% without hassle in parallel with computation and using only a
single thread (well, there was only a single CPU, anyway).

The other model is transactions. This has the property that there is
a global commit primitive, and the order of transfers is undefined
between commits. Inter alia, it means that overlapping transfers are
undefined behaviour, whether in a single thread or in multiple
threads. BSP uses this model.

The MPI-2 design team included a lot of ex-mainframe people and
specifies both models. While it is designed for parallel
applications, the I/O per se is not controlled like threads.

Regards,
Nick Maclaren
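[Archive note: the "multiple buffering" technique mentioned above can be sketched in a few lines. This is a toy illustration in Python, with an ordinary worker thread standing in for the asynchronous channel program; the function name and shape are mine, not from the post:]

```python
import threading

def double_buffered_copy(src_path, dst_path, bufsize=1 << 16):
    """Copy src to dst, overlapping each write with the next read."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        pending = None                      # writer for the previous buffer
        while True:
            block = src.read(bufsize)       # fill the next buffer...
            if pending is not None:
                pending.join()              # ...while the last write drains
            if not block:
                break
            pending = threading.Thread(target=dst.write, args=(block,))
            pending.start()                 # write proceeds as we read more
```

Only one write is ever outstanding, so ordering is preserved, and the single synchronisation point the application sees is the join - matching the "buffers are unavailable from when a transfer is started to when it is checked" constraint described above.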
Re: [Python-Dev] generic async io (was: microthreading vs. async io)
[EMAIL PROTECTED] wrote:
>
> Knowing the history of something like this is very helpful, but I'm not
> sure what you mean by this first paragraph. I think I'm most unclear
> about the meaning of "The 'threading' approach to asynchronous I/O"?
> Its opposite ("separating asynchronous I/O from threading") doesn't
> illuminate it much more. Could you elaborate?

I'll try. Sorry about being unclear - it is one of my failings. Here
is an example draft of some interfaces:

Threading - An I/O operation passes a buffer, length, file and action
and receives a token back. This token can be queried for completion,
waited on and so on, and is cancelled by waiting on it and getting a
status back. I.e. it is a thread-like object. This is the POSIX-style
operation, and is what I say cannot be made to work effectively.

Streaming - An I/O operation either writes some data to a stream or
reads some data from it; such actions are sequenced within a thread,
but not between threads (even if the threads coordinate their I/O).
Data written goes into limbo until it is read, and there is no way for
a reader to find the block boundaries it was written with or whether
data HAS been written. A non-blocking read merely tests if data are
ready for reading, which is not the same. There are no positioning
operations, and only open, close and POSSIBLY a heavyweight
synchronise or rewind (both equivalent to close+open) force written
data to be transferred. Think of Fortran sequential I/O without
BACKSPACE or C I/O without ungetc/ungetchar/fseek/fsetpos.

Transactions - An I/O operation either writes some data to a file or
reads some data from it. There is no synchronisation of any form
until a commit. If two transfers between a pair of commits overlap
(including file length changes), the behaviour is undefined. All I/O
includes its own positioning, and no positioning is relative.
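[Archive note: the "Threading" (token) style maps naturally onto what later Python calls futures. A minimal sketch using concurrent.futures - my choice of illustration, not something from the thread; the file and helper are made up for the example:]

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_at(path, offset, length):
    """The hypothetical I/O action: read `length` bytes at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

# Scratch file to read from.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello, async world")
os.close(fd)

pool = ThreadPoolExecutor(max_workers=2)
token = pool.submit(read_at, path, 7, 5)  # start the I/O, get a token back
token.done()                              # query for completion (non-blocking)
data = token.result()                     # wait on it and collect the result
pool.shutdown(wait=True)
os.unlink(path)
```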
Regards,
Nick Maclaren
Re: [Python-Dev] generic async io (was: microthreading vs. async io)
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> > An I/O operation passes a buffer, length, file and action and
> > receives a token back.
>
> You seem to be using the word "threading" in a completely different
> way than usual here, which may be causing some confusion.

Not really, though I may have been unclear again. Here is why that
approach is best regarded as a threading concept:

Perhaps the main current approach to using threads to implement
asynchronous I/O operates by the main threads doing just that, and
the I/O threads transferring the data synchronously. The reason that
a token is needed is to avoid a synchronous data copy that blocks
both threads.

My general point is that all experience is that asynchronous I/O is
best done by separating it completely from threads, and defining a
proper asynchronous but NOT threaded interface.

Regards,
Nick Maclaren
[Python-Dev] Class destructor
I am gradually making progress with my binary floating-point software,
but have had to rewrite several times as I have forgotten most of the
details of how to do it! After 30 years, I can't say I am surprised.

But I need to clean up workspace when a class (not object) is
deallocated. I can't easily use attributes, as people suggested,
because there is no anonymous storage built-in type. I could subvert
one of the existing storage types (buffer, string etc.), but that is
unclean. And I could write one, but that is excessive.

So far, I have been unable to track down how to get something called
when a class is destroyed. The obvious attempts all didn't work, in a
variety of ways. Surely there must be a method? This could be in
either Python or C. Thanks.

Regards,
Nick Maclaren
Re: [Python-Dev] Class destructor
"Phillip J. Eby" <[EMAIL PROTECTED]> wrote:
>
> > But I need to clean up workspace when a class (not object) is
> > deallocated. I can't easily use attributes, as people suggested,
> > because there is no anonymous storage built-in type. I could
> > subvert one of the existing storage types (buffer, string etc.),
> > but that is unclean. And I could write one, but that is excessive.
> >
> > So far, I have been unable to track down how to get something
> > called when a class is destroyed. The obvious attempts all didn't
> > work, in a variety of ways. Surely there must be a method? This
> > could be in either Python or C.
>
> Have you tried a PyCObject? This is pretty much what they're for:

Oh, yes, I use them in several places, but they don't really help.
Their first problem is that they take a 'void *' and not a request
for space, so I have to allocate and deallocate the space manually.
Now, I could add a destructor to each of them and do that, but it
isn't really much prettier than subverting one of the semi-generic
storage types for an improper purpose!

It would be a heck of a lot cleaner to deallocate all of my space in
exactly the converse way that I allocate and initialise it. It would
also allow me to collect and log statistics, should I so choose.
This could be VERY useful for tuning! I haven't done that, yet, but
might well do so.

All in all, what I need is some way to get a callback when a class
object is destroyed. Well, actually, any time from its last use for
object work and the time that its space is reclaimed - I don't need
any more precise time than that.

I suppose that I could add a C object as an attribute that points to
a block of memory that contains copies of all my workspace pointers,
and use the object deallocator to clean up. If all else fails, I will
try that, but it seems a hell of a long way round for what I would
have thought was a basic requirement.

Regards,
Nick Maclaren
Re: [Python-Dev] Class destructor
"Phillip J. Eby" <[EMAIL PROTECTED]> wrote:
>
> Well, you could use a custom metaclass with a tp_dealloc or whatever.

Yes, I thought of that, but a custom metaclass to provide one callback
is pretty fair overkill!

> But I just mainly meant that a PyCObject is almost as good as a
> weakref for certain purposes -- i.e. it's got a pointer and a
> callback.

Ah. Yes. Thanks for suggesting it - if it is the simplest way, I may
as well do it.

> You could of course also use weak references, but that's a bit more
> awkward as well.

Yes. And they aren't a technology I have used (in Python), so I would
have to find out about them. Attributes etc. I have already played
with.

Regards,
Nick Maclaren
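[Archive note: for the pure-Python route, a weak reference to the class object itself gives exactly the "callback when the class is destroyed" being asked for. A small sketch - my own illustration, with made-up names, not code from the thread:]

```python
import gc
import weakref

events = []

def make_class():
    class Workspace(object):
        """Stands in for the derived class holding scratch space."""
    # The callback fires when the class object itself is reclaimed.
    wr = weakref.ref(Workspace, lambda ref: events.append("class freed"))
    return Workspace, wr

cls, wr = make_class()
del cls
gc.collect()   # class objects sit in reference cycles, so force a collection
```

Note the caveat raised later in the thread: by the time the callback runs, the class has already been torn down, so any statistics must be copied out beforehand.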
Re: [Python-Dev] Class destructor
"Guido van Rossum" <[EMAIL PROTECTED]> wrote: > > Can you explain the reason for cleaning up in this scenario? Are you > rapidly creating and destroying temporary class objects? Why can't you > rely on the regular garbage collection process? Or does you class > create an external resource like a temp file? Effectively the latter. The C level defines a meta-class, which is instantiated with a specific precision, range etc. to derive the class that can actually be used. There can be an arbitrary number of such derived classes, with different properties. Very like Decimal, but with the context as part of the derived class. The instantiation creates quite a lot of constants and scratch space, some of which are Python objects but others of which are just Python memory (PyMem_Malloc); this is where an anonymous storage built-in type would be useful. The contents of these are of no interest to any Python code, and even the objects are ones which mustn't be accessed by the exported interfaces. Also, on efficiency grounds, all of those need to be accessible by C pointers from the exported class. Searching by name every time they are needed is far too much overhead. Note that, as with Decimal, the issue is that they are arbitrary sized and therefore can't simply be put in the class structure. Now, currently, I have implemented the suggestion of using the callback on the C object that points to the structure that contains the pointers to all of those. I need to investigate it in more detail, because I have had mixed success - that could well be the result of another bug in my code, so let's not worry about it. In THIS case, I am now pretty sure that I don't need any more, but I can imagine classes where it wouldn't be adequate. In particular, THIS code doesn't need to do anything other than free memory, so I don't care whether the C object attribute callback is called before or after the class object is disposed of. But that is obviously not the case in general. 
Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Class destructor
Sorry about a second message, but I mentioned this aspect earlier, and
it is semi-independent.

If I want to produce statistics, such as the times spent in various
operations, I need a callback when the class is disposed of. Using the
C object attribute callback for that WOULD be inconvenient, unless I
could be sure it would be called while the class structure is still
around. That could be resolved by taking a copy, of course, but that is
messy.

This also relates to one of my problems with the callback. I am not
being called back if the class is still live at program termination;
classes whose use counts drop to zero do cause a callback, but not
those whose use count is still above zero. I am not sure whether this
is my error or a feature of the garbage collector. If the latter, it
doesn't matter from the point of view of freeing space, but it is
assuredly a real pain for producing statistics. I haven't looked into
it, as it is not an immediate task.

Regards,
Nick Maclaren.
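For what it is worth, at the pure-Python level modern CPython's
weakref.finalize does run still-pending finalizers at interpreter exit
(its atexit parameter defaults to True), which is one way to get
statistics out of objects that are still live at termination. The
C-level situation the post describes was different; this sketch (with
hypothetical names) just demonstrates the Python-level behaviour, run
in a subprocess so the shutdown output can be observed:

```python
import subprocess
import sys
import textwrap

# A script whose object is still alive when the interpreter exits;
# the finalizer fires anyway, at shutdown.
script = textwrap.dedent("""
    import weakref

    class Tracked:
        pass

    obj = Tracked()                      # still live at exit
    weakref.finalize(obj, print, "stats: 1 op recorded")
""")

out = subprocess.run([sys.executable, "-c", script],
                     capture_output=True, text=True).stdout
# out contains "stats: 1 op recorded"
```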
[Python-Dev] except Exception as err, tb [was: with_traceback]
"Jim Jewett" <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: > > > Since this can conceivably be going on in parallel in multiple > > threads, we really don't ever want to be sharing whatever object > > contains the head of the chain of tracebacks since it mutates at every > > frame bubble-up. > > So (full) exceptions can't be unitary objects. > > In theory, raising an already-instantiated instance could indicate "no > traceback", which could make pre-cooked exceptions even lighter. Grrk. I think that this is right, but the wrong way to think of it! If we regard a kind of exception as a class, and an actual occurrence as an instance, things become a lot cleaner. The class is very simple, because all it says is WHAT happened - let's say divide by zero, or an attempt to finagle an object of class chameleon. The instance contains all of the information about the details, such as the exact operation, the values and the context (including the traceback). It CAN'T be an object, because it is not 'assignable' (i.e. a value) - it is inherently bound to its context. You can turn it into an object by copying its context into an assignable form, but the actual instance is not assignable. This becomes VERY clear when you try to implement advanced exception handling - rare nowadays - including the ability to trap exceptions, fix up the failure and continue (especially in a threaded environment). This makes no sense whatsoever in another context, and it becomes clear that the action of turning an instance into an object disables the ability to fix up the exception and continue. You can still raise a Python-style exception (i.e. abort up to the closest handler), but you can't resume transparently. I have implemented such a system, IBM CEL was one, and VMS had/has one. I don't know of any in the Unix or Microsoft environments, but there may be a few in specialised areas. 
Harking back to your point, your "already-instantiated instance" is
actually an object derived directly from the exception class, and
everything becomes clear. Because it is an object, any context it
includes was a snapshot and is no longer valid. In your case, you would
want it to have "context: unknown".

Regards,
Nick Maclaren.
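In today's Python the distinction drawn above is directly visible: the
context of an occurrence is bound to the instance at raise time via
__traceback__, and a pre-cooked instance carries no context until it is
actually raised. A minimal sketch:

```python
import traceback

# A pre-cooked exception is just an object; it has no context yet.
precooked = ValueError("pre-cooked")
assert precooked.__traceback__ is None

def fail():
    # Raising binds the occurrence's context to the instance.
    raise precooked

try:
    fail()
except ValueError as err:
    # err.__traceback__ is a snapshot of THIS occurrence.
    frames = traceback.extract_tb(err.__traceback__)

# The innermost frame of the captured context is fail() itself.
```

This also illustrates why reusing one pre-instantiated exception across
threads would be problematic: each raise overwrites the shared
instance's __traceback__.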
Re: [Python-Dev] Access to bits for a PyLongObject
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Eric V. Smith schrieb:
> > I'm working on PEP 3101, Advanced String Formatting. About the only
> > built-in numeric formatting I have left to do is for converting a
> > PyLongObject to binary.
> >
> > I need to know how to access the bits in a PyLong.
>
> I think it would be a major flaw in PEP 3101 if you really needed it.
> The long int representation should be absolutely opaque - even the
> fact that it is a sign+magnitude representation should be hidden.

Well, it depends on the level for which PEP 3101 is intended. I had the
same problem, and was pointed at _PyLong_AsByteArray, which was all I
needed. In my case, going through a semi-generic formatter would not
have been an acceptable interface (performance). However, if PEP 3101
is intended to be a higher level of formatting, then I agree with you.

So I have nailed my colours firmly to the fence :-)

Regards,
Nick Maclaren.
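For reference, at the pure-Python level this job can now be done
without touching the internal long representation at all - format()
produces the binary text PEP 3101 wanted, and int.to_bytes exposes the
magnitude as bytes much as _PyLong_AsByteArray does in C. A minimal
sketch:

```python
n = 3101

# Binary text formatting (what the PEP 3101 work above needed):
binary_text = format(n, "b")       # same as f"{n:b}"

# Raw magnitude as a byte array, big-endian:
raw = n.to_bytes(2, "big")
back = int.from_bytes(raw, "big")  # round-trips to the original value
```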
Re: [Python-Dev] Status of thread cancellation
Grrk. I have done this myself, and been involved in one of the VERY few
commercial projects that attempted to do it properly (IBM CEL, the
other recent one being VMS). I am afraid that there are a lot of
misapprehensions here. Several people have said things like:

> The thing to model this on, I think, would be the
> BSD sigmask mechanism, which lets you selectively
> block certain signals to create a critical section
> of code. A context manager could be used to make
> its use easier and less error-prone (i.e. harder
> to block async exceptions and then forget to unblock
> them).

No, no, no! That is TRULY horrible! It works fairly well for things
like device drivers, which are both structurally simple and have no
higher-level recovery mechanism, so that a failure turning into a hard
hang is not catastrophic. But it is precisely what you DON'T want for
complex applications, especially when a thread may need to call an
external service 'non-interruptibly'.

Think of updating a complex object in a multi-file database, for
example. Interrupting half-way through leaves the database in a mess,
but blocking interrupts while (possibly remote) file updates complete
is asking for a hang. You also see it in horrible GUI (including
raw-mode text) programs that won't accept interrupts until you have
completed the action they think you have started. One of the major
advantages of networked systems is that you can usually log in remotely
and kill -9 the damn process!

The way that I, IBM and DEC approached it was by the classic callback
mechanism, with a carefully designed way of promoting unhandled
exceptions/interrupts. For example, the following is roughly what I did
(somewhat extended, as I didn't do all of this for all exceptions):

An event set a defined flag, which could be tested (and cleared) by the
thread. If a second, similar event arrived (or it was not handled after
a suitable time), the event was escalated.
If so, a handler was called that HAD to return (again within a specific
time). If a second, similar event arrived or it didn't return by a
suitable time, the event was escalated.

If so, another handler was called that COULDN'T return. If another
event arrived, it returned, or it failed to close down the thread, the
event was escalated.

If so, the thread's built-in environment was closed down without giving
the thread a chance to intervene. If that failed, the event was
escalated.

If so, the thread was frozen and process termination started. If clean
termination failed, the event was escalated.

If so, the run-time system produced a dump and killed itself.

You can implement a BSD-style ignore by having an initial handler that
just clears the flag and returns, but a third interrupt before it does
so will force close-down. There was also a facility to escalate an
exception at the point of generation, which could be useful for
forcible closedown.

There are a zillion variations of the above, but all mainframe
experience is that callbacks are the only sane way to approach the
problem IN APPLICATIONS. In kernel code, that is not so, which is why
so many computer scientists design BSD-style handling (i.e. they think
of kernel programming rather than very complex application
programming).

> Unconditionally killing a whole process is no big
> problem because all the resources it's using get
> cleaned up by the OS, and the effect on other
> processes is minimal and well-defined (pipes and
> sockets get EOF, etc.). But killing a thread can
> leave the rest of the program in an awkward state.

I wish that were so :-( Sockets, terminals etc. are stateful devices,
and killing a process can leave them in a very unclean state. It is one
of the most common causes of unkillable processes (the process can't go
until its files do, and the socket is jammed).
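The escalation ladder described earlier in this message can be sketched
as a tiny state machine. This is purely illustrative - the level names
and the single-flag model are my own simplification, and deadline-based
escalation (the "suitable time" steps) is not modelled:

```python
import enum

class Level(enum.IntEnum):
    FLAG = 0        # set a flag the thread may test and clear
    HANDLER = 1     # call a handler that HAS to return
    TERMINATOR = 2  # call a handler that CANNOT return
    TEARDOWN = 3    # close the thread down without its cooperation
    KILL = 4        # freeze the thread, start process termination

class Escalator:
    def __init__(self):
        self.level = Level.FLAG
        self.flag = False

    def event(self):
        """Deliver one exception/interrupt; escalate if unhandled."""
        if self.level is Level.FLAG and not self.flag:
            self.flag = True  # first event: just set the flag
        else:
            # A second similar event (or, in the real system, a missed
            # deadline) promotes the response one level.
            self.level = Level(min(self.level + 1, Level.KILL))

    def handled(self):
        """The thread tested and cleared the flag in time."""
        self.flag = False
        self.level = Level.FLAG

esc = Escalator()
esc.event()   # first event: flag set, no escalation
esc.event()   # second similar event: escalate to HANDLER
```

A BSD-style "ignore" falls out naturally: a handler that only calls
handled() and returns behaves like blocking the event, yet repeated
events still force close-down eventually.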
Many people can witness the horrible effects of ptys being left in
'echo off' or worse states, the X focus being left in a stuck
override-redirect window, and so on. But you also have the multi-file
database problem, which also applies to shared memory segments. Even if
the process dies cleanly, it may be part of an application whose state
is global across many processes. One common example is adding or
deleting a user, where an unclean kill can leave the system in a very
weird state.

Regards,
Nick Maclaren.
Re: [Python-Dev] Status of thread cancellation
Jon Ribbens <[EMAIL PROTECTED]> wrote:
> Can you elaborate on this? You can get zombie entries in the process
> table if nobody's called 'wait()' on them, and you can (extremely
> rarely) get unkillable processes in 'disk-wait' state (usually due to
> hardware failure or a kernel bug, I suspect), but I've never heard
> of a process on a Unix-like system being unkillable due to something
> to do with sockets (or any other kind of file descriptor for that
> matter). How could a socket be 'jammed'? What does that even mean?

Well, I have seen it hundreds of times on a dozen different Unices; it
is very common. You don't always SEE the stuck process - sometimes the
'kill -9' causes the pid to become invisible to ps etc., and just
occasionally it can continue to use CPU until the system is rebooted.
That is rare, however, and it normally just hangs onto locks, memory
and other such resources. Very often its vampiric status is visible
only because such things haven't been freed, or when you poke through
kernel structures.

Sockets get jammed because they are used to connect to subprocesses or
kernel threads, which in turn access unreliable I/O devices. If there
is a glitch on the device, the error recovery very often fails to work
cleanly, and may wait for an event that will never occur or go into a
loop (usually a sleep/poll loop). Typically, a HIGHER level then times
out the failing error recovery, so the ordinary programmer doesn't
notice. But it very often fails to kill the lower-level code. As far as
applications are concerned, a jammed socket is one where the
higher-level recovery has NOT done that, and is waiting for the lower
level to complete - which it isn't going to do!

The other effect that ordinary programmers notice is a system very
gradually starting to run down after days, weeks or months of continual
operation. The state is cleared by rebooting.

Regards,
Nick Maclaren.