[Python-Dev] GC Changes
Hello,

I've been doing some tests on removing the GIL, and it's becoming clear that some basic changes to the garbage collector may be needed in order for this to happen efficiently. Reference counting as it stands today is not very scalable.

I've been looking into a few options, and I'm leaning towards implementing IBM's Recycler GC (http://www.research.ibm.com/people/d/dfb/recycler-publications.html) since it is very similar to what is in place now from the users' perspective. However, I haven't been around the list long enough to really understand the feeling in the community on GC in the future of the interpreter. It seems that a full GC might have a lot of benefits in terms of performance and scalability, and I think that the current gc module is of the mark-and-sweep variety. Is the trend going to be to move away from reference counting and towards the mark-and-sweep implementation that currently exists, or is reference counting a firmly ingrained tradition?

On a more immediately relevant note, I'm not certain I understand the full extent of the gc module. From what I've read, it sounds like it's fairly close to a fully functional GC, yet it seems to exist only as a cycle-detecting backup to the reference counting mechanism. Would somebody care to give me a brief overview of how the current gc module interacts with the interpreter, or point me to a place where that is done? Why isn't the mark-and-sweep mechanism used for all memory management?

Thanks a lot!
Justin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
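As a rough illustration of the division of labour Justin is asking about, here is a minimal sketch, assuming CPython (other implementations manage memory differently). It is not taken from the thread; the class `Node` and the function `demo` are hypothetical names. It shows reference counting reclaiming an acyclic object immediately, while a reference cycle is only reclaimed once the gc module's cycle detector runs.

```python
import gc
import weakref


class Node:
    """Hypothetical container object that can point at another Node."""

    def __init__(self):
        self.other = None


def demo():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle detector from running on its own schedule
    try:
        # Acyclic case: dropping the last reference frees the object at once.
        a = Node()
        a_ref = weakref.ref(a)
        del a
        freed_by_refcount = a_ref() is None

        # Cyclic case: the two objects keep each other's count above zero.
        b, c = Node(), Node()
        b.other, c.other = c, b
        b_ref = weakref.ref(b)
        del b, c
        survives_refcount = b_ref() is not None

        gc.collect()  # explicit run of the cycle detector
        freed_by_gc = b_ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return freed_by_refcount, survives_refcount, freed_by_gc
```

Under those assumptions, the three flags returned by `demo()` should all be true: the first object dies by reference counting alone, and the cycle only goes away after `gc.collect()`.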
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
On 9/30/07, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Michael Foord wrote:
> > We stick to using the .NET file I/O and so don't
> > have a problem. The only time it is an issue for us is our tests, where
> > we have string literals in our test code (where new lines are obviously
> > '\n')
>
> If you're going to do that, you really need to be consistent
> about it and have IronPython use \r\n internally for line endings
> *everywhere*, including string literals.

I don't know what you mean by "internally". There's lots of portable code that uses the \n character in string literals (either to generate line endings or to recognize them). That code can't suddenly be made invalid. And changing all string literals that say "\n" to secretly become "\r\n" would be worse than the \r <--> \n swap that some old Apple tools used to do. (If len("\n") == 2, what would len("\r\n") be?)

> > It is just slightly ironic that the time Python 'gets it wrong' (for
> > some value of wrong) is when you are using text mode for I/O :-)
>
> I would say IronPython is getting it wrong by using inconsistent
> internal representations of line endings.

Honestly, I find it hard to see much merit in this discussion. A number of Python libraries, including print() and io.py, use \n to represent line endings in memory, and translate these to/from platform-appropriate line endings when reading/writing text files. OTOH, some other APIs, for example, sockets talking various internet protocols (from SMTP to HTTP) as well as most (all?) native .NET APIs, use \r\n to represent line endings. There are any number of ways to convert between these conventions, including various invocations of s.replace() and s.splitlines() (the latter does a universal-newlines-like thing).

Applications can take care of this, and APIs can choose to use either convention for line endings (or both, in the case of input). Yes, occasionally users get confused. Too bad. They'll have to learn about this issue. The issue isn't going away by wishing it to go away; it is a fundamental difference between Windows and Unix, and neither is likely to change or disappear.

Changing Python to use the Windows convention internally isn't going to help one bit. Changing Python to use the platform's convention is impossible without introducing a new string escape that would mean \r\n on Windows and \n on Unix; and given that there are legitimate reasons to sometimes deal with \r\n explicitly even on Unix (and with just \n even on Windows) we wouldn't be completely isolated from the issue. Changing APIs to not represent the line ending as a character (as the Java I/O libraries do) would be too big a change (and how would we distinguish between readline() returning an empty line and EOF?) -- and I'm sure the issue still pops up in plenty of places in Java.

The best solution for IronPython is probably to have the occasional wrapper around .NET APIs that translates between \r\n and \n on the boundary between Python and .NET; but one must be able to turn this off or bypass the wrappers in cases where the data retrieved from one .NET API is just passed straight on to another .NET API (and the translation would just cause two redundant copies to be made).

Get used to it. End of discussion.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
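The boundary translation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not IronPython's actual mechanism: the helper names `from_dotnet` and `to_dotnet` are hypothetical, and only the \r\n and \n conventions are handled (a lone \r is left untouched).

```python
def from_dotnet(text):
    # Hypothetical helper: text arriving from a "\r\n"-style API is
    # converted to the in-memory Python convention of a bare "\n".
    return text.replace("\r\n", "\n")


def to_dotnet(text):
    # Hypothetical helper: normalise first, so that text which already
    # contains "\r\n" does not end up with "\r\r\n" after the replace.
    return from_dotnet(text).replace("\n", "\r\n")
```

For example, `to_dotnet("a\nb\r\n")` gives `"a\r\nb\r\n"`, and `from_dotnet` turns that back into `"a\nb\n"`. Code that merely passes a string from one .NET-style API to another would skip both helpers, which is the bypass the message asks for.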
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Michael Foord wrote:
> We stick to using the .NET file I/O and so don't
> have a problem. The only time it is an issue for us is our tests, where
> we have string literals in our test code (where new lines are obviously
> '\n')

If you're going to do that, you really need to be consistent about it and have IronPython use \r\n internally for line endings *everywhere*, including string literals.

> It is just slightly ironic that the time Python 'gets it wrong' (for
> some value of wrong) is when you are using text mode for I/O :-)

I would say IronPython is getting it wrong by using inconsistent internal representations of line endings.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
[EMAIL PROTECTED]                  +--------------------------------------+
Re: [Python-Dev] New lines, carriage returns, and Windows
[EMAIL PROTECTED] wrote:
> I've been thinking about this some more (in lieu of actually writing up any
> sort of proposal ;-) and I'm not so sure it would be all that useful.

Yes, despite being the one who suggested it, I've come to the same conclusion myself. The problem should really be addressed at the source, which is the Python/.NET boundary. Anything else would just lead to ambiguity.

So I'm voting -1 on my own proposal here.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
[EMAIL PROTECTED]                  +--------------------------------------+
Re: [Python-Dev] New lines, carriage returns, and Windows
Nick Maclaren wrote:
> I have implemented both of those two models
> on systems that are FAR more different than most people can imagine.
> Both work, and neither causes confusion. The C/Unix/Python one does.

Now I'm not sure what *you* mean by the C/Unix/Python model. As far as newlines are concerned, the internal model is fine as far as I can see.

>> a mismatch between
>> the world of Python strings which use "\n" and .NET
>> library code expecting strings which use "\r\n".
>
> That's an I/O problem :-)

If you define passing a string to/from any .NET function as I/O, I suppose that's true, but it's not what people normally mean by the term.

> the REASON it causes trouble is the inconsistency
> in the basic C/Unix/Python text I/O model. Let's consider just
> \f, \r and \n,

But we're not talking about \f or anything else here, only newlines. I've never heard anyone complain about getting confused over the handling of \f in Python. That may be because nobody uses \f for anything these days, but whatever the reason, it seems to be a non-issue.

In any case, it still doesn't mean that you "don't get back what you wrote". If you write "\f\n" to a file using Python and read it back, you get "\f\n". If you write just "\f", you get back "\f". What the \f *means* is a separate issue.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
[EMAIL PROTECTED]                  +--------------------------------------+
Re: [Python-Dev] OpenSSL httplib bug
On 9/30/07, Richie Ward <[EMAIL PROTECTED]> wrote:
> I was using httplib to power my xml rpc script.
>
> I had problems when I wanted to use SSL and I got this error:
>   File "/usr/lib/python2.5/httplib.py", line 1109, in recv
>     return self._ssl.read(len)
> socket.sslerror: (8, 'EOF occurred in violation of protocol')
>
> I figured out this was because of poor error handling in python.
>
> May I suggest this as a fix to this bug:
> $ diff /usr/lib/python2.5/httplib.py /usr/lib/python2.5/httplib.py~
> 1109,1112c1109
> <         try:
> <             return self._ssl.read(len)
> <         except socket.sslerror:
> <             return
> ---
> >         return self._ssl.read(len)
>
> Just a note. I am by no means a python expert, just good enough to get
> my work done.
> I use Ubuntu gutsy.

If you could, Richie, please open a bug report at bugs.python.org and attach your patch to it? That way it won't get lost and it can be attended to properly.

-Brett
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Nick Maclaren wrote:
> I don't know PRECISELY what you mean by "universal newlines mode"

I mean precisely what Python means by the term: any of "\r", "\n" or "\r\n" represent a newline, and no distinction is made between them.

You only need to use that if you don't know what convention is being used by the file you're reading. And if you don't know that, you've already lost information about what the contents of the file means, and there's nothing that any I/O system can do to get it back.

> Model 1: certain characters can be used only in combination.
> ...

That's all fine if you know the file adheres to those conventions. Just open it in binary mode and go for it.

The I/O systems of C and/or Python are designed for environments where the files *don't* adhere to conventions as helpful as that. They're making the best of what they're given.

> Note that the above is what the program sees - what is written
> to the outside world and how input is read is another matter.
>
> But I can assure you, from my own and many other people's experience,
> that neither of the above models cause the confusion being shown by
> the postings in this thread.

There's no confusion about how newlines are represented *inside* a Python program. The convention is quite clear - a newline is "\n" and only "\n".

Confusion only arises when people try to process strings internally that don't adhere to that convention.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
[EMAIL PROTECTED]                  +--------------------------------------+
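To make the definition above concrete, here is a small sketch of universal newlines on input. It assumes the Python 3 `io` module (the design described in PEP 3116) rather than the 2.5-era "U" file mode under discussion, and the function names `read_universal` and `read_untranslated` are hypothetical. With `newline=None` all three conventions come back as "\n"; with `newline=""` the bytes are decoded without translation, which is the behaviour to choose when the distinction matters.

```python
import io


def read_universal(raw_bytes):
    # newline=None: "\r", "\n" and "\r\n" are all returned as "\n".
    stream = io.TextIOWrapper(io.BytesIO(raw_bytes),
                              encoding="ascii", newline=None)
    return stream.read()


def read_untranslated(raw_bytes):
    # newline="": line endings are passed through exactly as stored.
    stream = io.TextIOWrapper(io.BytesIO(raw_bytes),
                              encoding="ascii", newline="")
    return stream.read()
```

Reading `b"one\r\ntwo\rthree\n"` through `read_universal` yields `"one\ntwo\nthree\n"`, so the information about which convention each line used is gone, exactly as the message says. `read_untranslated` keeps it.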
[Python-Dev] OpenSSL httplib bug
I was using httplib to power my xml rpc script.

I had problems when I wanted to use SSL and I got this error:
  File "/usr/lib/python2.5/httplib.py", line 1109, in recv
    return self._ssl.read(len)
socket.sslerror: (8, 'EOF occurred in violation of protocol')

I figured out this was because of poor error handling in python.

May I suggest this as a fix to this bug:
$ diff /usr/lib/python2.5/httplib.py /usr/lib/python2.5/httplib.py~
1109,1112c1109
<         try:
<             return self._ssl.read(len)
<         except socket.sslerror:
<             return
---
>         return self._ssl.read(len)

Just a note. I am by no means a python expert, just good enough to get my work done.
I use Ubuntu gutsy.

--
Thanks, Richie Ward
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Michael Foord <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> >
> >     Michael> Actually, I usually get these strings from Windows UI
> >     Michael> components. A file containing '\r\n' is read in with '\r\n'
> >     Michael> being translated to '\n'. New user input is added containing
> >     Michael> '\r\n' line endings. The file is written out and now contains a
> >     Michael> mix of '\r\n' and '\r\r\n'.
> >
> > So you need a translation layer between the UI component and your code.
> > Treat the component as a text file and perform the desired mapping. Yes?
>
> Actually the problem was reported by one of the IronPython developers on
> behalf of another user. We stick to using the .NET file I/O and so don't
> have a problem. The only time it is an issue for us is our tests, where
> we have string literals in our test code (where new lines are obviously
> '\n') and we do a manual 'replace'. Not very difficult.
>
> It is just slightly ironic that the time Python 'gets it wrong' (for
> some value of wrong) is when you are using text mode for I/O :-)

Plus ca change.

That has been the problem for as long as I have been using the byte stream model (nearly 40 years now). Provided that you can get control, OR there are well-defined semantics, you can sort things out.

The semantics "we define only the trivial case, and the programmer must do something arcane, undefined and system-dependent for the rest" means that it is impossible for an interface to do the 'right' translation unless it knows what each side of it is assuming.

As I say, there are solutions.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
[EMAIL PROTECTED] wrote:
>     Michael> Actually, I usually get these strings from Windows UI
>     Michael> components. A file containing '\r\n' is read in with '\r\n'
>     Michael> being translated to '\n'. New user input is added containing
>     Michael> '\r\n' line endings. The file is written out and now contains a
>     Michael> mix of '\r\n' and '\r\r\n'.
>
> So you need a translation layer between the UI component and your code.
> Treat the component as a text file and perform the desired mapping. Yes?
>

Actually the problem was reported by one of the IronPython developers on behalf of another user. We stick to using the .NET file I/O and so don't have a problem. The only time it is an issue for us is our tests, where we have string literals in our test code (where new lines are obviously '\n') and we do a manual 'replace'. Not very difficult.

It is just slightly ironic that the time Python 'gets it wrong' (for some value of wrong) is when you are using text mode for I/O :-)

Michael

> Skip
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
    Michael> Actually, I usually get these strings from Windows UI
    Michael> components. A file containing '\r\n' is read in with '\r\n'
    Michael> being translated to '\n'. New user input is added containing
    Michael> '\r\n' line endings. The file is written out and now contains a
    Michael> mix of '\r\n' and '\r\r\n'.

So you need a translation layer between the UI component and your code. Treat the component as a text file and perform the desired mapping. Yes?

Skip
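The mixed '\r\n' / '\r\r\n' output described above is easy to reproduce. The following is a sketch under stated assumptions: it uses the Python 3 `io` module with `newline="\r\n"` to imitate Windows text-mode output on any platform, and `write_windows_text` and `normalise` are hypothetical names. The second call shows the suggested translation layer applied to the UI string before it is mixed with text read from the file.

```python
import io


def write_windows_text(text):
    # Imitate a Windows text-mode file: every "\n" written becomes "\r\n".
    raw = io.BytesIO()
    out = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    out.write(text)
    out.flush()
    return raw.getvalue()


def normalise(text):
    # Translation layer for strings coming from a "\r\n"-style component.
    return text.replace("\r\n", "\n")


from_file = "first line\n"      # read in text mode, so already "\n"
from_ui = "second line\r\n"     # as delivered by the UI component

broken = write_windows_text(from_file + from_ui)
fixed = write_windows_text(from_file + normalise(from_ui))
```

Here `broken` ends in `\r\r\n` because the "\n" inside the UI string is translated a second time, while `fixed` uses `\r\n` consistently.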
Re: [Python-Dev] New lines, carriage returns, and Windows
[EMAIL PROTECTED] wrote:
>
> I've been thinking about this some more (in lieu of actually writing up any
> sort of proposal ;-) and I'm not so sure it would be all that useful. If
> you've opened a file in text mode you should only be writing newlines as
> '\n' anyway. If you want to translate a text file imported from another
> system to use the current system's line ending just open both the input and
> output files in text mode.

I.e. at least \r, \f and \v are discouraged - i.e. system-dependent, at best. That works.

> With universal newlines mode for output, should writing '\r\n' result in one
> or two newlines (or one-and-a-half)? Depending on the platform you can
> argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows
> that it should be left alone as '\r\n'. There is, of course, the current
> '\r\r\n' behavior as well. I don't think there's obviously one best answer.

Quite. And it has nothing to do with the format the outside system uses - your first question is purely a matter of what the semantics of the Python program are. The question applies as much to zOS as to any of the systems Python supports.

> If you want to do something esoteric, open the file in binary mode and do
> whatever you like.

Er, no. That's the Unix mistake. It works, provided two things are true:

1) You don't need to write portable formatting.
2) The 'outside system' uses the control characters of a byte stream for formatting.

Let's skip (1) - but (2) is universally true, nowadays, isn't it? Er, no. Consider reading and writing to an X window (NOT an xterm). Such formatting is out-of-band (sorry, I used out-of-bound in a previous posting). Ouch.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679
Re: [Python-Dev] New lines, carriage returns, and Windows
    Greg> Maybe there should be a universal newlines mode defined for output
    Greg> as well as input, which translates any of "\r", "\n" or "\r\n"
    Greg> into the platform line ending.

    Skip> I'd be open to such a change. Principle of least surprise?

    Guido> The symmetry isn't as strong as you suggest, but I agree it would
    Guido> be a useful feature. Would you mind filing a Py3k feature request
    Guido> so we don't forget?

    Guido> A proposal for an API given the existing newlines=... parameter
    Guido> (described in detail in PEP 3116) would be even better.

I've been thinking about this some more (in lieu of actually writing up any sort of proposal ;-) and I'm not so sure it would be all that useful. If you've opened a file in text mode you should only be writing newlines as '\n' anyway. If you want to translate a text file imported from another system to use the current system's line ending just open both the input and output files in text mode.

With universal newlines mode for output, should writing '\r\n' result in one or two newlines (or one-and-a-half)? Depending on the platform you can argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows that it should be left alone as '\r\n'. There is, of course, the current '\r\r\n' behavior as well. I don't think there's obviously one best answer.

If you want to do something esoteric, open the file in binary mode and do whatever you like.

Skip
Re: [Python-Dev] New lines, carriage returns, and Windows
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> I don't see how this is different from Unix/C "\n" being
> an atomic newline character.

Have you used systems with the I/O models I referred to (or ones with newlines being out-of-bound data)?

> If you're saying that BCPL is better because it defines
> standard semantics for more control characters than just
> "\n", that may be true, but C is doing about the best it
> can with "\n" as far as I can see, given all the crazy
> things that different OSes want to do with line endings.

I am afraid that you are wrong - see my other posting for how to do it better. Look, I have implemented both of those two models on systems that are FAR more different than most people can imagine. Both work, and neither causes confusion. The C/Unix/Python one does.

> In any case, the problem which started all this isn't
> really an I/O problem at all, it's a mismatch between
> the world of Python strings which use "\n" and .NET
> library code expecting strings which use "\r\n".

That's an I/O problem :-)

> The correct thing to do with that is to translate whenever
> a string crosses a boundary between Python code and
> .NET code. This is something that ought to be done
> automatically by the Python/.NET interfacing machinery,
> maybe by having a different type for .NET strings.

Agreed. But the REASON it causes trouble is the inconsistency in the basic C/Unix/Python text I/O model. Let's consider just \f, \r and \n, and a few questions:

Exactly what does a free-standing \f mean?

Does \n\f\n mean starting at the top of a page or one line down?

How do \r and \f interact with line-buffering? Think about MacOS here.

I could go on, but those are enough to indicate that the problem is insoluble. The answer "Undefined but not even explicitly discouraged" is a recipe for confusion.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Greg Ewing <[EMAIL PROTECTED]> wrote:
>
> > Grrk. That's the problem. You don't get back what you have written
>
> You do as long as you *don't* use universal newlines mode
> for reading. This is the best that can be done, because
> universal newlines are inherently ambiguous.

I don't know PRECISELY what you mean by "universal newlines mode", and this issue is all about the details, so any response would merely enhance the confusion.

> If you want universal newlines, you just have to accept
> that you can't also have \r characters meaning something
> other than newlines in your files. This is true regardless
> of what programming language or I/O model is being used.

No, that is not true, and I have used more than one model where it wasn't. Let's stick to models where newlines are special characters - I prefer the ones where they are not, but that is by the way.

Model 1: certain characters can be used only in combination. E.g. \f must occur immediately before (or after) a \n, which it modifies. \r is either a newline-with-overprint or must be associated with a \n. In both cases, only ONE of the alternatives is permitted in the chosen model - the other use then becomes an error (and raises an exception).

Model 2: (BCPL) there are a variety of newline characters, \n for plain newline, \f for newline-with-form-feed and \r for newline-with-overprint. ALL cause a newline, with the associated property.

Note that the above is what the program sees - what is written to the outside world and how input is read is another matter.

But I can assure you, from my own and many other people's experience, that neither of the above models cause the confusion being shown by the postings in this thread.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: [EMAIL PROTECTED]
Tel.: +44 1223 334761    Fax: +44 1223 334679