Re: Blog about python 3
On Wednesday, 8 January 2014 01:02:22 UTC+1, Terry Reedy wrote: On 1/7/2014 9:54 AM, Terry Reedy wrote: On 1/7/2014 8:34 AM, wxjmfa...@gmail.com wrote: On Sunday, 5 January 2014 23:14:07 UTC+1, Terry Reedy wrote: Memory: Point 2. A *design goal* of FSR was to save memory relative to UTF-32, which is what you apparently prefer. Your examples show that the FSR successfully met its design goal. But you call that success, saving memory, 'wrong'. On what basis? Point 2: This Flexible String Representation does not effectuate any memory optimization. It only succeeds in doing the opposite of what a correct usage of utf* does. Since the FSR *was* successful in saving memory, and indeed shrank the Python binary by about a megabyte, I have no idea what you mean. Tim Delaney apparently did, and answered on the basis of his understanding. Note that I said that the design goal was 'save memory RELATIVE TO UTF-32', not 'optimize memory'. UTF-8 was not considered an option. Nor was any form of arithmetic coding https://en.wikipedia.org/wiki/Arithmetic_coding to truly 'optimize memory'. The FSR acts more as a coding scheme selector than as a code point optimizer. Claiming that it saves memory is some kind of illusion; a little like saying Py2.7 uses relatively less memory than Py3.2 (UCS-2).

>>> sys.getsizeof('a' * 10000 + 'z')
10026
>>> sys.getsizeof('a' * 10000 + '€')
20040
>>> sys.getsizeof('a' * 10000 + '\U00010000')
40044
>>> sys.getsizeof('€' * 10000 + '€')
20040
>>> sys.getsizeof('€' * 10000 + '\U00010000')
40044
>>> sys.getsizeof('\U00010000' * 10000 + '\U00010000')
40044

jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/8/2014 4:59 AM, wxjmfa...@gmail.com wrote: [responding to me] The FSR acts more as a coding scheme selector That is what PEP 393 describes and what I and many others have said. The FSR saves memory by selecting, from three choices, the most compact coding scheme for each string. I ask again, have you read PEP 393? If you are going to critique the FSR, you should read its basic document. than as a code point optimizer. I do not know what you mean by 'code point optimizer'. Claiming that it saves memory is some kind of illusion; Do you really think that the mathematical fact 10026 < 20040 < 40044 (from your example below) is some kind of illusion? If so, please take your claim to a metaphysics list. If not, please stop trolling. a little like saying Py2.7 uses relatively less memory than Py3.2 (UCS-2). This is inane, as 2.7 and 3.2 both use the same two coding schemes. Saying '1 < 2' is different from saying '2 < 2'. On 3.3+:

>>> sys.getsizeof('a' * 10000 + 'z')
10026
>>> sys.getsizeof('a' * 10000 + '€')
20040
>>> sys.getsizeof('a' * 10000 + '\U00010000')
40044

3.2 and earlier wide (UCS-4) builds use about 40050 bytes for all three unicode strings. Once again, you have posted examples that show how the FSR saves memory, thus negating your denial of the saving. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
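[Editor's note: Terry's description of the FSR as a selector among three coding schemes is easy to check empirically. Below is a minimal sketch; the `char_width` helper is my own name, not from the thread. It assumes CPython 3.3 or later: absolute `getsizeof` values vary by build, but the per-character deltas do not.]

```python
import sys

# PEP 393 stores each string with the narrowest unit that can hold its
# largest code point: 1, 2, or 4 bytes per character.

def char_width(ch):
    """Bytes per character the FSR chose for strings made of `ch`."""
    # Subtracting the sizes of two strings of the same kind cancels the
    # fixed per-object overhead, leaving the per-character cost.
    return (sys.getsizeof(ch * 2000) - sys.getsizeof(ch * 1000)) // 1000

print(char_width('a'))           # ASCII             -> 1
print(char_width('\xe9'))        # Latin-1 'é'       -> 1
print(char_width('\u20ac'))      # BMP '€'           -> 2
print(char_width('\U00010000'))  # astral code point -> 4
```

This is exactly the "coding scheme selector" behaviour: the width is chosen per string, from three fixed options.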
Re: Blog about python 3
On 07/01/2014 13:34, wxjmfa...@gmail.com wrote: On Sunday, 5 January 2014 23:14:07 UTC+1, Terry Reedy wrote: Ned: this has already been explained and illustrated. jmf This has never been explained and illustrated. Roughly 30 minutes ago Terry Reedy once again completely shot your argument about memory usage to pieces. You did not bother to respond to the comments from Tim Delaney made almost one day ago. Please give up. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sunday, 5 January 2014 23:14:07 UTC+1, Terry Reedy wrote: On 1/5/2014 9:23 AM, wxjmfa...@gmail.com wrote: On Saturday, 4 January 2014 23:46:49 UTC+1, Terry Reedy wrote: On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: And I could add, I *never* saw once one soul who explained what I'm doing wrong in the gazillion of examples I gave on this list. If this is true, it is because you have ignored and not read my numerous, relatively polite posts. To repeat very briefly: 1. Cherry picking (presenting the most extreme case as representative). 2. Calling space saving a problem (repeatedly). 3. Ignoring bug fixes. ... My examples are ONLY ILLUSTRATING that this FSR is wrong by design, whether on the side of memory, performance, linguistics or even typography. Let me expand on 3 of my points. First, performance == time: Point 3. You correctly identified a time regression in finding a character in a string. I saw that the slowdown was *not* inherent in the FSR but had to be a glitch in the code, and reported it on pydev with the hope that someone would fix it even if it were not too important in real use cases. Someone did. Point 1. You incorrectly generalized that extreme case. I reported (a year ago last September) that the overall stringbench results were about the same. I also pointed out that there is an equally non-representative extreme case in the opposite direction, and that it would equally be wrong of me to use that to claim that FSR is faster. (It turns out that this FSR speed advantage *is* inherent in the design.) Memory: Point 2. A *design goal* of FSR was to save memory relative to UTF-32, which is what you apparently prefer. Your examples show that the FSR successfully met its design goal. But you call that success, saving memory, 'wrong'. On what basis? You *claim* the FSR is 'wrong by design', but your examples only show that it was temporarily wrong in implementation as far as speed goes, and correct by design as far as memory goes. Point 3: You are right. 
I'm very happy to agree. Point 2: This Flexible String Representation does not effectuate any memory optimization. It only succeeds in doing the opposite of what a correct usage of utf* does. Ned: this has already been explained and illustrated. jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/7/2014 8:34 AM, wxjmfa...@gmail.com wrote: On Sunday, 5 January 2014 23:14:07 UTC+1, Terry Reedy wrote: Memory: Point 2. A *design goal* of FSR was to save memory relative to UTF-32, which is what you apparently prefer. Your examples show that the FSR successfully met its design goal. But you call that success, saving memory, 'wrong'. On what basis? Point 2: This Flexible String Representation does not effectuate any memory optimization. It only succeeds in doing the opposite of what a correct usage of utf* does. Since the FSR *was* successful in saving memory, and indeed shrank the Python binary by about a megabyte, I have no idea what you mean. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 8 January 2014 00:34, wxjmfa...@gmail.com wrote: Point 2: This Flexible String Representation does not effectuate any memory optimization. It only succeeds in doing the opposite of what a correct usage of utf* does. UTF-8 is a variable-width encoding that uses less memory to encode code points with lower numerical values, on a per-character basis e.g. if a code point is <= U+007F it will use a single byte to encode; if <= U+07FF two bytes will be used; ... up to a maximum of 6 bytes for the highest code points. FSR is a variable-width memory structure that uses the width of the code point with the highest numerical value in the string e.g. if all code points in the string are <= U+00FF a single byte will be used per character; if all code points are <= U+FFFF two bytes will be used per character; and in all other cases 4 bytes will be used per character. In terms of memory usage the difference is that UTF-8 varies its width per-character, whereas the FSR varies its width per-string. For any particular string, UTF-8 may well result in using less memory than the FSR, but in other (quite common) cases the FSR will use less memory than UTF-8 e.g. if the string only contains code points <= U+00FF, but some are between U+0080 and U+00FF (inclusive). In most cases the FSR uses the same or less memory than earlier versions of Python 3 and correctly handles all code points (just like UTF-8). In the cases where the FSR uses more memory than previously, the previous behaviour was incorrect. No matter which representation is used, there will be a certain amount of overhead (which is the majority of what most of your examples have shown). Here are examples which demonstrate cases where UTF-8 uses less memory, cases where the FSR uses less memory, and cases where they use the same amount of memory (accounting for the minimum amount of overhead required for each). 
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> fsr = u''
>>> utf8 = fsr.encode('utf-8')
>>> min_fsr_overhead = sys.getsizeof(fsr)
>>> min_utf8_overhead = sys.getsizeof(utf8)
>>> min_fsr_overhead
49
>>> min_utf8_overhead
33
>>> fsr = u'\u0001' * 1000
>>> utf8 = fsr.encode('utf-8')
>>> sys.getsizeof(fsr) - min_fsr_overhead
1000
>>> sys.getsizeof(utf8) - min_utf8_overhead
1000
>>> fsr = u'\u0081' * 1000
>>> utf8 = fsr.encode('utf-8')
>>> sys.getsizeof(fsr) - min_fsr_overhead
1024
>>> sys.getsizeof(utf8) - min_utf8_overhead
2000
>>> fsr = u'\u0001\u0081' * 1000
>>> utf8 = fsr.encode('utf-8')
>>> sys.getsizeof(fsr) - min_fsr_overhead
2024
>>> sys.getsizeof(utf8) - min_utf8_overhead
3000
>>> fsr = u'\u0101' * 1000
>>> utf8 = fsr.encode('utf-8')
>>> sys.getsizeof(fsr) - min_fsr_overhead
2025
>>> sys.getsizeof(utf8) - min_utf8_overhead
2000
>>> fsr = u'\u0101\u0081' * 1000
>>> utf8 = fsr.encode('utf-8')
>>> sys.getsizeof(fsr) - min_fsr_overhead
4025
>>> sys.getsizeof(utf8) - min_utf8_overhead
4000

Indexing a character in UTF-8 is O(N) - you have to traverse the string up to the character being indexed. Indexing a character in the FSR is O(1). In all cases the FSR has better performance characteristics for indexing and slicing than UTF-8. There are tradeoffs with both UTF-8 and the FSR. The Python developers decided the priorities for Unicode handling in Python were:

1. Correctness
   a. all code points must be handled correctly;
   b. it must not be possible to obtain part of a code point (e.g. the first byte only of a multi-byte code point);
2. No change in the big-O characteristics of string operations e.g. indexing must remain O(1);
3. Reduced memory use in most cases.

It is impossible for UTF-8 to meet both criteria 1b and 2 without additional auxiliary data (which uses more memory and increases the complexity of the implementation). The FSR meets all 3 criteria. Tim Delaney -- https://mail.python.org/mailman/listinfo/python-list
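[Editor's note: Tim's per-string versus per-character distinction can be reproduced on any current CPython. The sketch below is mine, not from the thread; the `payloads` helper cancels the fixed per-object overhead by subtraction, much like the `min_*_overhead` bookkeeping in the transcript above, and assumes a PEP 393 build (CPython 3.3+).]

```python
import sys

def payloads(s):
    """Return (FSR payload bytes, UTF-8 payload bytes) for string s."""
    # Doubling the string keeps the same internal kind, so subtracting
    # the two sizes leaves only the per-character storage cost.
    fsr_bytes = sys.getsizeof(s * 2) - sys.getsizeof(s)
    utf8_bytes = len(s.encode('utf-8'))
    return fsr_bytes, utf8_bytes

# U+0080..U+00FF: 1 byte/char under the FSR, but 2 under UTF-8.
print(payloads('\u0081' * 1000))   # (1000, 2000) -> FSR wins
# Above U+00FF the FSR moves to 2 bytes/char; UTF-8 also needs 2 here.
print(payloads('\u0101' * 1000))   # (2000, 2000) -> tie
# ASCII-range text: both use 1 byte/char.
print(payloads('\u0001' * 1000))   # (1000, 1000) -> tie
```

The per-string choice is also what preserves O(1) indexing: every character in a given string occupies the same number of bytes, so `s[i]` is a single array lookup.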
Re: Blog about python 3
On 1/7/2014 9:54 AM, Terry Reedy wrote: On 1/7/2014 8:34 AM, wxjmfa...@gmail.com wrote: On Sunday, 5 January 2014 23:14:07 UTC+1, Terry Reedy wrote: Memory: Point 2. A *design goal* of FSR was to save memory relative to UTF-32, which is what you apparently prefer. Your examples show that the FSR successfully met its design goal. But you call that success, saving memory, 'wrong'. On what basis? Point 2: This Flexible String Representation does not effectuate any memory optimization. It only succeeds in doing the opposite of what a correct usage of utf* does. Since the FSR *was* successful in saving memory, and indeed shrank the Python binary by about a megabyte, I have no idea what you mean. Tim Delaney apparently did, and answered on the basis of his understanding. Note that I said that the design goal was 'save memory RELATIVE TO UTF-32', not 'optimize memory'. UTF-8 was not considered an option. Nor was any form of arithmetic coding https://en.wikipedia.org/wiki/Arithmetic_coding to truly 'optimize memory'. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sat, Jan 4, 2014 at 6:27 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. It's just that sometimes you might compromise a little (or a lot) on what counts as correct in order for some speed. Is this statement even falsifiable? Can you conceive of a circumstance where someone has traded correctness for speed, but where one couldn't describe it that latter way? I can't. I think by definition you can always describe it that way, you just make what counts as correctness be what the customer wants given the resources available. The conventional definition, however, is what the customer wants, imagining that you have infinite resources. With just a little redefinition that seems reasonable, you can be made never to be wrong! I avoid making unfalsifiable arguments that aren't explicitly labeled as such. I try to reword them as, I prefer to look at it as ... -- it's less aggressive, which means people are more likely to really listen to what you have to say. It also doesn't pretend to be an argument when it isn't. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sunday, 5 January 2014 03:54:29 UTC+1, Chris Angelico wrote: On Sun, Jan 5, 2014 at 1:41 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: wxjmfa...@gmail.com wrote: The very interesting aspect is the way you are holding unicodes (strings). By comparing Python 2 with Python 3.3, you are comparing utf-8 with the internal representation of Python 3.3 (the flexible string representation). This is incorrect. Python 2 has never used UTF-8 internally for Unicode strings. In narrow builds, it uses UTF-16, but makes no allowance for surrogate pairs in strings. In wide builds, it uses UTF-32. That's for Python's unicode type. What Robin said was that they were using either a byte string (str) with UTF-8 data, or a Unicode string (unicode) with character data. So jmf was right, except that it's not specifically to do with Py2 vs Py3.3. Yes, the key point is the preparation of the unicode text for the PDF producer. It is at this level that the different flavours of Python may be relevant. I see four possibilities; I do not know what the PDF producer API is expecting. - Py2 with utf-8 byte strings (or possibly utf-16, utf-32) - Py2 with its internal unicode - Py3.2 with its internal unicode - Py3.3 with its internal unicode jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 31.12.2013 10:53, Steven D'Aprano wrote: Mark Lawrence wrote: http://blog.startifact.com/posts/alex-gaynor-on-python-3.html. I quote: ...perhaps a brave group of volunteers will stand up and fork Python 2, and take the incremental steps forward. This will have to remain just an idle suggestion, as I'm not volunteering myself. I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. I find it, hmmm, interesting, that so many of these Concerned People who say that they're worried about splitting the Python community[1] end up suggesting that we *split the community* into those who have moved forward to Python 3 and those who won't. Exactly. I don't know what exactly their problem is. I've pushed the migration of *large* projects at work to Python 3 when support was pretty early and it really wasn't a huge deal. Specifically because I love pretty much every single aspect that Python 3 introduced. The codec support is so good that I've never seen anything like it in any other programming language, and then there's the tons of beautiful changes (div/intdiv, functools.lru_cache, print(), datetime.timedelta.total_seconds(), int.bit_length(), bytes/bytearray). Regards, Joe -- "Where exactly did you predict the quake again? At least not publicly! Ah, the latest and to this day most ingenious trick of our great cosmologists: the secret prediction." - Karl Kaos on Rüdiger Thomas in dsa hidbv3$om2$1...@speranza.aioe.org -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Devin Jeanpierre wrote: On Sat, Jan 4, 2014 at 6:27 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. It's just that sometimes you might compromise a little (or a lot) on what counts as correct in order for some speed. Is this statement even falsifiable? Can you conceive of a circumstance where someone has traded correctness for speed, but where one couldn't describe it that latter way? I can't. Every time some programmer optimises a piece of code (or, more often, *thinks* they have optimised it) in a way which introduces bugs into the software, that's a case where somebody has traded correctness for speed and where my statement doesn't apply. Sometimes the response to the subsequent bug report is "will not fix", and a retroactive change in the software requirements. (Oh, did we say that indexing a string would return a character? We meant it would return a character, so long as the string contains no Unicode characters in the astral planes.) Sometimes it is to revert the optimisation or otherwise fix the bug. I accept that there is sometimes a fine line here. I'm assuming that software applications have their requirements fully documented, which in the real world is hardly ever the case. Although, even if the requirements aren't always written down, often they are implicitly understood. (Although it gets very interesting when the users' understanding and the developers' understanding differ.) 
Take as an example this torture test for a mathematical sum function, where the built-in sum() gets the wrong answer but math.fsum() gets it right:

py> from math import fsum
py> values = [1e12, 0.0001, -1e12, 0.0001]*10000
py> fsum(values)
2.0
py> sum(values)
2.4413841796875

Here's another example of the same thing, just to prove it's not a fluke:

py> values = [1e17, 1, 1, -1e17]
py> fsum(values)
2.0
py> sum(values)
0.0

The reason for the different results is that fsum() tries hard to account for intermediate rounding errors and sum() does not. If you benchmark the two functions, you'll find that sum() is significantly faster than fsum(). So the question to be asked is, does sum() promise to calculate floating point sums accurately? If so, then this is a bug, probably introduced by the desire for speed. But in fact, sum() does not promise to calculate floating point sums accurately. What it promises to do is to calculate the equivalent of a + b + c + ... for as many values as given, and that's exactly what it does. Conveniently, that's faster than fsum(), and usually accurate enough for most uses. Is sum() buggy? No, of course not. It does what it promises, it's just that what it promises falls short of "calculate floating point summations to high accuracy". Now, here's something which *would* be a bug, if sum() did it:

class MyInt(int):
    def __add__(self, other):
        return MyInt(super(MyInt, self).__add__(other))
    def __radd__(self, other):
        return MyInt(super(MyInt, self).__radd__(other))
    def __repr__(self):
        return "MyInt(%d)" % self

Adding a zero MyInt to an int gives a MyInt:

py> MyInt(0) + 23
MyInt(23)

so sum() should do the same thing. If it didn't, if it optimised away the actual addition because adding zero to a number can't change anything, it would be buggy. 
But in fact, sum() does the right thing:

py> sum([MyInt(0), 23])
MyInt(23)

I think by definition you can always describe it that way, you just make what counts as correctness be what the customer wants given the resources available. Not quite. Correct means "does what the customer wants". Or if there is no customer, it's "does what you say it will do". How do we tell when software is buggy? We compare what it actually does to the promised behaviour, or expected behaviour, and if there is a discrepancy, we call it a bug. We don't compare it to some ideal that cannot be met. A bug report that math.pi does not have an infinite number of decimal places would be closed as "Will Not Fix". Likewise, if your customer pays you to solve the Travelling Salesman Problem exactly, even if it takes a week to calculate, then nothing short of a program that solves the Travelling Salesman Problem exactly will satisfy their requirements. It's no good telling the customer that you can calculate a non-optimal answer twenty times faster if they want the actual optimal answer. (Of course, you may try to persuade them that they don't really need the optimal solution, or that they cannot afford it, or that you cannot deliver, and they need to compromise.) The conventional definition, however, is what the customer wants, imagining that you have infinite resources. I don't think the resources really come into it. At least, certainly not *infinite* resources. fsum() doesn't require infinite resources to calculate floating point summations to high accuracy. An even more accurate (but even slower) version would convert each float into a Fraction, then add the Fractions.
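[Editor's note: Steven's fsum()/Fraction point runs as-is on any Python 3; here is a self-contained sketch using only the standard library.]

```python
from math import fsum
from fractions import Fraction

values = [1e17, 1.0, 1.0, -1e17]

# Plain sum() is a left-to-right a + b + c + ...; near 1e17 the spacing
# between adjacent doubles is 16.0, so adding 1.0 is lost to rounding.
print(sum(values))    # 0.0
# fsum() tracks the intermediate rounding error and recovers the true sum.
print(fsum(values))   # 2.0
# The "even more accurate (but even slower) version": sum exact Fractions.
print(float(sum(Fraction(v) for v in values)))   # 2.0
```

Neither result is a bug: each function does exactly what it promises, which is Steven's point.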
Re: Blog about python 3
On Sun, Jan 5, 2014 at 11:28 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: - The Unix 'locate' command doesn't do a live search of the file system because that would be too slow, it uses a snapshot of the state of the file system. Is locate buggy because it tells you what files existed the last time the updatedb command ran, instead of what files exist right now? No, of course not. locate does exactly what it promises to do. Even more strongly: We say colloquially that Google, DuckDuckGo, etc, etc, are tools for searching the web. But they're not. They're tools for *indexing* the World Wide Web, and then searching that index. It's plausible to actually search your file system (and there are times when you want that), but completely implausible to search the (F or otherwise) web. We accept the delayed appearance of a page in the search results because we want immediate results, no waiting a month to find anything! So the difference between what's technically promised and what's colloquially described may be more than just concealing bugs - it may be the vital difference between uselessness and usefulness. And yet we like the handwave. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 31/12/2013 09:53, Steven D'Aprano wrote: Mark Lawrence wrote: http://blog.startifact.com/posts/alex-gaynor-on-python-3.html. I quote: ...perhaps a brave group of volunteers will stand up and fork Python 2, and take the incremental steps forward. This will have to remain just an idle suggestion, as I'm not volunteering myself. I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. Should the somebody else fork Python, in ten (ish) years time the Concerned People will be complaining that they can't port their code to Python 4 and will somebody else please produce version 2.9. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Johannes Bauer, 05.01.2014 13:14: I've pushed the migration of *large* projects at work to Python3 when support was pretty early and it really wasn't a huge deal. I think there are two sides to consider. Those who can switch their code base to Py3 and be happy (as you did, apparently), and those who cannot make the switch but have to keep supporting Py2 until 'everyone' else has switched, too. The latter is a bit more work generally and applies mostly to Python packages on PyPI, i.e. application dependencies. There are two ways to approach that problem. One is to try convincing people that Py3 has failed, let's stop migrating more code before I have to start migrating mine, and the other is to say let's finish the migration and get it done, so that we can finally drop Py2 support in our new releases and clean up our code again. As long as we stick in the middle and keep the status quo, we keep the worst of both worlds. And, IMHO, pushing loudly for a Py2.8 release provides a very good excuse for others to not finish their part of the migration, thus prolonging the maintenance burden for those who already did their share. Maybe a couple of major projects should start dropping their Py2 support, just to make their own life easier and to help others in taking their decision, too. (And that's me saying that, who maintains two major projects that still have legacy support for Py2.4 ...) Stefan -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Saturday, 4 January 2014 23:46:49 UTC+1, Terry Reedy wrote: On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: On Saturday, 4 January 2014 15:17:40 UTC+1, Chris Angelico wrote: any, and Python has only one, idiot like jmf who completely Chris, I appreciate the many contributions you make to this list, but that does not exempt you from our standard of conduct. misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... Troll baiting is a form of trolling. I think you are intelligent enough to know this. Please stop. I do not mind being considered an idiot, but I'm definitely not blind. And I could add, I *never* saw once one soul who explained what I'm doing wrong in the gazillion of examples I gave on this list. If this is true, it is because you have ignored and not read my numerous, relatively polite posts. To repeat very briefly: 1. Cherry picking (presenting the most extreme case as representative). 2. Calling space saving a problem (repeatedly). 3. Ignoring bug fixes. 4. Repetition (of the 'gazillion examples' without new content). Have you ever acknowledged, let alone thanked people for, the fix for the one bad regression you did find? The FSR is still a work in progress. Just today, Serhiy pushed a patch speeding up the UTF-32 encoder, after previously speeding up the UTF-32 decoder. -- My examples are ONLY ILLUSTRATING that this FSR is wrong by design, whether on the side of memory, performance, linguistics or even typography. I will not stop you from wasting your time adjusting bytes, if the problem is not on that side. jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/5/14 9:23 AM, wxjmfa...@gmail.com wrote: On Saturday, 4 January 2014 23:46:49 UTC+1, Terry Reedy wrote: On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: I do not mind being considered an idiot, but I'm definitely not blind. And I could add, I *never* saw once one soul who explained what I'm doing wrong in the gazillion of examples I gave on this list. If this is true, it is because you have ignored and not read my numerous, relatively polite posts. To repeat very briefly: 1. Cherry picking (presenting the most extreme case as representative). 2. Calling space saving a problem (repeatedly). 3. Ignoring bug fixes. 4. Repetition (of the 'gazillion examples' without new content). Have you ever acknowledged, let alone thanked people for, the fix for the one bad regression you did find? The FSR is still a work in progress. Just today, Serhiy pushed a patch speeding up the UTF-32 encoder, after previously speeding up the UTF-32 decoder. -- My examples are ONLY ILLUSTRATING that this FSR is wrong by design, whether on the side of memory, performance, linguistics or even typography. JMF: this has been pointed out to you time and again: the flexible string representation is not wrong. To show that it is wrong, you would have to demonstrate some semantic of Unicode that is violated. You have never done this. You've picked pathological cases and shown micro-timing output, and memory usage. The Unicode standard doesn't promise anything about timing or memory use. The FSR makes a trade-off of time and space. Everyone but you considers it a good trade-off. I don't think you are showing real use cases, but if they are, I'm sorry that your use-case suffers. That doesn't make the FSR wrong. The most accurate statement is that you don't like the FSR. That's fine, you're entitled to your opinion. You say the FSR is wrong linguistically. 
This can't be true, since an FSR Unicode string is indistinguishable from an internally-UTF-32 Unicode string, and no, memory use or timings are irrelevant when discussing the linguistic performance of a Unicode string. You've also said that the internal representation of the FSR is incorrect because of encodings somehow. Encodings have nothing to do with the internal representation of a Unicode string, they are for interchanging data. You seem to know a lot about Unicode, but when you make this fundamental mistake, you call all of your expertise into question. To re-iterate what you are doing wrong: 1) You continue to claim things that are not true, and that you have never substantiated. 2) You paste code samples without accompanying text that explain what you are trying to demonstrate. 3) You ignore refutations that disprove your points. These are all the behaviors of a troll. Please stop. If you want to discuss the details of Unicode implementations, I'd welcome an offlist discussion, but only if you will approach it honestly enough to leave open the possibility that you are wrong. I know I would be glad to learn details of Unicode that I have missed, but so far you haven't provided any. --Ned. I will not refrain you to waste your time in adjusting bytes, if the problem is not on that side. jmf -- Ned Batchelder, http://nedbatchelder.com -- https://mail.python.org/mailman/listinfo/python-list
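[Editor's note: Ned's claim that an FSR string is indistinguishable from a UTF-32 string at the level of Unicode semantics is directly testable. A small sketch of mine, for CPython 3.3+; on a 3.2 narrow build, by contrast, the astral character below would count as two surrogate code units.]

```python
# A string mixing code points that need 1, 2 and 4 bytes respectively.
s = 'a\u20ac\U00010000'

print(len(s))                 # 3 -- one per code point, no surrogate pairs
print(s[2] == '\U00010000')   # True -- indexing returns whole code points
print(s[::-1])                # reversal can never split a code point
# Round-tripping through UTF-32 shows the observable semantics match
# those of a UTF-32 representation exactly.
print(s == s.encode('utf-32-le').decode('utf-32-le'))   # True
```

Whatever width the FSR chose internally, every operation works on whole code points, which is the correctness criterion (1a/1b) from Tim Delaney's list.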
Re: Blog about python 3
In article 52c94fec$0$29973$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: How do we tell when software is buggy? We compare what it actually does to the promised behaviour, or expected behaviour, and if there is a discrepancy, we call it a bug. We don't compare it to some ideal that cannot be met. A bug report that math.pi does not have infinite number of decimal places would be closed as Will Not Fix. That's because it is inherently impossible to fix that. But lots of bug reports legitimately get closed with Will Not Fix simply because the added value from fixing it doesn't justify the cost (whether in terms of development effort, or run-time resource consumption). Go back to the package sorting example I gave. If the sorting software mis-reads the address and sends my package to Newark instead of New York by mistake, that's clearly a bug. Presumably, it's an error which could be eliminated (or, at least, the rate of occurrence reduced) by using a more sophisticated OCR algorithm. But, if those algorithms take longer to run, the overall expected value of implementing the bug fix software may well be negative. In the real world, nobody cares if software is buggy. They care that it provides value. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
In article mailman.4930.1388908293.18130.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sun, Jan 5, 2014 at 2:20 PM, Roy Smith r...@panix.com wrote: I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. That's a valid line of argument in big business, these days, because we've been conditioned to accept low quality. But there are places where quality trumps all, and we're happy to pay for that. Allow me to expound two examples. 1) Amazon http://www.amazon.com/exec/obidos/ASIN/1782010165/evertype-20 I bought this book a while ago. It's about the size of a typical paperback. It arrived in a box too large for it on every dimension, with absolutely no packaging. I complained. Clearly their algorithm was: Most stuff will get there in good enough shape, so people can't be bothered complaining. And when they do complain, it's cheaper to ship them another for free than to debate with them on chat. You're missing my point. Amazon's (short-term) goal is to increase their market share by undercutting everybody on price. They have implemented a box-packing algorithm which clearly has a bug in it. You are complaining that they failed to deliver your purchase in good condition, and apparently don't care. You're right, they don't. The cost to them to manually correct this situation exceeds the value. This is one shipment. It doesn't matter. You are one customer, you don't matter either. Seriously. This may be annoying to you, but it's good business for Amazon. For them, fast and cheap is absolutely better than correct. I'm not saying this is always the case. Clearly, there are companies which have been very successful at producing a premium product (Apple, for example). I'm not saying that fast is always better than correct. I'm just saying that correct is not always better than fast. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Mon, Jan 6, 2014 at 3:34 AM, Roy Smith r...@panix.com wrote: Amazon's (short-term) goal is to increase their market share by undercutting everybody on price. They have implemented a box-packing algorithm which clearly has a bug in it. You are complaining that they failed to deliver your purchase in good condition, and apparently don't care. You're right, they don't. The cost to them to manually correct this situation exceeds the value. This is one shipment. It doesn't matter. If it stopped there, it would be mildly annoying (1% of our shipments will need to be replaced, that's a 1% cost for free replacements). The trouble is that they don't care about the replacement either, so it's really that 100% (or some fairly large proportion) of their shipments will arrive with some measure of damage, and they're hoping that their customers' threshold for complaining is often higher than the damage sustained. Which it probably is, a lot of the time. You are one customer, you don't matter either. Seriously. This may be annoying to you, but it's good business for Amazon. For them, fast and cheap is absolutely better than correct. But this is the real problem, business-wise. Can you really run a business by not caring about your customers? (I also think it's pretty disappointing that a business like Amazon can't just toss in some bubbles, or packing peanuts (what we call trucks for hysterical raisins), or something. It's not that hard to have a machine just blow in some sealed air before the box gets closed... surely?) Do they have that much of a monopoly, or that solid a customer base, that they're happy to leave *everyone* dissatisfied? We're not talking about 1% here. From the way the cust svc guy was talking, I get the impression that they do this with all parcels. And yet I can't disagree with your final conclusion. 
Empirical evidence goes against my incredulous declaration that surely this is a bad idea - according to XKCD 1165, they're kicking out nearly a cubic meter a *SECOND* of packages. That's fairly good evidence that they're doing something that, whether it be right or wrong, does fit with the world's economy. Sigh. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Chris Angelico ros...@gmail.com wrote: Can you really run a business by not caring about your customers? http://snltranscripts.jt.org/76/76aphonecompany.phtml -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/5/2014 9:23 AM, wxjmfa...@gmail.com wrote: On Saturday, January 4, 2014 23:46:49 UTC+1, Terry Reedy wrote: On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: And I could add, I *never* saw once one soul, who is explaining what I'm doing wrong in the gazillion of examples I gave on this list. If this is true, it is because you have ignored and not read my numerous, relatively polite posts. To repeat very briefly: 1. Cherry picking (presenting the most extreme case as representative). 2. Calling space saving a problem (repeatedly). 3. Ignoring bug fixes. ... My examples are ONLY ILLUSTRATING, this FSR is wrong by design, can be on the side of memory, performance, linguistic or even typography. Let me expand on 3 of my points. First, performance == time: Point 3. You correctly identified a time regression in finding a character in a string. I saw that the slowdown was *not* inherent in the FSR but had to be a glitch in the code, and reported it on pydev with the hope that someone would fix it even if it were not too important in real use cases. Someone did. Point 1. You incorrectly generalized that extreme case. I reported (a year ago last September) that the overall stringbench results were about the same. I also pointed out that there is an equally non-representative extreme case in the opposite direction, and that it would equally be wrong of me to use that to claim that FSR is faster. (It turns out that this FSR speed advantage *is* inherent in the design.) Memory: Point 2. A *design goal* of FSR was to save memory relative to UTF-32, which is what you apparently prefer. Your examples show that FSR successfully met its design goal. But you call that success, saving memory, 'wrong'. On what basis? You *claim* the FSR is 'wrong by design', but your examples only show that it was temporarily wrong in implementation as far as speed and correct by design as far as memory goes. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/5/2014 9:23 AM, wxjmfa...@gmail.com wrote: My examples are ONLY ILLUSTRATING, this FSR is wrong by design, Let me answer you a different way. If FSR is 'wrong by design', so are the alternatives. Hence, the claim is, in itself, useless as a guide to choosing. The choices: * Keep the previous complicated system of buggy narrow builds on some systems and space-wasting wide builds on other systems, with Python code potentially acting differently on the different builds. I am sure that you agree that this is a bad design. * Improved the dual-build system by de-bugging narrow builds. I proposed to do this (and gave Python code proving the idea) by adding the complication of an auxiliary array of indexes of astral chars in a UTF-16 string. I suspect you would call this design 'wrong' also. * Use the memory-wasting UTF-32 (wide) build on all systems. I know you do not consider this 'wrong', but come on. From an information theoretic and coding viewpoint, it clearly is. The top (4th) byte is *never* used. The 3rd byte is *almost never* used. The 2nd byte usage ranges from common to almost never for different users. Memory waste is also time waste, as moving information-free 0 bytes takes the same time as moving informative bytes. Here is the beginning of the rationale for the FSR (from http://www.python.org/dev/peps/pep-0393/ -- have you ever read it?). There are two classes of complaints about the current implementation of the unicode type: on systems only supporting UTF-16, users complain that non-BMP characters are not properly supported. On systems using UCS-4 internally (and also sometimes on systems using UCS-2), there is a complaint that Unicode strings take up too much memory - especially compared to Python 2.x, where the same code would often use ASCII strings The memory waste was a reason to stick with 2.7. It could break code that worked in 2.x. By removing the waste, the FSR makes switching to Python 3 more feasible for some people. 
It was a response to real problems encountered by real people using Python. It fixed both classes of complaint about the previous system. * Switch to the time-wasting UTF-8 for text storage, as some have done. This is different from using UTF-8 for text transmission, which I hope becomes the norm soon. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
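The three coding schemes Terry describes can be observed directly in any CPython 3.3+ interpreter. A sketch (the exact per-object header overheads below are platform- and version-dependent; only the relative growth matters): the FSR stores each string at the narrowest width that can hold its widest code point, so otherwise identical strings cost roughly 1, 2, or 4 bytes per character.

```python
import sys

# FSR picks 1 byte/char (Latin-1), 2 bytes/char (UCS-2), or
# 4 bytes/char (UCS-4) based on the widest code point present.
n = 10000
ascii_s  = 'a' * n                        # all code points < 128
bmp_s    = 'a' * (n - 1) + '\u20ac'       # one euro sign: BMP, > 255
astral_s = 'a' * (n - 1) + '\U0001F600'   # one astral (non-BMP) char

sizes = [sys.getsizeof(s) for s in (ascii_s, bmp_s, astral_s)]
# sizes grow roughly as n, 2n, 4n bytes plus a small fixed header.
```

A UTF-32 (wide) build would have paid the 4-bytes-per-character price for all three strings, which is exactly the saving the PEP is about.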
Re: Blog about python 3
On 1/5/2014 11:51 AM, Chris Angelico wrote: On Mon, Jan 6, 2014 at 3:34 AM, Roy Smith r...@panix.com wrote: Amazon's (short-term) goal is to increase their market share by undercutting everybody on price. They have implemented a box-packing algorithm which clearly has a bug in it. You are complaining that they failed to deliver your purchase in good condition, and apparently don't care. You're right, they don't. The cost to them to manually correct this situation exceeds the value. This is one shipment. It doesn't matter. If it stopped there, it would be mildly annoying (1% of our shipments will need to be replaced, that's a 1% cost for free replacements). The trouble is that they don't care about the replacement either, so it's really that 100% (or some fairly large proportion) of their shipments will arrive with some measure of damage, and they're hoping that their customers' threshold for complaining is often higher than the damage sustained. Which it probably is, a lot of the time. My wife has gotten several books from Amazon and partners and we have never gotten one loose enough in a big enough box to be damaged. Either the box is tight or has bubble packing. Leaving aside partners, maybe distribution centers have different rules. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Mon, Jan 6, 2014 at 9:56 AM, Terry Reedy tjre...@udel.edu wrote: On 1/5/2014 11:51 AM, Chris Angelico wrote: On Mon, Jan 6, 2014 at 3:34 AM, Roy Smith r...@panix.com wrote: Amazon's (short-term) goal is to increase their market share by undercutting everybody on price. They have implemented a box-packing algorithm which clearly has a bug in it. You are complaining that they failed to deliver your purchase in good condition, and apparently don't care. You're right, they don't. The cost to them to manually correct this situation exceeds the value. This is one shipment. It doesn't matter. If it stopped there, it would be mildly annoying (1% of our shipments will need to be replaced, that's a 1% cost for free replacements). The trouble is that they don't care about the replacement either, so it's really that 100% (or some fairly large proportion) of their shipments will arrive with some measure of damage, and they're hoping that their customers' threshold for complaining is often higher than the damage sustained. Which it probably is, a lot of the time. My wife has gotten several books from Amazon and partners and we have never gotten one loose enough in a big enough box to be damaged. Either the box is tight or has bubble packing. Leaving aside partners, maybe distribution centers have different rules. Or possibly (my personal theory) the CS rep I was talking to just couldn't be bothered solving the problem. Way way too much work to make the customer happy, much easier and cheaper to give a 30% refund and hope that shuts him up. But they managed to ship two books (the original and the replacement) with insufficient packaging. Firstly, that requires the square of the probability of failure; and secondly, if you care even a little bit about making your customers happy, put a little note on the second order instructing people to be particularly careful of this one! Get someone to check it before it's sent out. Make sure it's right this time. 
I know that's what we used to do in the family business whenever anything got mucked up. (BTW, I had separately confirmed that the problem was with Amazon, and not - as has happened to me with other shipments - caused by Australian customs officials opening the box, looking through it, and then packing it back in without its protection. No, it was shipped that way.) Anyway, this is veering so far off topic that we're at no risk of meeting any Python Alliance ships - as Mal said, we're at the corner of No and Where. But maybe someone can find an on-topic analogy to put some tentative link back into this thread... ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Chris Angelico wrote about Amazon: And yet I can't disagree with your final conclusion. Empirical evidence goes against my incredulous declaration that surely this is a bad idea - according to XKCD 1165, they're kicking out nearly a cubic meter a SECOND of packages. Yes, but judging by what you described as their packing algorithm that's probably only a tenth of a cubic metre of *books*, the rest being empty box for the book to rattle around in and get damaged. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Roy Smith wrote: In article mailman.4930.1388908293.18130.python-l...@python.org, Chris Angelico ros...@gmail.com wrote: On Sun, Jan 5, 2014 at 2:20 PM, Roy Smith r...@panix.com wrote: I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. That's a valid line of argument in big business, these days, because we've been conditioned to accept low quality. But there are places where quality trumps all, and we're happy to pay for that. Allow me to expound two examples. 1) Amazon http://www.amazon.com/exec/obidos/ASIN/1782010165/evertype-20 I bought this book a while ago. It's about the size of a typical paperback. It arrived in a box too large for it on every dimension, with absolutely no packaging. I complained. Clearly their algorithm was: Most stuff will get there in good enough shape, so people can't be bothered complaining. And when they do complain, it's cheaper to ship them another for free than to debate with them on chat. You're missing my point. Amazon's (short-term) goal is to increase their market share by undercutting everybody on price. They have implemented a box-packing algorithm which clearly has a bug in it. You are complaining that they failed to deliver your purchase in good condition, and apparently don't care. You're right, they don't. The cost to them to manually correct this situation exceeds the value. This is one shipment. It doesn't matter. You are one customer, you don't matter either. Seriously. This may be annoying to you, but it's good business for Amazon. For them, fast and cheap is absolutely better than correct. One, you're missing my point that to Amazon, fast and cheap *is* correct. They would not agree with you that their box-packing algorithm is buggy, so long as their customers don't punish them for it. 
It meets their requirements: ship parcels as quickly as possible, and push as many of the costs (damaged books) onto the customer as they can get away with. If they thought it was buggy, they would be trying to fix it. Two, nobody is arguing against the concept that different parties have different concepts of what's correct. To JMF, the flexible string representation is buggy, because he's detected a trivially small slowdown in some artificial benchmarks. To everyone else, it is not buggy, because it does what it sets out to do: save memory while still complying with the Unicode standard. A small slowdown on certain operations is a cost worth paying. Normally, the definition of correct that matters is that belonging to the paying customer, or failing that, the programmer who is giving his labour away for free. (Extend this out to more stakeholders if you wish, but the more stakeholders you include, the harder it is to get consensus on what's correct and what isn't.) From the perspective of Amazon's customers, presumably so long as the cost of damaged and lost books isn't too high, they too are willing to accept Amazon's definition of correct in order to get cheap books, or else they would buy from someone else. (However, to the extent that Amazon has gained monopoly power over the book market, that reasoning may not apply. Amazon is not *technically* a monopoly, but they are clearly well on the way to becoming one, at which point the customer has no effective choice and the market is no longer free.) The Amazon example is an interesting example of market failure, in the sense that the free market provides a *suboptimal solution* to a problem. We'd all like reasonably-priced books AND reliable delivery, but maybe we can't have both. Personally, I'm not so sure about that. Maybe Jeff Bezos could make do with only five solid gold Mercedes instead of ten[1], for the sake of improved delivery? But apparently not. But I digress... 
ultimately, you are trying to argue that there is a single absolute source of truth for what counts as correct. I don't believe there is. We can agree that some things are clearly not correct -- Amazon takes your money and sets the book on fire, or hires an armed military escort costing $20 million a day to deliver your book of funny cat pictures. We might even agree on what we'd all like in a perfect world: cheap books, reliable delivery, and a pony. But in practice we have to choose some features over others, and compromise on requirements, and ultimately we have to make a *pragmatic* choice on what counts as correct based on the functional requirements, not on a wishlist of things we'd like with infinite time and money. Sticking to the Amazon example, what percentage of books damaged in delivery ceases to be a bug in the packing algorithm and becomes just one of those things? One in ten? One in ten thousand? One in a hundred billion billion? I do not accept that book gets damaged in transit counts as a bug. More than x% of books get damaged, that's a bug. Average cost to
Re: Blog about python 3
On Mon, Jan 6, 2014 at 12:23 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: (However, to the extent that Amazon has gained monopoly power over the book market, that reasoning may not apply. Amazon is not *technically* a monopoly, but they are clearly well on the way to becoming one, at which point the customer has no effective choice and the market is no longer free.) They don't need a monopoly on the whole book market, just on specific books - which they did have, in the cited case. I actually asked the author (translator, really - it's a translation of Alice in Wonderland) how he would prefer me to buy, as there are some who sell on Amazon and somewhere else. There was no alternative to Amazon, ergo no choice and the market was not free. Like so many things, one choice (I want to buy Ailice's Anters in Ferlielann) mandates another (Must buy through Amazon). I don't know what it cost Amazon to ship me two copies of a book, but still probably less than they got out of me, so they're still ahead. Even if they lost money on this particular deal, they're still way ahead because of all the people who decide it's not worth their time to spend an hour or so trying to get a replacement. So yep, this policy is serving Amazon fairly well. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 06/01/2014 01:54, Chris Angelico wrote: On Mon, Jan 6, 2014 at 12:23 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: (However, to the extent that Amazon has gained monopoly power over the book market, that reasoning may not apply. Amazon is not *technically* a monopoly, but they are clearly well on the way to becoming one, at which point the customer has no effective choice and the market is no longer free.) They don't need a monopoly on the whole book market, just on specific books - which they did have, in the cited case. I actually asked the author (translator, really - it's a translation of Alice in Wonderland) how he would prefer me to buy, as there are some who sell on Amazon and somewhere else. There was no alternative to Amazon, ergo no choice and the market was not free. Like so many things, one choice (I want to buy Ailice's Anters in Ferlielann) mandates another (Must buy through Amazon). I don't know what it cost Amazon to ship me two copies of a book, but still probably less than they got out of me, so they're still ahead. Even if they lost money on this particular deal, they're still way ahead because of all the people who decide it's not worth their time to spend an hour or so trying to get a replacement. So yep, this policy is serving Amazon fairly well. ChrisA So much for my You never know, we might even end up with a thread whereby the discussion is Python, the whole Python and nothing but the Python. :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Friday, January 3, 2014 12:14:41 UTC+1, Robin Becker wrote: On 02/01/2014 18:37, Terry Reedy wrote: On 1/2/2014 12:36 PM, Robin Becker wrote: I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. I imagine that this was not fun. indeed :) For fairly sensible reasons we changed the internal default to use unicode rather than bytes. Do you mean 'from __future__ import unicode_literals'? No, previously we had default of utf8 encoded strings in the lower levels of the code and we accepted either unicode or utf8 string literals as inputs to text functions. As part of the port process we made the decision to change from default utf8 str (bytes) to default unicode. Am I correct in thinking that this change increases the capabilities of reportlab? For instance, easily producing an article with abstracts in English, Arabic, Russian, and Chinese? It's made no real difference to what we are able to produce or accept since utf8 or unicode can encode anything in the input and what can be produced depends on fonts mainly. After doing all that and making the tests ... I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that, can it, or is python 2.7's unicode faster? The new unicode implementation in 3.3 is faster for some operations and slower for others. It is definitely more space efficient, especially compared to a wide build system. It is definitely less buggy, especially compared to a narrow build system. Do your tests use any astral (non-BMP) chars? If so, do they pass on narrow 2.7 builds (like on Windows)? I'm not sure if we have any non-bmp characters in the tests. Simple CJK etc etc for the most part. 
I'm fairly certain we don't have any ability to handle composed glyphs (multi-codepoint) etc etc For one thing, indexing and slicing just works on all machines for all unicode strings. Code for 2.7 and 3.3 either a) does not index or slice, b) does not work for all text on 2.7 narrow builds, or c) has extra conditional code only for 2.7. To Robin Becker I know nothing about ReportLab except its existence. Your story is very interesting. As I pointed out, I know nothing about the internals of ReportLab, the technical aspects (the Python part, the API used for the PDF creation). I have however some experience with the unicode TeX engine, XeTeX, so I understand a little bit of what's happening behind the scenes. The very interesting aspect is the way you are holding unicodes (strings). By comparing Python 2 with Python 3.3, you are comparing utf-8 with the internal representation of Python 3.3 (the flexible string representation). In one sense, more than comparing Py2 with Py3. It would be much more interesting to compare utf-8/Python internals in the light of Python 3.2 and Python 3.3. Python 3.2 has a decent unicode handling, Python 3.3 has an absurd (in the mathematical sense) unicode handling. This is really shining with utf-8, where this flexible string representation is just doing the opposite of what a correct unicode implementation does! On the memory side, it is obvious to see it. sys.getsizeof('a'*10000 + 'z') 10026 sys.getsizeof('a'*10000 + '€') 20040 sys.getsizeof(('a'*10000 + 'z').encode('utf-8')) 10018 sys.getsizeof(('a'*10000 + '€').encode('utf-8')) 10020 On the performance side, it is much more complex, but qualitatively, you may expect the same results. The funny aspect is that by working with utf-8 in that case, you are (or one is) forcing Python to work properly, but one pays on the side of performance. And if one wishes to save memory, one has to pay on the side of performance. 
In other words, attempting to do what Python is not able to do natively is just impossible! I'm skipping the very interesting composed glyphs subject (unicode normalization, ...), but I wish to point out that with the flexible string representation, one reaches the top level of surrealism. For a tool which is supposed to handle these very specific unicode tasks... jmf -- https://mail.python.org/mailman/listinfo/python-list
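The effect jmf is measuring can be reproduced either way; the sizes below are platform-dependent, so this sketch asserts only the relative growth. Appending a single non-Latin-1 character widens the whole `str` object under the FSR, while the UTF-8 encoding of the same text grows by only a couple of bytes:

```python
import sys

n = 10000
s_ascii = 'a' * n + 'z'
s_euro  = 'a' * n + '\u20ac'   # same text, one BMP char at the end

# The FSR must store every character of s_euro at 2 bytes.
str_growth = sys.getsizeof(s_euro) - sys.getsizeof(s_ascii)

# UTF-8 spends extra bytes only on the euro sign itself.
utf8_growth = (sys.getsizeof(s_euro.encode('utf-8'))
               - sys.getsizeof(s_ascii.encode('utf-8')))

# str_growth is roughly n bytes; utf8_growth is about 2 bytes.
```

The counterpoint from the rest of the thread: the comparison that motivated PEP 393 was not FSR versus UTF-8 but FSR versus the old UCS-4 wide build, where the all-ASCII string above would have cost four bytes per character regardless.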
Re: Blog about python 3
In article mailman.4882.1388808283.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. Sometimes the other way around. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sun, Jan 5, 2014 at 12:55 AM, Roy Smith r...@panix.com wrote: In article mailman.4882.1388808283.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. Sometimes the other way around. More usually, it's sometimes better to be really fast and mostly correct than really really slow and entirely correct. That's why we use IEEE floating point instead of Decimal most of the time. Though I'm glad that Python 3 now deems the default int type to be capable of representing arbitrary integers (instead of dropping out to a separate long type as Py2 did), I think it's possibly worth optimizing small integers to machine words - but mainly, the int type focuses on correctness above performance, because the cost is low compared to the benefit. With float, the cost of arbitrary precision is extremely high, and the benefit much lower. With Unicode, the cost of perfect support is normally seen to be a doubling of internal memory usage (UTF-16 vs UCS-4). Pike and Python decided that the cost could, instead, be a tiny measure of complexity and actually *less* memory usage (compared to UTF-16, when lots of identifiers are ASCII). It's a system that works only when strings are immutable, but works beautifully there. Fortunately Pike doesn't have any, and Python has only one, idiot like jmf who completely misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... actually I'm not even sure what, sometimes. I wouldn't dare try to read his posts except that my mind's already in a rather broken state, as a combination of programming and Alice in Wonderland. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
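Chris's int-versus-float contrast is easy to demonstrate: Python 3's `int` stays exact at any magnitude, while an IEEE double silently drops precision once values exceed 2**53 (a quick sketch):

```python
# Python 3 int: arbitrary precision, so arithmetic is always exact.
big = 2 ** 100
exact = big + 1 - big          # exactly 1, no overflow, no rounding

# IEEE double: fast hardware arithmetic, but 2**100 + 1 is not
# representable, so the +1 is rounded away entirely.
bigf = float(2 ** 100)
lossy = bigf + 1.0 - bigf      # 0.0
```

This is the trade-off he describes: for integers the cost of correctness is small, so Python pays it; for floats the cost of exactness would be enormous, so Python (like nearly everyone) accepts "fast and mostly correct".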
Re: Blog about python 3
On 1/4/14 9:17 AM, Chris Angelico wrote: On Sun, Jan 5, 2014 at 12:55 AM, Roy Smith r...@panix.com wrote: In article mailman.4882.1388808283.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. Sometimes the other way around. More usually, it's sometimes better to be really fast and mostly correct than really really slow and entirely correct. That's why we use IEEE floating point instead of Decimal most of the time. Though I'm glad that Python 3 now deems the default int type to be capable of representing arbitrary integers (instead of dropping out to a separate long type as Py2 did), I think it's possibly worth optimizing small integers to machine words - but mainly, the int type focuses on correctness above performance, because the cost is low compared to the benefit. With float, the cost of arbitrary precision is extremely high, and the benefit much lower. With Unicode, the cost of perfect support is normally seen to be a doubling of internal memory usage (UTF-16 vs UCS-4). Pike and Python decided that the cost could, instead, be a tiny measure of complexity and actually *less* memory usage (compared to UTF-16, when lots of identifiers are ASCII). It's a system that works only when strings are immutable, but works beautifully there. Fortunately Pike doesn't have any, and Python has only one, idiot like jmf who completely misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... actually I'm not even sure what, sometimes. I wouldn't dare try to read his posts except that my mind's already in a rather broken state, as a combination of programming and Alice in Wonderland. ChrisA I really wish we could discuss these things without baiting trolls. 
-- Ned Batchelder, http://nedbatchelder.com -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Saturday, January 4, 2014 15:17:40 UTC+1, Chris Angelico wrote: On Sun, Jan 5, 2014 at 12:55 AM, Roy Smith r...@panix.com wrote: In article mailman.4882.1388808283.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. Sometimes the other way around. More usually, it's sometimes better to be really fast and mostly correct than really really slow and entirely correct. That's why we use IEEE floating point instead of Decimal most of the time. Though I'm glad that Python 3 now deems the default int type to be capable of representing arbitrary integers (instead of dropping out to a separate long type as Py2 did), I think it's possibly worth optimizing small integers to machine words - but mainly, the int type focuses on correctness above performance, because the cost is low compared to the benefit. With float, the cost of arbitrary precision is extremely high, and the benefit much lower. With Unicode, the cost of perfect support is normally seen to be a doubling of internal memory usage (UTF-16 vs UCS-4). Pike and Python decided that the cost could, instead, be a tiny measure of complexity and actually *less* memory usage (compared to UTF-16, when lots of identifiers are ASCII). It's a system that works only when strings are immutable, but works beautifully there. Fortunately Pike doesn't have any, and Python has only one, idiot like jmf who completely misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... actually I'm not even sure what, sometimes. I wouldn't dare try to read his posts except that my mind's already in a rather broken state, as a combination of programming and Alice in Wonderland. 
I do not mind to be considered as an idiot, but I'm definitively not blind. And I could add, I *never* saw once one soul, who is explaining what I'm doing wrong in the gazillion of examples I gave on this list. --- Back to ReportLab. Technically I would be really interested to see what could happen at the light of my previous post. jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: On Saturday, January 4, 2014 15:17:40 UTC+1, Chris Angelico wrote: any, and Python has only one, idiot like jmf who completely Chris, I appreciate the many contributions you make to this list, but that does not exempt you from our standard of conduct. misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... Troll baiting is a form of trolling. I think you are intelligent enough to know this. Please stop. I do not mind to be considered as an idiot, but I'm definitively not blind. And I could add, I *never* saw once one soul, who is explaining what I'm doing wrong in the gazillion of examples I gave on this list. If this is true, it is because you have ignored and not read my numerous, relatively polite posts. To repeat very briefly: 1. Cherry picking (presenting the most extreme case as representative). 2. Calling space saving a problem (repeatedly). 3. Ignoring bug fixes. 4. Repetition (of the 'gazillion example' without new content). Have you ever acknowledged, let alone thanked people for, the fix for the one bad regression you did find? The FSR is still a work in progress. Just today, Serhiy pushed a patch speeding up the UTF-32 encoder, after previously speeding up the UTF-32 decoder. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sun, Jan 5, 2014 at 9:46 AM, Terry Reedy tjre...@udel.edu wrote: On 1/4/2014 2:10 PM, wxjmfa...@gmail.com wrote: Le samedi 4 janvier 2014 15:17:40 UTC+1, Chris Angelico a écrit : any, and Python has only one, idiot like jmf who completely Chris, I appreciate the many contributions you make to this list, but that does not exempt you from our standard of conduct. misunderstands what's going on and uses microbenchmarks to prove obscure points... and then uses nonsense to try to prove... uhh... Troll baiting is a form of trolling. I think you are intelligent enough to know this. Please stop. My apologies. I withdraw the aforequoted post. You and Ned are correct, those comments were inappropriate. Sorry. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Roy Smith wrote: In article mailman.4882.1388808283.18130.python-l...@python.org, Mark Lawrence breamore...@yahoo.co.uk wrote: Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. I know somebody who was once touring in the States, and ended up travelling cross-country by road with the roadies rather than flying. She tells me of the time someone pointed out that they were travelling in the wrong direction, away from their destination. The roadie driving replied, "Who cares? We're making fantastic time!" (Ah, the seventies. So many drugs...) Fast is never more important than correct. It's just that sometimes you might compromise a little (or a lot) on what counts as correct in order to gain some speed. To give an example, say you want to solve the Travelling Salesman Problem, and find the shortest path through a whole lot of cities A, B, C, ..., Z. That's a Hard Problem, expensive to solve correctly. But if you loosen the requirements so that a correct solution no longer has to be the absolutely shortest path, and instead accept solutions which are nearly always close to the shortest (but without any guarantee of how close), then you can make the problem considerably easier to solve. But regardless of how fast your path-finder algorithm might become, you're unlikely to be satisfied with a solution that travels around in a circle from A to B a million times then shoots off straight to Z without passing through any of the other cities. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
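Steven's trade-off can be made concrete: an exact TSP solver checks every tour, while a greedy nearest-neighbour pass is fast but only near-optimal. An illustrative sketch (the city coordinates are invented for the example):

```python
import itertools
import math

cities = {'A': (0, 0), 'B': (1, 5), 'C': (4, 1), 'D': (6, 4), 'E': (3, 7)}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def tour_length(order):
    # Closed tour: return to the starting city at the end.
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))

def exact(start='A'):
    # O(n!) brute force: the "correct but slow" option, only feasible
    # for a handful of cities.
    rest = [c for c in cities if c != start]
    return min(([start] + list(p) for p in itertools.permutations(rest)),
               key=tour_length)

def nearest_neighbour(start='A'):
    # Greedy heuristic: always visit the closest unvisited city.
    # Fast, usually close to optimal, no guarantee of how close.
    order, todo = [start], set(cities) - {start}
    while todo:
        order.append(min(todo, key=lambda c: dist(cities[order[-1]], cities[c])))
        todo.remove(order[-1])
    return order

best, greedy = exact(), nearest_neighbour()
print('exact  :', best, round(tour_length(best), 2))
print('greedy :', greedy, round(tour_length(greedy), 2))
# The heuristic tour can never beat the true optimum.
assert tour_length(greedy) >= tour_length(best)
```

The heuristic visits every city exactly once, so it avoids the pathological "A to B a million times" tour, while staying polynomial-time.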
Re: Blog about python 3
On Sun, Jan 5, 2014 at 1:27 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: But regardless of how fast your path-finder algorithm might become, you're unlikely to be satisfied with a solution that travels around in a circle from A to B a million times then shoots off straight to Z without passing through any of the other cities. On the flip side, that might be the best salesman your company has ever known, if those three cities have the most customers! ChrisA wondering why nobody cares about the customers in TSP discussions -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 2014-01-05 02:32, Chris Angelico wrote: On Sun, Jan 5, 2014 at 1:27 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: But regardless of how fast your path-finder algorithm might become, you're unlikely to be satisfied with a solution that travels around in a circle from A to B a million times then shoots off straight to Z without passing through any of the other cities. On the flip side, that might be the best salesman your company has ever known, if those three cities have the most customers! ChrisA wondering why nobody cares about the customers in TSP discussions Or, for that matter, ISP customers who don't live in an urban area. :-) -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
wxjmfa...@gmail.com wrote: The very interesting aspect in the way you are holding unicodes (strings). By comparing Python 2 with Python 3.3, you are comparing utf-8 with the internal representation of Python 3.3 (the flexible string representation). This is incorrect. Python 2 has never used UTF-8 internally for Unicode strings. In narrow builds, it uses UTF-16, but makes no allowance for surrogate pairs in strings. In wide builds, it uses UTF-32. Other implementations, such as Jython or IronPython, may do something else. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sun, Jan 5, 2014 at 1:41 PM, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: wxjmfa...@gmail.com wrote: The very interesting aspect in the way you are holding unicodes (strings). By comparing Python 2 with Python 3.3, you are comparing utf-8 with the internal representation of Python 3.3 (the flexible string representation). This is incorrect. Python 2 has never used UTF-8 internally for Unicode strings. In narrow builds, it uses UTF-16, but makes no allowance for surrogate pairs in strings. In wide builds, it uses UTF-32. That's for Python's unicode type. What Robin said was that they were using either a byte string (str) with UTF-8 data, or a Unicode string (unicode) with character data. So jmf was right, except that it's not specifically to do with Py2 vs Py3.3. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
I wrote: I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. In article 52c8c301$0$29998$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. Sure it is. Let's imagine you're building a system which sorts packages for delivery. You sort 1 million packages every night and put them on trucks going out for final delivery. Some assumptions: Every second I can cut from the sort time saves me $0.01. If I mis-sort a package, it goes out on the wrong truck, doesn't get discovered until the end of the day, and ends up costing me $5 (including not just the direct cost of redelivering it, but also factoring in ill will and having to make the occasional refund for not meeting the promised delivery time). I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. Let's see: 1 million packages x $0.10 = $100,000 saved per day because I sort them faster. 10,000 of them will go to the wrong place, and that will cost me $50,000 per day. By going fast and making mistakes once in a while, I increase my profit by $50,000 per day. The numbers above are fabricated, but I'm sure UPS, FedEx, and all the other package delivery companies are doing these sorts of analyses every day. I watch the UPS guy come to my house. He gets out of his truck, walks to my front door, rings the bell, waits approximately 5 microseconds, leaves the package on the porch, and goes back to his truck. I'm sure UPS has figured out that the amortized cost of the occasional stolen or lost package is less than the cost for the delivery guy to wait for me to come to the front door and sign for the delivery. Looking at another problem domain, let's say you're a contestant on Jeopardy. 
If you listen to the entire clue and spend 3 seconds making sure you know the correct answer before hitting the buzzer, it doesn't matter if you're right or wrong. Somebody else beat you to the buzzer, 2.5 seconds ago. Or, let's take an example from sports. I'm standing at home plate holding a bat. 60 feet away from me, the pitcher is about to throw a baseball towards me at darn close to 100 MPH (insert words like bowl and wicket as geographically appropriate). 400 ms later, the ball is going to be in the catcher's glove if you don't hit it. If you have an absolutely perfect algorithm for determining if it's a ball or a strike, which takes 500 ms to run, you're going back to the minor leagues. If you have a 300 ms algorithm which is right 75% of the time, you're heading to the hall of fame. -- https://mail.python.org/mailman/listinfo/python-list
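Roy's back-of-the-envelope sort-versus-mis-sort arithmetic can be written out explicitly (the dollar figures are his fabricated assumptions, not real logistics data):

```python
packages = 1_000_000       # sorted per night
saving_per_package = 0.10  # $0.10 saved by the faster algorithm
error_rate = 0.01          # 1% of packages mis-sorted
cost_per_error = 5.00      # $5 to redeliver a mis-sorted package

saved = packages * saving_per_package          # $100,000 per day
lost = packages * error_rate * cost_per_error  # $50,000 per day
net = saved - lost                             # $50,000 extra profit per day

print(f"saved ${saved:,.0f}, lost ${lost:,.0f}, net ${net:,.0f}")
```

The break-even point comes where error_rate * cost_per_error equals saving_per_package, i.e. at a 2% error rate under these numbers; anything below that favours the fast-but-sloppy sorter.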
Re: Blog about python 3
On Sun, Jan 5, 2014 at 8:50 AM, Roy Smith r...@panix.com wrote: I wrote: I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. In article 52c8c301$0$29998$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. Sure it is. Let's imagine you're building a system which sorts packages for delivery. You sort 1 million packages every night and put them on trucks going out for final delivery. Some assumptions: Every second I can cut from the sort time saves me $0.01. If I mis-sort a package, it goes out on the wrong truck, doesn't get discovered until the end of the day, and ends up costing me $5 (including not just the direct cost of redelivering it, but also factoring in ill will and having to make the occasional refund for not meeting the promised delivery time). I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. Let's see: 1 million packages x $0.10 = $100,000 saved per day because I sort them faster. 10,000 of them will go to the wrong place, and that will cost me $50,000 per day. By going fast and making mistakes once in a while, I increase my profit by $50,000 per day. The numbers above are fabricated, but I'm sure UPS, FedEx, and all the other package delivery companies are doing these sorts of analyses every day. I watch the UPS guy come to my house. He gets out of his truck, walks to my front door, rings the bell, waits approximately 5 microseconds, leaves the package on the porch, and goes back to his truck. I'm sure UPS has figured out that the amortized cost of the occasional stolen or lost package is less than the cost for the delivery guy to wait for me to come to the front door and sign for the delivery. Looking at another problem domain, let's say you're a contestant on Jeopardy. 
If you listen to the entire clue and spend 3 seconds making sure you know the correct answer before hitting the buzzer, it doesn't matter if you're right or wrong. Somebody else beat you to the buzzer, 2.5 seconds ago. Or, let's take an example from sports. I'm standing at home plate holding a bat. 60 feet away from me, the pitcher is about to throw a baseball towards me at darn close to 100 MPH (insert words like bowl and wicket as geographically appropriate). 400 ms later, the ball is going to be in the catcher's glove if you don't hit it. If you have an absolutely perfect algorithm for determining if it's a ball or a strike, which takes 500 ms to run, you're going back to the minor leagues. If you have a 300 ms algorithm which is right 75% of the time, you're heading to the hall of fame. Neat examples -- thanks! Only minor quibble: isn't the $5 cost of mis-sorting a gross underestimate? I am reminded of a passage of Dijkstra in Discipline of Programming -- something to this effect: He laments the fact that hardware engineers were not including overflow checks in machine ALUs. He explained as follows: if a test is moderately balanced (statistically speaking), a programmer will not mind writing an if statement. If however the test is very skew -- say if 99% times, else 1% -- he will tend to skimp on the test, producing 'buggy' code [EWD would never use the bad b word of course]. The cost equation for hardware is very different -- once the investment in the silicon is done with -- fixed cost albeit high -- there is no variable cost to executing that circuitry once or a zillion times. Moral of Story: Intel should take up FSR [Ducks and runs for cover] -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
In article mailman.4929.1388896998.18130.python-l...@python.org, Rustom Mody rustompm...@gmail.com wrote: On Sun, Jan 5, 2014 at 8:50 AM, Roy Smith r...@panix.com wrote: I wrote: I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. In article 52c8c301$0$29998$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. Sure it is. Let's imagine you're building a system which sorts packages for delivery. You sort 1 million packages every night and put them on trucks going out for final delivery. Some assumptions: Every second I can cut from the sort time saves me $0.01. If I mis-sort a package, it goes out on the wrong truck, doesn't get discovered until the end of the day, and ends up costing me $5 (including not just the direct cost of redelivering it, but also factoring in ill will and having to make the occasional refund for not meeting the promised delivery time). I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. Let's see: 1 million packages x $0.10 = $100,000 saved per day because I sort them faster. 10,000 of them will go to the wrong place, and that will cost me $50,000 per day. By going fast and making mistakes once in a while, I increase my profit by $50,000 per day. The numbers above are fabricated, but I'm sure UPS, FedEx, and all the other package delivery companies are doing these sorts of analyses every day. I watch the UPS guy come to my house. He gets out of his truck, walks to my front door, rings the bell, waits approximately 5 microseconds, leaves the package on the porch, and goes back to his truck. 
I'm sure UPS has figured out that the amortized cost of the occasional stolen or lost package is less than the cost for the delivery guy to wait for me to come to the front door and sign for the delivery. Looking at another problem domain, let's say you're a contestant on Jeopardy. If you listen to the entire clue and spend 3 seconds making sure you know the correct answer before hitting the buzzer, it doesn't matter if you're right or wrong. Somebody else beat you to the buzzer, 2.5 seconds ago. Or, let's take an example from sports. I'm standing at home plate holding a bat. 60 feet away from me, the pitcher is about to throw a baseball towards me at darn close to 100 MPH (insert words like bowl and wicket as geographically appropriate). 400 ms later, the ball is going to be in the catcher's glove if you don't hit it. If you have an absolutely perfect algorithm for determining if it's a ball or a strike, which takes 500 ms to run, you're going back to the minor leagues. If you have a 300 ms algorithm which is right 75% of the time, you're heading to the hall of fame. Neat examples -- thanks! Only minor quibble: isn't the $5 cost of mis-sorting a gross underestimate? I have no idea. Like I said, the numbers are all fabricated. I do have a friend who used to work for UPS. He told me lots of UPS efficiency stories. One of them had to do with mis-routed packages. IIRC, the process for dealing with a mis-routed package was to NOT waste any time trying to figure out why it was mis-routed. It was just thrown back into the input hopper to go through the whole system again. The sorting software kept track of how many times it had sorted a particular package. Only after N attempts (where N was something like 3), was it kicked out of the automated process for human intervention. -- https://mail.python.org/mailman/listinfo/python-list
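The retry policy Roy describes, re-queue a mis-sort blindly and only escalate to a human after N attempts, is easy to sketch (hypothetical code; the real UPS system is of course nothing this simple, and the names are invented):

```python
MAX_ATTEMPTS = 3  # after this many failed sorts, hand off to a human

def sort_package(package, sorter, attempts=MAX_ATTEMPTS):
    """Run a fallible automated sorter, re-queueing failures.

    `sorter` returns a destination truck, or None on a mis-sort.
    Returns (destination, number_of_attempts_used).
    """
    for attempt in range(1, attempts + 1):
        truck = sorter(package)
        if truck is not None:
            return truck, attempt
    # N strikes: kick it out of the automated process entirely.
    return 'MANUAL', attempts

# With a 1% per-pass failure rate, only about 1 package in a million
# ever reaches a human -- which is why not diagnosing individual
# failures is cheaper than investigating them.
```

Usage: `sort_package('pkg-42', my_sorter)` where `my_sorter` is any callable with the contract above.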
Re: Blog about python 3
Roy Smith wrote: I wrote: I realize I'm taking this statement out of context, but yes, sometimes fast is more important than correct. In article 52c8c301$0$29998$c3e8da3$54964...@news.astraweb.com, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote: Fast is never more important than correct. Sure it is. Sure it isn't. I think you stopped reading my post too early. None of your examples contradict what I am saying. They all involve exactly the same sort of compromise regarding correctness that I'm talking about, where you loosen what counts as correct for the purpose of getting extra speed. So, for example: Let's imagine you're building a system which sorts packages for delivery. You sort 1 million packages every night and put them on trucks going out for final delivery. What's your requirement, i.e. what counts as correct for the delivery algorithm being used? Is it that every parcel is delivered to the specified delivery address the first time? No it is not. What counts as correct for the delivery algorithm is something on the lines of "No less than 95% of parcels will be sorted correctly and delivered directly; no more than 5% may be mis-sorted at most three times" (or some similar requirement). It may even be that the requirements are even looser, e.g.: "No more than 1% of parcels will be lost/damaged/stolen/destroyed", in which case they don't care unless a particular driver loses or destroys more than 1% of his deliveries. But if it turns out that Fred is dumping every single one of his parcels straight into the river, the fact that he can make thirty deliveries in the time it takes other drivers to make one will not save his job. "But it's much faster to dump the parcels in the river" does not matter. What matters is that the deliveries are made within the bounds of allowable time and loss. Things get interesting when the people setting the requirements and the people responsible for meeting those requirements aren't able to agree. 
Then you have customers who complain that the software is buggy, and developers who complain that the customer requirements are impossible to provide. Sometimes they're both right. Looking at another problem domain, let's say you're a contestant on Jeopardy. If you listen to the entire clue and spend 3 seconds making sure you know the correct answer before hitting the buzzer, it doesn't matter if you're right or wrong. Somebody else beat you to the buzzer, 2.5 seconds ago. I've heard of Jeopardy, but never seen it. But I know about game shows, and in this case, what you care about is *winning the game*, not answering the questions correctly. Answering the questions correctly is only a means to the end, which is Win. If the rules allow it, your best strategy might even be to give wrong answers, every time! (It's not quite a game show, but the British quiz show QI is almost like that. The rules, if there are any, encourage *interesting* answers over correct answers. Occasionally that leads to panelists telling what can best be described as utter porkies[1].) If Jeopardy does not penalise wrong answers, the best strategy might be to jump in with an answer as quickly as possible, without caring too much about whether it is the right answer. But if Jeopardy penalises mistakes, then the best strategy might be to take as much time as you can to answer the question, and hope for others to make mistakes. That's often the strategy in Test cricket: play defensively, and wait for the opposition to make a mistake. Or, let's take an example from sports. I'm standing at home plate holding a bat. 60 feet away from me, the pitcher is about to throw a baseball towards me at darn close to 100 MPH (insert words like bowl and wicket as geographically appropriate). 400 ms later, the ball is going to be in the catcher's glove if you don't hit it. 
If you have an absolutely perfect algorithm for determining if it's a ball or a strike, which takes 500 ms to run, you're going back to the minor leagues. If you have a 300 ms algorithm which is right 75% of the time, you're heading to the hall of fame. And if you catch the ball, stick it in your pocket and race through all the bases, what's that? It's almost certainly faster than trying to play by the rules. If speed is all that matters, that's what people would do. But it isn't -- the correct strategy depends on many different factors, one of which is that you have a de facto time limit on deciding whether to swing or let the ball through. Your baseball example is no different from the example I gave before. Find the optimal path for the Travelling Salesman Problem in a week's time, versus Find a close to optimal path in three minutes is conceptually the same problem, with the same solution: an imperfect answer *now* can be better than a perfect answer *later*. [1] Porkies, or pork pies, from Cockney rhyming slang. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Sun, Jan 5, 2014 at 2:20 PM, Roy Smith r...@panix.com wrote: I've got a new sorting algorithm which is guaranteed to cut 10 seconds off the sorting time (i.e. $0.10 per package). The problem is, it makes a mistake 1% of the time. That's a valid line of argument in big business, these days, because we've been conditioned to accept low quality. But there are places where quality trumps all, and we're happy to pay for that. Allow me to expound two examples. 1) Amazon http://www.amazon.com/exec/obidos/ASIN/1782010165/evertype-20 I bought this book a while ago. It's about the size of a typical paperback. It arrived in a box too large for it on every dimension, with absolutely no packaging. I complained. Clearly their algorithm was: Most stuff will get there in good enough shape, so people can't be bothered complaining. And when they do complain, it's cheaper to ship them another for free than to debate with them on chat. Because that's what they did. Fortunately I bought the book for myself, not for a gift, because the *replacement* arrived in another box of the same size, with ... one little sausage for protection. That saved it in one dimension out of three, so it arrived only slightly used-looking instead of very used-looking. And this a brand new book. When I complained the second time, I was basically told any replacement we ship you will be exactly the same. Thanks. 2) Bad Monkey Productions http://kck.st/1bgG8Pl The cheapest the book itself will be is $60, and the limited edition early ones are more (I'm getting the gold level book, $200 for one of the first 25 books, with special sauce). The people producing this are absolutely committed to quality, as are the nearly 800 backers. If this project is delayed slightly in order to ensure that we get something fully awesome, I don't think there will be complaints. This promises to be a beautiful book that'll be treasured for generations, so quality's far FAR more important than the exact delivery date. 
I don't think we'll ever see type #2 become universal, for the same reason that people buy cheap Chinese imports in the supermarket rather than something that costs five times as much from a specialist. The expensive one might be better, but why bother? When the cheap one breaks, you just get another. The expensive one might fail too, so why take that risk? But it's always a tradeoff, and there'll always be a few companies around who offer the more expensive product. (We have a really high quality cheese slicer. It's still the best I've seen, after something like 20 years of usage.) Fast or right? It'd have to be really *really* fast to justify not being right, unless the lack of rightness is less than measurable (like representing time in nanoseconds - anything smaller than that is unlikely to be measurable on most computers). ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/2/2014 11:49 PM, Steven D'Aprano wrote: Robin Becker wrote: For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower For whatever reason is right, unfortunately there's no real way to tell from the limited information you give what that might be. Are you comparing a 2.7 wide or narrow build? Do your tests use any so-called astral characters (characters in the Supplementary Multilingual Planes, i.e. characters with ord() > 0xFFFF)? If I remember correctly, some early alpha(?) versions of Python 3.3 consistently ran Unicode operations a small but measurable amount slower than 3.2 or 2.7. That especially affected Windows. But I understand that this was sped up in the release version of 3.3. There was more speedup in 3.3.2 and possibly even more in 3.3.3, so OP should run the latter. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
It's time to understand the Character Encoding Models and the math behind it. Unicode does not differ from any other coding scheme. How? With a sheet of paper and a pencil. jmf -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Fri, Jan 3, 2014 at 9:10 PM, wxjmfa...@gmail.com wrote: It's time to understand the Character Encoding Models and the math behind it. Unicode does not differ from any other coding scheme. How? With a sheet of paper and a pencil. One plus one is two, therefore Python is better than Haskell. Four times five is twelve, and four times six is thirteen, and four times seven is enough to make Alice think she's Mabel, and London is the capital of Paris, and the crocodile cheerfully grins. Therefore, by obvious analogy, Unicode times new-style classes equals a 64-bit process. I worked that out with a sheet of paper and a pencil. The pencil was a little help, but the paper was three sheets in the wind. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 02/01/2014 18:25, David Hutto wrote: Just because it's 3.3 doesn't matter...the main interest is in compatibility. Secondly, you used just one piece of code, which could be a fluke, try others, and check the PEP. You need to realize that even the older versions are being worked on, and they have to be refined. So if you have a problem, use the older and import from the future would be my suggestion Suggesting that I use another piece of code to test python3 against python2 is a bit silly. I'm sure I can find stuff which runs faster under python3, but reportlab is the code I'm porting and that is going the wrong way. -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 02/01/2014 18:37, Terry Reedy wrote: On 1/2/2014 12:36 PM, Robin Becker wrote: I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. I can imagine that this was not fun. indeed :) For fairly sensible reasons we changed the internal default to use unicode rather than bytes. Do you mean 'from __future__ import unicode_literals'? No, previously we had default of utf8 encoded strings in the lower levels of the code and we accepted either unicode or utf8 string literals as inputs to text functions. As part of the port process we made the decision to change from default utf8 str (bytes) to default unicode. Am I correct in thinking that this change increases the capabilities of reportlab? For instance, easily producing an article with abstracts in English, Arabic, Russian, and Chinese? It's made no real difference to what we are able to produce or accept since utf8 or unicode can encode anything in the input and what can be produced depends on fonts mainly. After doing all that and making the tests ... I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? The new unicode implementation in 3.3 is faster for some operations and slower for others. It is definitely more space efficient, especially compared to a wide build system. It is definitely less buggy, especially compared to a narrow build system. Do your tests use any astral (non-BMP) chars? If so, do they pass on narrow 2.7 builds (like on Windows)? I'm not sure if we have any non-bmp characters in the tests. Simple CJK etc etc for the most part. 
I'm fairly certain we don't have any ability to handle composed glyphs (multi-codepoint) etc etc For one thing, indexing and slicing just works on all machines for all unicode strings. Code for 2.7 and 3.3 either a) does not index or slice, b) does not work for all text on 2.7 narrow builds, or c) has extra conditional code only for 2.7. probably -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list
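A boundary helper along the lines Robin describes, accepting either utf8 bytes or unicode at the API edge and normalising to unicode internally, might look like this (a hypothetical sketch in Python 3 terms, not ReportLab's actual code; the name `as_unicode` is invented):

```python
def as_unicode(value, enc='utf-8'):
    """Accept either bytes (assumed utf-8 encoded) or str; return str.

    Mirrors the "accept unicode or utf8 string literals" convention
    Robin mentions for text-function inputs; anything else fails loudly
    rather than being silently coerced.
    """
    if isinstance(value, bytes):
        return value.decode(enc)
    if isinstance(value, str):
        return value
    raise TypeError('expected str or bytes, got %s' % type(value).__name__)

# Both spellings of 'café' normalise to the same internal string.
assert as_unicode(b'caf\xc3\xa9') == as_unicode('caf\xe9') == 'caf\xe9'
```

Centralising the conversion in one function like this is what makes a dual 2.7/3.3 codebase tractable: only the boundary layer knows about bytes at all.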
Re: Blog about python 3
On 02/01/2014 23:57, Antoine Pitrou wrote: .. Running a test suite is a completely broken benchmarking methodology. You should isolate workloads you are interested in and write a benchmark simulating them. I'm certain you're right, but individual bits of code like generating our reference manual also appear to be slower in 3.3. Regards Antoine. -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 03/01/2014 09:01, Terry Reedy wrote: There was more speedup in 3.3.2 and possibly even more in 3.3.3, so OP should run the latter. python 3.3.3 is what I use on windows. As for astral / non-bmp etc etc that's almost irrelevant for the sort of tests we're doing which are mostly simple english text. -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
In article mailman.4850.1388752146.18130.python-l...@python.org, Robin Becker ro...@reportlab.com wrote: On 03/01/2014 09:01, Terry Reedy wrote: There was more speedup in 3.3.2 and possibly even more in 3.3.3, so OP should run the latter. python 3.3.3 is what I use on windows. As for astral / non-bmp etc etc that's almost irrelevant for the sort of tests we're doing which are mostly simple english text. The sad part is, if you're accepting any text from external sources, you need to be able to deal with astral. I was doing a project a while ago importing 20-something million records into a MySQL database. Little did I know that FOUR of those records contained astral characters (which MySQL, at least the version I was using, couldn't handle). My way of dealing with those records was to nuke them. Longer term we ended up switching to Postgres. -- https://mail.python.org/mailman/listinfo/python-list
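Roy's four problem records could have been flagged up front with a one-line scan for astral characters, i.e. code points above the Basic Multilingual Plane (a sketch; the sample records are invented):

```python
def has_astral(text):
    # Astral (supplementary-plane) characters have ord() > 0xFFFF and
    # need 4 bytes in UTF-8 -- exactly what MySQL's 3-byte 'utf8'
    # column type cannot store.
    return any(ord(ch) > 0xFFFF for ch in text)

records = ['plain ascii', 'caf\xe9', 'clef: \U0001d11e', 'emoji \U0001f600']
bad = [r for r in records if has_astral(r)]
print(len(bad), 'records contain astral characters')  # 2 of the 4 samples
```

Running such a filter before the import turns a mid-load failure into an explicit decision about what to do with the offending rows.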
Re: Blog about python 3
On Sat, Jan 4, 2014 at 1:57 AM, Roy Smith r...@panix.com wrote: I was doing a project a while ago importing 20-something million records into a MySQL database. Little did I know that FOUR of those records contained astral characters (which MySQL, at least the version I was using, couldn't handle). My way of dealing with those records was to nuke them. Longer term we ended up switching to Postgress. Look! Postgres means you don't lose data!! Seriously though, that's a much better long-term solution than destroying data. But MySQL does support the full Unicode range - just not in its UTF8 type. You have to specify UTF8MB4 - that is, maximum bytes 4 rather than the default of 3. According to [1], the UTF8MB4 encoding is stored as UTF-16, and UTF8 is stored as UCS-2. And according to [2], it's even possible to explicitly choose the mindblowing behaviour of UCS-2 for a data type that calls itself UTF8, so that a vague theoretical subsequent version of MySQL might be able to make UTF8 mean UTF-8, and people can choose to use the other alias. To my mind, this is a bug with backward-compatibility concerns. That means it can't be fixed in a point release. Fine. But the behaviour change is this used to throw an error, now it works. Surely that can be fixed in the next release. Or surely a version or two of deprecating UTF8 in favour of the two MB? types (and never ever returning UTF8 from any query), followed by a reintroduction of UTF8 as an alias for MB4, and the deprecation of MB3. Or am I spoiled by the quality of Python (and other) version numbering, where I can (largely) depend on functionality not changing in point releases? ChrisA [1] http://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8mb4.html [2] http://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8mb3.html -- https://mail.python.org/mailman/listinfo/python-list
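The mb3/mb4 distinction Chris describes comes down to UTF-8 byte lengths, which is easy to verify (plain Python; the 'utf8mb4' and 'utf8mb3' names are MySQL's):

```python
# BMP characters need at most 3 bytes in UTF-8; astral characters need 4.
# A 3-bytes-per-character 'utf8' column can therefore hold the euro sign
# but not, say, a treble clef or an emoji.
samples = {
    'a': 1,           # ASCII
    '\xe9': 2,        # U+00E9, latin-1 range
    '\u20ac': 3,      # U+20AC, BMP
    '\U0001d11e': 4,  # U+1D11E, astral
}
for ch, nbytes in samples.items():
    assert len(ch.encode('utf-8')) == nbytes
print('all UTF-8 widths as expected')
```

This is why Roy's four astral records failed on a legacy 'utf8' column while the 3-byte BMP data sailed through.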
Re: Blog about python 3
On 01/03/2014 02:24 AM, Chris Angelico wrote: I worked that out with a sheet of paper and a pencil. The pencil was a little help, but the paper was three sheets in the wind. Beautiful! -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/3/2014 7:28 AM, Robin Becker wrote: On 03/01/2014 09:01, Terry Reedy wrote: There was more speedup in 3.3.2 and possibly even more in 3.3.3, so OP should run the latter. python 3.3.3 is what I use on windows. As for astral / non-bmp etc etc that's almost irrelevant for the sort of tests we're doing which are mostly simple english text. If you do not test the cases where 2.7 is buggy and requires nasty workarounds, then I can understand why you do not so much appreciate 3.3 ;-). -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 03/01/2014 22:00, Terry Reedy wrote: On 1/3/2014 7:28 AM, Robin Becker wrote: On 03/01/2014 09:01, Terry Reedy wrote: There was more speedup in 3.3.2 and possibly even more in 3.3.3, so OP should run the latter. python 3.3.3 is what I use on windows. As for astral / non-bmp etc etc that's almost irrelevant for the sort of tests we're doing which are mostly simple english text. If you do not test the cases where 2.7 is buggy and requires nasty workarounds, then I can understand why you do not so much appreciate 3.3 ;-). Are you crazy? Surely everybody prefers fast but incorrect code in preference to something that is correct but slow? Except that Python 3.3.3 is often faster. And always (to my knowledge) correct. Upper Class Twit of the Year anybody? :) -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 02/01/2014 17:36, Robin Becker wrote: On 31/12/2013 15:41, Roy Smith wrote: I'm using 2.7 in production. I realize that at some point we'll need to upgrade to 3.x. We'll keep putting that off as long as the effort + dependencies + risk metric exceeds the perceived added value metric. We too are using python 2.4 - 2.7 in production. Different clients migrate at different speeds. To be honest, the perceived added value in 3.x is pretty low for us. What we're running now works. Switching to 3.x isn't going to increase our monthly average users, or our retention rate, or decrease our COGS, or increase our revenue. There's no killer features we need. In summary, the decision to migrate will be driven more by risk aversion, when the risk of staying on an obsolete, unsupported platform, exceeds the risk of moving to a new one. Or, there will be some third-party module that we must have which is no longer supported on 2.x. +1 If I were starting a new project today, I would probably start it in 3.x. +1 I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? So far the superiority of 3.3 escapes me, but I'm tasked with enjoying this process so I'm sure there must be some new 'feature' that will help. Perhaps 'yield from' or 'raise from None' or ... 
In any case I think we will be maintaining python 2.x code for at least another 5 years; the version gap is then a real hindrance. Of interest https://mail.python.org/pipermail/python-dev/2012-October/121919.html ? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 31/12/2013 15:41, Roy Smith wrote: I'm using 2.7 in production. I realize that at some point we'll need to upgrade to 3.x. We'll keep putting that off as long as the effort + dependencies + risk metric exceeds the perceived added value metric. We too are using python 2.4 - 2.7 in production. Different clients migrate at different speeds. To be honest, the perceived added value in 3.x is pretty low for us. What we're running now works. Switching to 3.x isn't going to increase our monthly average users, or our retention rate, or decrease our COGS, or increase our revenue. There's no killer features we need. In summary, the decision to migrate will be driven more by risk aversion, when the risk of staying on an obsolete, unsupported platform, exceeds the risk of moving to a new one. Or, there will be some third-party module that we must have which is no longer supported on 2.x. +1 If I were starting a new project today, I would probably start it in 3.x. +1 I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? So far the superiority of 3.3 escapes me, but I'm tasked with enjoying this process so I'm sure there must be some new 'feature' that will help. Perhaps 'yield from' or 'raise from None' or ... 
In any case I think we will be maintaining python 2.x code for at least another 5 years; the version gap is then a real hindrance. -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Just because it's 3.3 doesn't matter...the main interest is in compatibility. Secondly, you used just one piece of code, which could be a fluke, try others, and check the PEP. You need to realize that even the older versions are being worked on, and they have to be refined. So if you have a problem, use the older version and import from __future__ would be my suggestion. On Thu, Jan 2, 2014 at 12:36 PM, Robin Becker ro...@reportlab.com wrote: On 31/12/2013 15:41, Roy Smith wrote: I'm using 2.7 in production. I realize that at some point we'll need to upgrade to 3.x. We'll keep putting that off as long as the effort + dependencies + risk metric exceeds the perceived added value metric. We too are using python 2.4 - 2.7 in production. Different clients migrate at different speeds. To be honest, the perceived added value in 3.x is pretty low for us. What we're running now works. Switching to 3.x isn't going to increase our monthly average users, or our retention rate, or decrease our COGS, or increase our revenue. There's no killer features we need. In summary, the decision to migrate will be driven more by risk aversion, when the risk of staying on an obsolete, unsupported platform, exceeds the risk of moving to a new one. Or, there will be some third-party module that we must have which is no longer supported on 2.x. +1 If I were starting a new project today, I would probably start it in 3.x. +1 I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. 
However, for whatever reason the python 3.3 version runs slower 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? So far the superiority of 3.3 escapes me, but I'm tasked with enjoying this process so I'm sure there must be some new 'feature' that will help. Perhaps 'yield from' or 'raise from None' or ... In any case I think we will be maintaining python 2.x code for at least another 5 years; the version gap is then a real hindrance. -- Robin Becker -- https://mail.python.org/mailman/listinfo/python-list -- Best Regards, David Hutto *CEO:* *http://www.hitwebdevelopment.com http://www.hitwebdevelopment.com* -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 1/2/2014 12:36 PM, Robin Becker wrote: I just spent a large amount of effort porting reportlab to a version which works with both python2.7 and python3.3. I have a large number of functions etc which handle the conversions that differ between the two pythons. I imagine that this was not fun. [For those who do not know, reportlab produces pdf documents.] For fairly sensible reasons we changed the internal default to use unicode rather than bytes. Do you mean 'from __future__ import unicode_literals'? Am I correct in thinking that this change increases the capabilities of reportlab? For instance, easily producing an article with abstracts in English, Arabic, Russian, and Chinese? After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower. Python 3 is slower in some things, like integer arithmetic with small ints. 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? The new unicode implementation in 3.3 is faster for some operations and slower for others. It is definitely more space efficient, especially compared to a wide build system. It is definitely less buggy, especially compared to a narrow build system. Do your tests use any astral (non-BMP) chars? If so, do they pass on narrow 2.7 builds (like on Windows)? So far the superiority of 3.3 escapes me, For one thing, indexing and slicing just works on all machines for all unicode strings. Code for 2.7 and 3.3 either a) does not index or slice, b) does not work for all text on 2.7 narrow builds, or c) has extra conditional code only for 2.7. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
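Terry's two points — the space saving and the uniform indexing — can both be checked interactively on 3.3+. A small sketch (the exact getsizeof figures vary by Python version and platform, so they are not hard-coded here):

```python
import sys

# PEP 393: each string uses the narrowest of 1-, 2- or 4-byte code
# units that can hold its widest character, so these three strings
# of equal length take very different amounts of memory.
for s in ('a' * 1000, '\u20ac' * 1000, '\U0001F600' * 1000):
    print(len(s), sys.getsizeof(s))

# And indexing just works: an astral character is one character.
print(len('\U0001F600'))  # 1 on 3.3+ (a narrow 2.x build reports 2)
```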
Re: Blog about python 3
Hi, Robin Becker robin at reportlab.com writes: For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s Running a test suite is a completely broken benchmarking methodology. You should isolate workloads you are interested in and write a benchmark simulating them. Regards Antoine. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Robin Becker wrote: For fairly sensible reasons we changed the internal default to use unicode rather than bytes. After doing all that and making the tests compatible etc etc I have a version which runs in both and passes all its tests. However, for whatever reason the python 3.3 version runs slower For whatever reason is right, unfortunately there's no real way to tell from the limited information you give what that might be. Are you comparing a 2.7 wide or narrow build? Do your tests use any so-called astral characters (characters in the Supplementary Multilingual Planes, i.e. characters with ord() > 0xFFFF)? If I remember correctly, some early alpha(?) versions of Python 3.3 consistently ran Unicode operations a small but measurable amount slower than 3.2 or 2.7. That especially affected Windows. But I understand that this was sped up in the release version of 3.3. There are some operations with Unicode strings in 3.3 which unavoidably are slower. If you happen to hit a combination of such operations (mostly to do with creating lots of new strings and then throwing them away without doing much work) your code may turn out to be a bit slower. But that's a pretty artificial set of code. Generally, test code doesn't make good benchmarks. Tests only get run once, in arbitrary order, it spends a lot of time setting up and tearing down test instances, there are all sorts of confounding factors. This plays merry hell with modern hardware optimizations. In addition, it's quite possible that you're seeing some other slowdown (the unittest module?) and misinterpreting it as related to string handling. But without seeing your entire code base and all the tests, who can say for sure? 2.7 Ran 223 tests in 66.578s 3.3 Ran 223 tests in 75.703s I know some of these tests are fairly variable, but even for simple things like paragraph parsing 3.3 seems to be slower. Since both use unicode internally it can't be that can it, or is python 2.7's unicode faster? 
Faster in some circumstances, slower in others. If your application bottleneck is the availability of RAM for strings, 3.3 will potentially be faster, since its strings can take as little as a quarter of the memory. If your application doesn't use much memory, or if it uses lots of strings which get created then thrown away, it may instead be a little slower. So far the superiority of 3.3 escapes me, Yeah I know, I resisted migrating from 1.5 to 2.x for years. When I finally migrated to 2.3, at first I couldn't see any benefit either. New style classes? Super? Properties? Unified ints and longs? Big deal. Especially since I was still writing 1.5 compatible code and couldn't really take advantage of the new features. When I eventually gave up on supporting versions pre-2.3, it was a load off my shoulders. Now I can't wait to stop supporting 2.4 and 2.5, which will make things even easier. And when I can ignore everything below 3.3 will be a truly happy day. but I'm tasked with enjoying this process so I'm sure there must be some new 'feature' that will help. Perhaps 'yield from' or 'raise from None' or ... No, you have this completely backwards. New features don't help you support old versions of Python that lack those new features. New features are an incentive to drop support for old versions. In any case I think we will be maintaining python 2.x code for at least another 5 years; the version gap is then a real hindrance. Five years sounds about right. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
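Steven's "wide or narrow build?" question can be answered programmatically; sys.maxunicode identifies the build being benchmarked:

```python
import sys

# 0xFFFF on a narrow 2.x build (e.g. the standard Windows 2.7
# installer); 0x10FFFF on a wide build and on every 3.3+ Python,
# where the wide/narrow distinction no longer exists.
print(hex(sys.maxunicode))
if sys.maxunicode == 0xFFFF:
    print('narrow build: astral characters are surrogate pairs')
else:
    print('wide build or 3.3+: one code point per character')
```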
Re: Blog about python 3
On 30/12/2013 22:38, Ethan Furman wrote: On 12/30/2013 01:29 PM, Mark Lawrence wrote: On 30/12/2013 20:49, Steven D'Aprano wrote: On Mon, 30 Dec 2013 19:41:44 +0000, Mark Lawrence wrote: http://alexgaynor.net/2013/dec/30/about-python-3/ may be of interest to some of you. I don't know whether to thank you for the link, or shout at you for sending eyeballs to look at such a pile of steaming bullshit. http://nuitka.net/posts/re-about-python-3.html is a response. Wow -- another steaming pile! Mark, are you going for a record? ;) -- ~Ethan~ I wasn't, but I am now: http://blog.startifact.com/posts/alex-gaynor-on-python-3.html. It says "The Python core developers somewhat gleefully slammed the door shut on Python 2.8 back in 2011, though.", which refers to PEP 404, which I mentioned a month or so ago. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Steven D'Aprano wrote: On Mon, 30 Dec 2013 19:41:44 +, Mark Lawrence wrote: http://alexgaynor.net/2013/dec/30/about-python-3/ may be of interest to some of you. [...] I'd like to know where Alex gets the idea that the transition of Python 2 to 3 was supposed to be a five year plan. As far as I know, it was a ten year plan, I haven't been able to find anything in writing from Guido or the core developers stating that the transition period was expected to be ten years, although I haven't looked very hard. I strongly recall it being discussed, so unless you want to trawl the python-dev mailing list, you'll just have to take my word on it *wink* PEP 3000 makes it clear that Guido van Rossum expected the transition period to be longer than usual: [quote] I expect that there will be parallel Python 2.x and 3.x releases for some time; the Python 2.x releases will continue for a longer time than the traditional 2.x.y bugfix releases. Typically, we stop releasing bugfix versions for 2.x once version 2.(x+1) has been released. But I expect there to be at least one or two new 2.x releases even after 3.0 (final) has been released, probably well into 3.1 or 3.2. [end quote] http://www.python.org/dev/peps/pep-3000/ A five year transition period, as suggested by Alex Gaynor, simply makes no sense. Normal support for a single release is four or five years, e.g. Python 2.4 and 2.5 release schedules: * Python 2.4 alpha-1 was released on July 9 2004; the final security update was December 19 2008; * Python 2.5 alpha-1 was released on April 5 2006; the final security update was May 26 2011. (Dates may be approximate, especially the alpha dates. I'm taking them from PEP 320 and 356.) Now in fairness, Guido's comment about well into 3.1 or 3.2 turned out to be rather naive in retrospect. 
3.4 alpha has been released, and support for 2.7 is expected to continue for *at least* two more years: http://www.python.org/dev/peps/pep-0373/#maintenance-releases which means that 2.7 probably won't become unmaintained until 3.5 is out. In hindsight, this is probably a good thing. The early releases of 3.x made a few mistakes, and it's best to skip them and go straight to 3.3 or better, for example:
- 3.0 was buggy enough that support for it was dropped almost immediately;
- built-in function callable() is missing from 3.1;
- 3.1 and 3.2 both have exception chaining, but there's no way to suppress the chained exceptions until 3.3;
- 3.1 and 3.2 don't allow u'' strings for compatibility with 2.x.
The 2.8 un-release schedule goes into more detail about the transition, and why there won't be an official 2.8 blessed by the core developers: http://www.python.org/dev/peps/pep-0404/ (Python is open source -- nothing is stopping people from forking the language or picking up support for 2.7. I wonder how many Python3 naysayers volunteer to support 2.x once the core devs drop it?) As of June this year, over 75% of the top fifty projects hosted on PyPI supported Python 3: http://py3ksupport.appspot.com/ and the Python Wall Of Shame turned majority green, becoming the Wall Of Superpowers, at least six months ago. (Back in June, I noted that it had changed colour some time ago.) Alex's claim that almost no code is written for Python 3 is, well, I'll be kind and describe it as counter-factual. Alex points out that the companies he is speaking to have no plans to migrate to Python 3. Well, duh. In my experience, most companies don't even begin planning to migrate until six months *after* support has ended for the systems they rely on. (Perhaps a tiny exaggeration, but not much.) I won't speak for the Windows or Mac world, but in the Linux world, Python 3 usage depends on the main Linux distros. Yes, ArchLinux has been Python 3 for years now, but ArchLinux is bleeding edge. 
Fedora is likely to be the first mainstream distro to move to Python 3: https://fedoraproject.org/wiki/Changes/Python_3_as_Default Once Fedora moves, I expect Ubuntu will follow. Debian, Centos and RedHat will probably be more conservative, but they *will* follow, at their own pace. What are the alternatives? They're not going to drop Python, nor are they going to take over support of 2.x forever. (RedHat/Centos are still supporting Python 2.4 and possibly even 2.3, at least in name, but I haven't seen any security updates come through Centos for a long time.) Once the system Python is Python 3, the default, no-brainer choice for most Python coding will be Python 3. Alex says: [quote] Why aren't people using Python 3? First, I think it's because of a lack of urgency. Many years ago, before I knew how to program, the decision to have Python 3 releases live in parallel to Python 2 releases was made. In retrospect this was a mistake, it resulted in a complete lack of urgency for the community to move, and the lack of urgency has given way to lethargy. [end quote]
Re: Blog about python 3
Mark Lawrence wrote: http://blog.startifact.com/posts/alex-gaynor-on-python-3.html. I quote: ...perhaps a brave group of volunteers will stand up and fork Python 2, and take the incremental steps forward. This will have to remain just an idle suggestion, as I'm not volunteering myself. I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. I find it, hmmm, interesting, that so many of these Concerned People who say that they're worried about splitting the Python community[1] end up suggesting that we *split the community* into those who have moved forward to Python 3 and those who won't. [1] As if the community is a single amorphous group. It is not. It is made up of web developers using Zope or Django, and scientists using scipy, and linguists using NLTK, and system administrators using nothing but the stdlib, and school kids learning how to program, and professionals who know seventeen different programming languages, and EVE Online gamers using Stackless, and Java guys using Jython, and many more besides, most of whom are sure that their little tiny part of the Python ecosystem is representative of everyone else when in fact they hardly talk at all. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Mon, Dec 30, 2013 at 2:38 PM, Ethan Furman et...@stoneleaf.us wrote: Wow -- another steaming pile! Mark, are you going for a record? ;) Indeed. Every post that disagrees with my opinion and understanding of the situation is complete BS and a conspiracy to spread fear, uncertainty, and doubt. Henceforth I will explain few to no specific disagreements, nor will I give anyone the benefit of the doubt, because that would be silly. -- Devin -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
Steven D'Aprano steve+comp.lang.python at pearwood.info writes: I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. I find it, hmmm, interesting, that so many of these Concerned People who say that they're worried about splitting the Python community[1] end up suggesting that we *split the community* into those who have moved forward to Python 3 and those who won't. Indeed. This would be extremely destructive (not to mention alienating the people doing *actual* maintenance and enhancements on Python-and-its-stdlib, of which at least 95% are committed to the original plan for 3.x to slowly supercede 2.x). Regards Antoine. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
In article mailman.4753.1388499265.18130.python-l...@python.org, Antoine Pitrou solip...@pitrou.net wrote: Steven D'Aprano steve+comp.lang.python at pearwood.info writes: I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. I find it, hmmm, interesting, that so many of these Concerned People who say that they're worried about splitting the Python community[1] end up suggesting that we *split the community* into those who have moved forward to Python 3 and those who won't. Indeed. This would be extremely destructive (not to mention alienating the people doing *actual* maintenance and enhancements on Python-and-its-stdlib, of which at least 95% are committed to the original plan for 3.x to slowly supercede 2.x). Regards Antoine. I'm using 2.7 in production. I realize that at some point we'll need to upgrade to 3.x. We'll keep putting that off as long as the effort + dependencies + risk metric exceeds the perceived added value metric. I can't imagine the migration will happen in 2014. Maybe not even in 2015. Beyond that, my crystal ball only shows darkness. But, in any case, going with a fork of 2.x has zero appeal. Given the choice between effort + risk to move forward vs. effort + risk to move sideways, I'll move forward every time. To be honest, the perceived added value in 3.x is pretty low for us. What we're running now works. Switching to 3.x isn't going to increase our monthly average users, or our retention rate, or decrease our COGS, or increase our revenue. There's no killer features we need. In summary, the decision to migrate will be driven more by risk aversion, when the risk of staying on an obsolete, unsupported platform, exceeds the risk of moving to a new one. Or, there will be some third-party module that we must have which is no longer supported on 2.x. 
If I were starting a new project today, I would probably start it in 3.x. -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Wed, Jan 1, 2014 at 2:41 AM, Roy Smith r...@panix.com wrote: To be honest, the perceived added value in 3.x is pretty low for us. What we're running now works. Switching to 3.x isn't going to increase our monthly average users, or our retention rate, or decrease our COGS, or increase our revenue. There's no killer features we need. In summary, the decision to migrate will be driven more by risk aversion, when the risk of staying on an obsolete, unsupported platform, exceeds the risk of moving to a new one. Or, there will be some third-party module that we must have which is no longer supported on 2.x. The biggest killer feature for most deployments is likely to be that Unicode just works everywhere. Any new module added to Py3 can be back-ported to Py2 (with some amount of work - might be trivial, might be a huge job), and syntactic changes are seldom a killer feature, but being able to depend on *every single library function* working perfectly with the full Unicode range means you don't have to test every branch of your code. If that's not going to draw you, then yeah, there's not a lot to justify switching. You won't get more users, it'll increase your costs (though by a fixed amount, not an ongoing cost), and old code is trustworthy code, new code is bug city. If I were starting a new project today, I would probably start it in 3.x. And that's the right attitude (though I would drop the probably). Eventually it'll become more critical to upgrade (once Py2 security patches stop coming through, maybe), and when that day does finally come, you'll be glad you have just your 2013 codebases rather than the additional ones dating from 2014 and on until whatever day that is. The past is Py2; the future is Py3. In between, use whichever one makes better business sense. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
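What "Unicode just works everywhere" means concretely: on 3.3+ every index is a whole code point, so len() and slicing cannot split a character in half, whatever plane it comes from. A small sketch:

```python
s = 'a\U0001F600b'  # letter, astral emoji, letter

print(len(s))                      # 3 on 3.3+; 4 on a narrow 2.x build
print(s[1] == '\U0001F600')        # True: the emoji is one character
print(s[::-1] == 'b\U0001F600a')   # True: reversal can't corrupt it
```

On a narrow 2.x build the emoji occupies two surrogate code units, so s[1] would be half a character and the reversed string would contain a broken surrogate pair — exactly the class of bug that no longer needs testing for on 3.3+.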
Re: Blog about python 3
On 31/12/2013 15:41, Roy Smith wrote: In article mailman.4753.1388499265.18130.python-l...@python.org, Antoine Pitrou solip...@pitrou.net wrote: Steven D'Aprano steve+comp.lang.python at pearwood.info writes: I expect that as excuses for not migrating get fewer, and the deadline for Python 2.7 end-of-life starts to loom closer, more and more haters^W Concerned People will whine about the lack of version 2.8 and ask for *somebody else* to fork Python. I find it, hmmm, interesting, that so many of these Concerned People who say that they're worried about splitting the Python community[1] end up suggesting that we *split the community* into those who have moved forward to Python 3 and those who won't. Indeed. This would be extremely destructive (not to mention alienating the people doing *actual* maintenance and enhancements on Python-and-its-stdlib, of which at least 95% are committed to the original plan for 3.x to slowly supercede 2.x). Regards Antoine. I'm using 2.7 in production. I realize that at some point we'll need to upgrade to 3.x. We'll keep putting that off as long as the effort + dependencies + risk metric exceeds the perceived added value metric. Do you use any of the features that were backported from 3.x to 2.7, or could you have stayed with 2.6 or an even older version? -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Mon, 30 Dec 2013 19:41:44 +, Mark Lawrence wrote: http://alexgaynor.net/2013/dec/30/about-python-3/ may be of interest to some of you. I don't know whether to thank you for the link, or shout at you for sending eyeballs to look at such a pile of steaming bullshit. I'd like to know where Alex gets the idea that the transition of Python 2 to 3 was supposed to be a five year plan. As far as I know, it was a ten year plan, and we're well ahead of expectations of where we would be at this point of time. People *are* using Python 3, the major Linux distros are planning to move to Python 3, the Python Wall Of Shame stopped being a wall of shame a long time ago (I think it was a year ago? or at least six months ago). Alex's article is, basically, FUD. More comments will have to follow later. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 30/12/2013 20:49, Steven D'Aprano wrote: On Mon, 30 Dec 2013 19:41:44 +, Mark Lawrence wrote: http://alexgaynor.net/2013/dec/30/about-python-3/ may be of interest to some of you. I don't know whether to thank you for the link, or shout at you for sending eyeballs to look at such a pile of steaming bullshit. I'd like to know where Alex gets the idea that the transition of Python 2 to 3 was supposed to be a five year plan. As far as I know, it was a ten year plan, and we're well ahead of expectations of where we would be at this point of time. People *are* using Python 3, the major Linux distros are planning to move to Python 3, the Python Wall Of Shame stopped being a wall of shame a long time ago (I think it was a year ago? or at least six months ago). Alex's article is, basically, FUD. More comments will have to follow later. http://nuitka.net/posts/re-about-python-3.html is a response. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On 12/30/2013 01:29 PM, Mark Lawrence wrote: On 30/12/2013 20:49, Steven D'Aprano wrote: On Mon, 30 Dec 2013 19:41:44 +, Mark Lawrence wrote: http://alexgaynor.net/2013/dec/30/about-python-3/ may be of interest to some of you. I don't know whether to thank you for the link, or shout at you for sending eyeballs to look at such a pile of steaming bullshit. http://nuitka.net/posts/re-about-python-3.html is a response. Wow -- another steaming pile! Mark, are you going for a record? ;) -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Blog about python 3
On Tue, Dec 31, 2013 at 9:38 AM, Ethan Furman <et...@stoneleaf.us> wrote:
> On 12/30/2013 01:29 PM, Mark Lawrence wrote:
>> http://nuitka.net/posts/re-about-python-3.html is a response.
>
> Wow -- another steaming pile! Mark, are you going for a record? ;)

Does this steam? http://rosuav.blogspot.com/2013/12/about-python-3-response.html

ChrisA
Re: Blog about python 3
On 30/12/2013 22:38, Ethan Furman wrote:
> Wow -- another steaming pile! Mark, are you going for a record? ;)

Merely pointing out the existence of these little gems in order to find out people's feelings about them. You never know, we might even end up with a thread whereby the discussion is Python, the whole Python and nothing but the Python.

Mark Lawrence
Re: Blog about python 3
On Tue, Dec 31, 2013 at 3:38 PM, Mark Lawrence <breamore...@yahoo.co.uk> wrote:
> You never know, we might even end up with a thread whereby the discussion is Python, the whole Python and nothing but the Python.

What, on python-list??! [1] That would be a silly idea. We should avoid such theories with all vigor.

ChrisA

[1] In C, that would be interpreted as "What, on python-list|" and would confuse everyone.
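[Editor's note: Chris's footnote refers to C trigraphs. In C89 through C17, the three-character sequence `??!` is replaced by `|` in translation phase 1, before tokenization, even inside string literals (modern GCC only honors trigraphs with `-trigraphs` or `-std=c89`, and C23 removed them entirely). A simplified Python sketch of that phase-1 replacement, using the mapping table from the C standard -- an illustration, not a full implementation:]

```python
# C89 trigraph sequences: in translation phase 1, each "??x" sequence
# below is replaced by the corresponding single character.
TRIGRAPHS = {
    "??=": "#", "??/": "\\", "??'": "^",
    "??(": "[", "??)": "]", "??!": "|",
    "??<": "{", "??>": "}", "??-": "~",
}

def translate_trigraphs(source: str) -> str:
    """Apply C trigraph replacement as a C89 compiler would.

    Simplified: a real compiler scans left to right in a single
    pass, but sequential replacement of the nine distinct patterns
    gives the same result for ordinary inputs like this one.
    """
    for tri, char in TRIGRAPHS.items():
        source = source.replace(tri, char)
    return source

print(translate_trigraphs("What, on python-list??!"))
# -> What, on python-list|
```

Which is exactly the confusion the footnote is joking about: the `??!` the sender typed reaches a C89 reader as a plain `|`.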
Re: Blog about python 3
On 12/30/2013 08:25 PM, Devin Jeanpierre wrote:
> On Mon, Dec 30, 2013 at 2:38 PM, Ethan Furman <et...@stoneleaf.us> wrote:
>> Wow -- another steaming pile! Mark, are you going for a record? ;)
>
> Indeed. Every post that disagrees with my opinion and understanding of the situation is complete BS and a conspiracy to spread fear, uncertainty, and doubt. Henceforth I will explain few to no specific disagreements, nor will I give anyone the benefit of the doubt, because that would be silly.

Couldn't have said it better myself! Well, except for the "my opinion" part -- obviously it's not my opinion, but reality! ;)

--
~Ethan~
Re: Blog about python 3
On 31/12/2013 01:09, Chris Angelico wrote:
> Does this steam? http://rosuav.blogspot.com/2013/12/about-python-3-response.html

I'd have said restrained.

Mark Lawrence