On Monday, March 9, 2015 at 12:05:05 PM UTC+5:30, Steven D'Aprano wrote:
Chris Angelico wrote:
As to the notion of rejecting the construction of strings containing
these invalid codepoints, I'm not sure. Are there any languages out
there that have a Unicode string type that requires that
Ben Finney ben+pyt...@benfinney.id.au:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
'\udd00' should be a SyntaxError.
I find your argument convincing, that attempting to construct a
Unicode string of a lone surrogate should be an error.
Then we're back to square one:
On Mon, Mar 9, 2015 at 5:34 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
Chris Angelico wrote:
As to the notion of rejecting the construction of strings containing
these invalid codepoints, I'm not sure. Are there any languages out
there that have a Unicode string type that
Chris Angelico wrote:
As to the notion of rejecting the construction of strings containing
these invalid codepoints, I'm not sure. Are there any languages out
there that have a Unicode string type that requires that all
codepoints be valid (no surrogates, no U+FFFE, etc)?
U+FFFE and U+
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
Marko Rauhamaa wrote:
Chris Angelico ros...@gmail.com:
Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is not a valid character.
But it is a valid
On Mon, Mar 9, 2015 at 5:25 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
Perhaps the bug is not UTF-8's inability to encode lone
surrogates, but that Python allows you to create lone surrogates in the
first place. That's not a rhetorical question. It's a genuine question.
As
Rustom Mody wrote:
On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
This includes not just bug-prone-system code such as Java and Windows
but seemingly working code such as python 3.
What Unicode bugs do you think Python 3.3 and above have?
Marko Rauhamaa wrote:
Chris Angelico ros...@gmail.com:
Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is not a valid character.
But it is a valid non-character code point.
It is quite correct to throw this error.
'\udd00' is a
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
'\udd00' is a valid str object:
Is it though? Perhaps the bug is not UTF-8's inability to encode lone
surrogates, but that Python allows you to create lone surrogates in
the first place. That's not a rhetorical
Steven D'Aprano wrote:
Marko Rauhamaa wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python terms, there are bytes objects b that don't satisfy:
Chris Angelico ros...@gmail.com:
Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is not a valid character. It is quite
correct to throw this error.
'\udd00' is a valid str object:
>>> '\udd00'
'\udd00'
>>> '\udd00'.encode('utf-32')
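A minimal sketch of the behaviour being argued about, assuming CPython 3.3 or later: the lone surrogate is accepted as a str object, but the strict UTF-8 codec refuses to encode it. The surrogatepass handler at the end is not mentioned in the thread; it is included here only to show that the refusal is a policy of the codec and not a limit of the byte format.

```python
s = '\udd00'              # a lone low surrogate; constructing the str succeeds
assert len(s) == 1
assert ord(s) == 0xDD00

try:
    s.encode('utf-8')     # strict error handling is the default
    encoded_ok = True
except UnicodeEncodeError:
    encoded_ok = False

# Not from the thread: 'surrogatepass' tells the codec to emit the
# three-byte pattern for the surrogate anyway.
raw = s.encode('utf-8', errors='surrogatepass')
```

Whether the other codecs (UTF-16, UTF-32) accept the same string depends on the Python version, so the sketch makes no claim about them.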
On Sun, Mar 8, 2015 at 7:09 PM, Marko Rauhamaa ma...@pacujo.net wrote:
Chris Angelico ros...@gmail.com:
Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is not a valid character. It is quite
correct to throw this error.
'\udd00' is a valid
On Monday, March 9, 2015 at 7:39:42 AM UTC+5:30, Cameron Simpson wrote:
On 07Mar2015 22:09, Steven D'Aprano wrote:
Rustom Mody wrote:
[...big snip...]
Some parts are here, some are from earlier, and some are from my memory.
If any details are wrong, please correct them:
- 200 million records
- Containing 4 strings with
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
'\udd00' should be a SyntaxError.
I find your argument convincing, that attempting to construct a Unicode
string of a lone surrogate should be an error.
Shouldn't the error type be a ValueError, though? The statement is not,
to my
On 07Mar2015 22:09, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info
wrote:
Rustom Mody wrote:
[...big snip...]
Some parts are here, some are from earlier, and some are from my memory.
If any details are wrong, please correct them:
- 200 million records
- Containing 4 strings with SMP characters
- System made with python
On Mon, Mar 9, 2015 at 1:09 PM, Ben Finney ben+pyt...@benfinney.id.au wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
'\udd00' should be a SyntaxError.
I find your argument convincing, that attempting to construct a Unicode
string of a lone surrogate should be an error.
On Sun, Mar 8, 2015, at 22:09, Ben Finney wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info writes:
'\udd00' should be a SyntaxError.
I find your argument convincing, that attempting to construct a Unicode
string of a lone surrogate should be an error.
Shouldn't the error
Marko Rauhamaa wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
'\udd00' is a valid str object:
Is it though? Perhaps the bug is not UTF-8's inability to encode lone
surrogates, but that Python allows you to create lone surrogates in
the first place.
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
For those cases where you do wish to take an arbitrary byte stream and
round-trip it, Python now provides an error handler for that.
py> import random
py> b = bytes([random.randint(0, 255) for _ in range(1)])
py> s = b.decode('utf-8')
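The error handler referred to is presumably surrogateescape (PEP 383); that is an assumption, since the snippet is cut off before it names one. A minimal sketch of the round trip, with a 16-byte length chosen arbitrarily:

```python
import random

# Any byte string at all, valid UTF-8 or not.
b = bytes(random.randint(0, 255) for _ in range(16))

# Undecodable bytes 0x80..0xFF are smuggled into the str as the lone
# surrogates U+DC80..U+DCFF instead of raising UnicodeDecodeError.
s = b.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler turns those surrogates back into bytes.
round_tripped = s.encode('utf-8', errors='surrogateescape')
```

The resulting str is only meant to be carried around and encoded again; it is not guaranteed to be meaningful text.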
On Saturday, March 7, 2015 at 4:39:48 PM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
This includes not just bug-prone-system code such as Java and Windows but
seemingly working code such as python 3.
What Unicode bugs do you think Python 3.3 and above have?
Literal/Legalistic
On Sun, Mar 8, 2015 at 6:20 PM, Marko Rauhamaa ma...@pacujo.net wrote:
* it still isn't bijective between str and bytes:
>>> '\udd00'.encode('utf-8', errors='surrogateescape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't
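A short sketch of why this particular string fails, assuming CPython 3: surrogateescape only knows how to turn U+DC80..U+DCFF back into the bytes 0x80..0xFF, and U+DD00 lies outside that range. The helper name encodable is made up for the illustration.

```python
def encodable(s):
    """Return True if s can be encoded as UTF-8 with surrogateescape."""
    try:
        s.encode('utf-8', errors='surrogateescape')
        return True
    except UnicodeEncodeError:
        return False

inside = encodable('\udc80')    # within U+DC80..U+DCFF, maps back to byte 0x80
outside = encodable('\udd00')   # outside that range, so the codec still refuses
```

So the handler makes bytes -> str -> bytes lossless, but it does not make every str encodable.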
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python terms, there are bytes objects b that don't satisfy:
b.decode('utf-8').encode('utf-8') == b
Marko
--
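A minimal sketch of the property in question, assuming CPython 3 and the default strict error handling; the helper name round_trips is made up for the illustration.

```python
def round_trips(b):
    """Return True if b survives b.decode('utf-8').encode('utf-8') unchanged."""
    try:
        return b.decode('utf-8').encode('utf-8') == b
    except UnicodeDecodeError:
        # b is not a valid UTF-8 stream, so there is no str to map it to.
        return False

valid = round_trips('caf\u00e9'.encode('utf-8'))   # well-formed UTF-8
stray = round_trips(b'\x80')                       # a continuation byte on its own
overlong = round_trips(b'\xc0\x80')                # overlong encoding of NUL
```

The failing cases do not decode to a different string; they do not decode at all, which is the sense in which not every bytes object corresponds to a str.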
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python terms, there are bytes objects b that
On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Chris Angelico ros...@gmail.com:
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from
On 07/03/2015 16:25, Marko Rauhamaa wrote:
Chris Angelico ros...@gmail.com:
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective
Chris Angelico ros...@gmail.com:
On Sun, Mar 8, 2015 at 3:25 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Here's an example:
b = b'\x80'
Yes, it generates an exception. IOW, UTF-8 is not a
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:
There are two things happening here:
1) The underlying file system is not UTF-8, and you can't depend on
that,
Correct. Linux pathnames are octet strings regardless of the locale.
That's why Linux developers should
On Sun, Mar 8, 2015 at 5:34 AM, Dan Sommers d...@tombstonezero.net wrote:
I think we're all agreeing: not all file systems are the same, and
Python doesn't smooth out all of the bumps, even for something that
seems as simple as displaying the names of files in a directory. And
that's *after*
On 07/03/2015 16:48, Marko Rauhamaa wrote:
Mark Lawrence breamore...@yahoo.co.uk:
On 07/03/2015 16:25, Marko Rauhamaa wrote:
Here's an example:
b = b'\x80'
Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.
Python 2 might, Python
Dan Sommers d...@tombstonezero.net:
I think we're all agreeing: not all file systems are the same, and
Python doesn't smooth out all of the bumps, even for something that
seems as simple as displaying the names of files in a directory. And
that's *after* we've agreed that filesystems contain
On Sun, Mar 8, 2015 at 3:40 AM, Mark Lawrence breamore...@yahoo.co.uk wrote:
Here's an example:
b = b'\x80'
Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.
Python 2 might, Python 3 doesn't.
He was talking about this line of
Mark Lawrence breamore...@yahoo.co.uk:
It would clearly help if you were to type in the correct UK English
accent.
Your ad-hominem-to-contribution ratio is alarmingly high.
Marko
--
https://mail.python.org/mailman/listinfo/python-list
On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
See:
$ mkdir /tmp/xyz
$ touch /tmp/xyz/$'\x80'
$ python3
Python 3.3.2 (default, Dec 4 2014, 12:49:00)
[GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux
Type "help", "copyright", "credits" or "license" for more
On 07/03/2015 17:16, Marko Rauhamaa wrote:
Mark Lawrence breamore...@yahoo.co.uk:
It would clearly help if you were to type in the correct UK English
accent.
Your ad-hominem-to-contribution ratio is alarmingly high.
Marko
You've been a PITA ever since you first joined this list, what
On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Correct. Linux pathnames are
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa ma...@pacujo.net wrote:
You can't operate on file names and text files using Python strings. Or
at least, you will need to add (nontrivial) exception catching logic.
You can't operate on a JPG file using a Unicode string, nor an array
of integers.
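For the file-name half of this exchange, a hedged sketch assuming a POSIX system and CPython 3.2 or later: os.fsdecode and os.fsencode convert between the bytes the kernel stores and the str that Python hands out, and on POSIX they do so with the surrogateescape handler, so a name that is not valid in the locale's encoding still survives the trip. The byte value b'\x80' is only an example name.

```python
import os

raw_name = b'\x80'                 # a file name that is not valid UTF-8

# On POSIX, undecodable bytes become lone surrogates instead of raising.
as_str = os.fsdecode(raw_name)

# Encoding back gives exactly the bytes the file system holds.
back = os.fsencode(as_str)
```

The str form can be passed to open() or os.listdir(), but printing it or encoding it strictly as UTF-8 may still raise, which is the exception-handling burden mentioned above.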
Chris Angelico ros...@gmail.com:
If you really REALLY can't use the bytes() type to work with something
that is, yaknow, bytes, then you could use an alternative encoding
that has a value for every byte. It's still not Unicode text, so it
doesn't much matter which encoding you use. But it's
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Correct. Linux pathnames are octet strings regardless of the locale.
That's why Linux developers should refer to filenames using bytes.
Unfortunately, Python
On 07/03/2015 18:34, Dan Sommers wrote:
On Sun, 08 Mar 2015 05:13:09 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa
Mark Lawrence breamore...@yahoo.co.uk:
On 07/03/2015 16:25, Marko Rauhamaa wrote:
Here's an example:
b = b'\x80'
Yes, it generates an exception. IOW, UTF-8 is not a bijective mapping
from str objects to bytes objects.
Python 2 might, Python 3 doesn't.
Python 3.3.2 (default, Dec
On Sun, Mar 8, 2015 at 3:54 AM, Marko Rauhamaa ma...@pacujo.net wrote:
All you've proven is that there are bit patterns which are not UTF-8
streams...
And that causes problems.
Demonstrate.
ChrisA
Chris Angelico ros...@gmail.com:
On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
File names encoded with Latin-X are quite commonplace even in UTF-8
locales.
That is not a problem with UTF-8, though. I don't understand how
you're blaming UTF-8 for that.
I'm saying it
On Sun, Mar 8, 2015 at 5:02 AM, Dan Sommers d...@tombstonezero.net wrote:
On Sun, 08 Mar 2015 04:59:56 +1100, Chris Angelico wrote:
On Sun, Mar 8, 2015 at 4:50 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Correct. Linux pathnames are octet strings regardless of the locale.
That's why Linux
--- Original Message -
From: Chris Angelico ros...@gmail.com
To:
Cc: python-list@python.org python-list@python.org
Sent: Saturday, March 7, 2015 6:26 PM
Subject: Re: Newbie question about text encoding
On Sun, Mar 8, 2015 at 4:14 AM, Marko Rauhamaa ma...@pacujo.net wrote:
See
On Sat, 07 Mar 2015 19:00:47 +, Mark Lawrence wrote:
Isn't pathlib
https://docs.python.org/3/library/pathlib.html#module-pathlib
effectively a more recent attempt at smoothing or even removing (some
of) the bumps? Has anybody here got experience of it as I've never
used it?
I almost
Chris Angelico ros...@gmail.com:
On Sun, Mar 8, 2015 at 2:48 AM, Marko Rauhamaa ma...@pacujo.net wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python
On Sat, Mar 7, 2015 at 10:09 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
Stop using MySQL, which is a joke of a database[1], and use Postgres which
does not have this problem.
I agree with the recommendation, though to be fair to MySQL, it is now
possible to store full
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Rustom Mody wrote:
My conclusion: Early adopters of unicode -- Windows and Java -- were
punished for their early adoption. You can blame the unicode
consortium, you can blame the babel of human languages, particularly
that some use
On 07/03/2015 12:02, Chris Angelico wrote:
On Sat, Mar 7, 2015 at 10:53 PM, Marko Rauhamaa ma...@pacujo.net wrote:
The main dream was a fixed-width encoding scheme. People thought 16 bits
would be enough. The dream is so precious and true to us in the West
that people don't want to give it up.
On 07/03/2015 11:09, Steven D'Aprano wrote:
Rustom Mody wrote:
This includes not just bug-prone-system code such as Java and Windows but
seemingly working code such as python 3.
What Unicode bugs do you think Python 3.3 and above have?
Methinks somebody has been drinking too much loony
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
As far as I am aware, every code point has one and only one valid UTF-8
encoding, and every UTF-8 encoding has one and only one valid code point.
There are *invalid* UTF-8
Rustom Mody wrote:
On Thursday, March 5, 2015 at 7:36:32 PM UTC+5:30, Steven D'Aprano wrote:
[...]
Chris is suggesting that going from BMP to all of Unicode is not the hard
part. Going from ASCII to the BMP part of Unicode is the hard part. If
you can do that, you can go the rest of the way
On Sat, Mar 7, 2015 at 10:53 PM, Marko Rauhamaa ma...@pacujo.net wrote:
The main dream was a fixed-width encoding scheme. People thought 16 bits
would be enough. The dream is so precious and true to us in the West
that people don't want to give it up.
So... use Pike, or Python 3.3+?
ChrisA
--
On Saturday, March 7, 2015 at 11:41:53 AM UTC+5:30, Terry Reedy wrote:
On 3/6/2015 11:20 AM, Rustom Mody wrote:
=
pp =
print (pp)
=
Try opening it in idle3 and you get (at least I get):
$ idle3 ff.py
Traceback (most recent call last):
  File "/usr/bin/idle3",
On Saturday, March 7, 2015 at 11:49:44 PM UTC+5:30, Mark Lawrence wrote:
On 07/03/2015 17:16, Marko Rauhamaa wrote:
Mark Lawrence:
It would clearly help if you were to type in the correct UK English
accent.
Your ad-hominem-to-contribution ratio is alarmingly high.
Marko
Marko Rauhamaa wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info:
Marko Rauhamaa wrote:
That said, UTF-8 does suffer badly from its not being
a bijective mapping.
Can you explain?
In Python terms, there are bytes objects b that don't satisfy:
On Sat, Mar 7, 2015 at 1:03 AM, random...@fastmail.us wrote:
On Fri, Mar 6, 2015, at 08:39, Chris Angelico wrote:
Number of code points is the most logical way to length-limit
something. If you want to allow users to set their display names but
not to make arbitrarily long ones, limiting them
On Fri, Mar 6, 2015, at 09:11, Chris Angelico wrote:
To prevent people from putting three paragraphs of lipsum in and
calling it a username.
Limiting by UTF-8 bytes or UTF-16 units works just as well for that.
So you truncate to the desired length, then if the first character of
the
On Sat, Mar 7, 2015 at 1:50 AM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
Rustom Mody wrote:
On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
[snip example of an analogous situation with NULs]
Strawman.
Sigh. If I had a dollar for every time
On Fri, Mar 6, 2015, at 08:39, Chris Angelico wrote:
Number of code points is the most logical way to length-limit
something. If you want to allow users to set their display names but
not to make arbitrarily long ones, limiting them to X code points is
the safest way (and preferably do an NFC
Rustom Mody wrote:
On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
[snip example of an analogous situation with NULs]
Strawman.
Sigh. If I had a dollar for every time somebody cried Strawman! when what
they really should say is Yes, that's a good argument, I'm afraid
On Fri, Mar 6, 2015 at 8:02 PM, Rustom Mody rustompm...@gmail.com wrote:
Broken systems can be shown up by anything. Suppose you have a program
that breaks when it gets a NUL character (not unknown in C code); is
the fault with the Unicode consortium for allocating something at
codepoint 0, or
On Fri, Mar 6, 2015, at 04:06, Rustom Mody wrote:
Also:
Can a programmer who is away from UTF-16 in one part of the system (say
by using python3) assume he is safe all over?
The most common failure of UTF-16 support, supposedly, is in programs
misusing the number of code units (for length or
On Sat, Mar 7, 2015 at 12:33 AM, random...@fastmail.us wrote:
However, when do you _really_ want the number of characters? You may
want to use it for, for example, the number of columns in a 'monospace'
font, which you've already screwed up because you haven't accounted for
double-wide
random...@fastmail.us wrote:
My point is there are very few
problems to which count of Unicode code points is the only right
answer - that UTF-32 is good enough for but that are meaningfully
impacted by a naive usage of UTF-16, to the point where UTF-16 is
something you have to be safe from.
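To make the three candidate length measures concrete, a small sketch assuming CPython 3.3 or later; the sample string is arbitrary, one ASCII letter followed by one SMP character.

```python
s = 'a\U0001F600'     # 'a' plus U+1F600, which lies outside the BMP

code_points = len(s)                            # what len() counts on 3.3+
utf16_units = len(s.encode('utf-16-le')) // 2   # the SMP character needs a surrogate pair
utf8_bytes = len(s.encode('utf-8'))             # 1 byte for 'a', 4 for U+1F600
```

All three numbers put an upper bound on the input; they differ only in how much each character is charged, and none of them is a count of display columns.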
On Friday, March 6, 2015 at 8:20:22 PM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
[snip example of an analogous situation with NULs]
Strawman.
Sigh. If I had a dollar for every time somebody cried
On Sat, Mar 7, 2015 at 3:20 AM, Rustom Mody rustompm...@gmail.com wrote:
C's string is not bug-prone; it's plain buggy, as it cannot represent strings
with NULs.
I would not go that far for UTF-16.
It is bug-inviting but it can also be implemented correctly
C's standard library string handling
On Friday, March 6, 2015 at 2:33:11 PM UTC+5:30, Rustom Mody wrote:
Let's please stick to UTF-16, shall we?
Now tell me:
- Is it broken or not?
- Is it widely used or not?
- Should programmers be careful of it or not?
- Should programmers be warned about it or not?
Also:
Can a programmer
On Friday, March 6, 2015 at 10:50:35 AM UTC+5:30, Chris Angelico wrote:
On Fri, Mar 6, 2015 at 3:53 PM, Rustom Mody wrote:
My conclusion: Early adopters of unicode -- Windows and Java -- were
punished for their early adoption. You can blame the unicode consortium,
you can blame the
On Friday, March 6, 2015 at 3:24:48 PM UTC+5:30, Chris Angelico wrote:
On Fri, Mar 6, 2015 at 8:02 PM, Rustom Mody wrote:
Broken systems can be shown up by anything. Suppose you have a program
that breaks when it gets a NUL character (not unknown in C code); is
the fault with the Unicode
On 3/6/2015 11:20 AM, Rustom Mody wrote:
=
pp =
print (pp)
=
Try opening it in idle3 and you get (at least I get):
$ idle3 ff.py
Traceback (most recent call last):
  File "/usr/bin/idle3", line 5, in <module>
    main()
  File "/usr/lib/python3.4/idlelib/PyShell.py", line 1562, in
On Thu, Mar 5, 2015, at 09:06, Steven D'Aprano wrote:
I mostly agree with Chris. Supporting *just* the BMP is non-trivial in
UTF-8 and UTF-32, since that goes against the grain of the system. You
would have to program in artificial restrictions that otherwise don't exist.
UTF-8 is already
random...@fastmail.us wrote:
On Thu, Mar 5, 2015, at 09:06, Steven D'Aprano wrote:
I mostly agree with Chris. Supporting *just* the BMP is non-trivial in
UTF-8 and UTF-32, since that goes against the grain of the system. You
would have to program in artificial restrictions that otherwise
Rustom Mody wrote:
On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
It lists some examples of software that somehow break/goof going from
BMP-only unicode to 7.0 unicode.
IOW the suggestion is that the
On Fri, Mar 6, 2015 at 3:53 PM, Rustom Mody rustompm...@gmail.com wrote:
My conclusion: Early adopters of unicode -- Windows and Java -- were punished
for their early adoption. You can blame the unicode consortium, you can
blame the babel of human languages, particularly that some use
On Thursday, March 5, 2015 at 7:36:32 PM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
It lists some examples of software that somehow break/goof going
On Wed, Mar 4, 2015 at 5:03 AM, Rustom Mody rustompm...@gmail.com wrote:
What I was trying to say expanded here
http://blog.languager.org/2015/03/whimsical-unicode.html
[Hope the word 'whimsical' is less jarring and more accurate than
'gibberish']
Re footnote #4: ½ is a single character for
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
On 2/26/2015 8:24 AM, Chris Angelico wrote:
On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
Wrote something up on why we should stop using ASCII:
http://blog.languager.org/2015/02/universal-unicode.html
I
On 3/3/2015 1:03 PM, Rustom Mody wrote:
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
You should add emoticons, but not call them or the above 'gibberish'.
I think that this part of your post is more 'unprofessional' than the
character blocks. It is very jarring
On Wednesday, March 4, 2015 at 10:25:24 AM UTC+5:30, Chris Angelico wrote:
On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody wrote:
It lists some examples of software that somehow break/goof going from
BMP-only unicode to 7.0 unicode.
IOW the suggestion is that the two-way
On Wednesday, March 4, 2015 at 12:14:11 AM UTC+5:30, Chris Angelico wrote:
On Wed, Mar 4, 2015 at 5:03 AM, Rustom Mody wrote:
What I was trying to say expanded here
http://blog.languager.org/2015/03/whimsical-unicode.html
[Hope the word 'whimsical' is less jarring and more accurate than
Rustom Mody wrote:
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
On 2/26/2015 8:24 AM, Chris Angelico wrote:
On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
Wrote something up on why we should stop using ASCII:
On Wednesday, March 4, 2015 at 9:35:28 AM UTC+5:30, Rustom Mody wrote:
On Wednesday, March 4, 2015 at 8:24:40 AM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
On 2/26/2015 8:24 AM, Chris Angelico wrote:
On Wednesday, March 4, 2015 at 12:07:06 AM UTC+5:30, jmf wrote:
Le mardi 3 mars 2015 19:04:06 UTC+1, Rustom Mody a écrit :
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
On 2/26/2015 8:24 AM, Chris Angelico wrote:
On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody
On Wednesday, March 4, 2015 at 8:24:40 AM UTC+5:30, Steven D'Aprano wrote:
Rustom Mody wrote:
On Thursday, February 26, 2015 at 10:33:44 PM UTC+5:30, Terry Reedy wrote:
On 2/26/2015 8:24 AM, Chris Angelico wrote:
On Thu, Feb 26, 2015 at 11:40 PM, Rustom Mody wrote:
Wrote something up
On Wed, Mar 4, 2015 at 1:54 PM, Steven D'Aprano
steve+comp.lang.pyt...@pearwood.info wrote:
It is easy to mock what is not important to you. I daresay kids adding emoji
to their 10 character tweets would mock all the useless maths symbols in
Unicode too.
Definitely! Who ever sings do you wanna
On Wed, Mar 4, 2015 at 3:45 PM, Rustom Mody rustompm...@gmail.com wrote:
It lists some examples of software that somehow break/goof going from BMP-only
unicode to 7.0 unicode.
IOW the suggestion is that the two-way classification
- ASCII
- Unicode
is less useful and accurate than the
On 02/27/2015 06:54 AM, Steven D'Aprano wrote:
Dave Angel wrote:
On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
Dave Angel wrote:
(Although I believe Seymour Cray was quoted as saying that virtual
memory is a crock, because you can't fake what you ain't got.)
If I recall correctly, disk
On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 3:00 AM, alister
alister.nospam.w...@ntlworld.com wrote:
I think there is a case for bringing back the overlay file, or at least
loading larger programs in sections only loading the routines as they
are
On Sat, Feb 28, 2015 at 3:45 AM, alister
alister.nospam.w...@ntlworld.com wrote:
On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 3:00 AM, alister
alister.nospam.w...@ntlworld.com wrote:
I think there is a case for bringing back the overlay file, or at least
On 2015-02-27, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
Dave Angel wrote:
On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
Dave Angel wrote:
(Although I believe Seymour Cray was quoted as saying that virtual
memory is a crock, because you can't fake what you ain't got.)
On Sat, Feb 28, 2015 at 1:02 AM, Dave Angel da...@davea.name wrote:
The term virtual memory is used for many aspects of the modern memory
architecture. But I presume you're using it in the sense of running in a
swapfile as opposed to running in physical RAM.
Given that this started with a
On 2015-02-27, Grant Edwards invalid@invalid.invalid wrote:
On 2015-02-27, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
Dave Angel wrote:
On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
Dave Angel wrote:
(Although I believe Seymour Cray was quoted as saying that virtual
memory
On Sat, 28 Feb 2015 01:22:15 +1100, Chris Angelico wrote:
If you're trying to use the pagefile/swapfile as if it's more memory (I
have 256MB of memory, but 10GB of swap space, so that's 10GB of
memory!), then yes, these performance considerations are huge. But
suppose you need to run a
On Sat, Feb 28, 2015 at 3:00 AM, alister
alister.nospam.w...@ntlworld.com wrote:
I think there is a case for bringing back the overlay file, or at least
loading larger programs in sections
only loading the routines as they are required could speed up the start
time of many large applications.
On 02/27/2015 09:22 AM, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 1:02 AM, Dave Angel da...@davea.name wrote:
The term virtual memory is used for many aspects of the modern memory
architecture. But I presume you're using it in the sense of running in a
swapfile as opposed to running in
On 2015-02-27 16:45, alister wrote:
On Sat, 28 Feb 2015 03:12:16 +1100, Chris Angelico wrote:
On Sat, Feb 28, 2015 at 3:00 AM, alister
alister.nospam.w...@ntlworld.com wrote:
I think there is a case for bringing back the overlay file, or at least
loading larger programs in sections only
On 02/27/2015 11:00 AM, alister wrote:
On Sat, 28 Feb 2015 01:22:15 +1100, Chris Angelico wrote:
If you're trying to use the pagefile/swapfile as if it's more memory (I
have 256MB of memory, but 10GB of swap space, so that's 10GB of
memory!), then yes, these performance considerations are
On Sat, Feb 28, 2015 at 7:52 AM, Dave Angel da...@davea.name wrote:
If that's the case on the architectures you're talking about, then the
problem of slow loading is not triggered by the memory usage, but by lots of
initialization code. THAT's what should be deferred for seldom-used
portions
On Fri, 27 Feb 2015 19:14:00 +, MRAB wrote:
I suppose you could load the basic parts first so that the user can
start working, and then load the additional features in the background.
quite possible
my opinion on this is very fluid
it may work for some applications, it probably wouldn't