On Fri, 21 Jul 2017 06:05 pm, Chris Angelico wrote:
>> But emoji sequences will often require four code points, three of which will
>> be in the supplementary planes.
>>
>> http://unicode.org/emoji/charts/emoji-zwj-sequences.html
>
> "Often"? I doubt that; a lot of emoji don't require that many.
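For anyone wanting to check a given sequence rather than argue about "often", here is a minimal sketch that counts code points and how many of them lie outside the BMP. The helper name `describe` is made up for illustration, and the two sample strings are just examples, not a survey of the emoji charts.

```python
def describe(text):
    """Return (total code points, code points outside the BMP)."""
    total = len(text)  # Python 3.3+ str length is counted in code points
    supplementary = sum(1 for ch in text if ord(ch) > 0xFFFF)
    return total, supplementary

# A single-code-point emoji: U+1F600 GRINNING FACE
grinning_face = "\U0001F600"
# A ZWJ sequence: U+1F469 WOMAN + U+200D ZERO WIDTH JOINER + U+1F4BB PERSONAL COMPUTER
woman_technologist = "\U0001F469\u200D\U0001F4BB"

print(describe(grinning_face))
print(describe(woman_technologist))
```

The ZWJ itself is in the BMP, so only the pictographs on either side of it count as supplementary.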
On Fri, Jul 21, 2017 at 4:34 PM, Steve D'Aprano
wrote:
> On Fri, 21 Jul 2017 01:43 pm, Chris Angelico wrote:
>
>> Strings with all code
>> points on the BMP and no combining characters are still able to be
>> represented as they are today, again with the empty
On Fri, 21 Jul 2017 01:43 pm, Chris Angelico wrote:
> Strings with all code
> points on the BMP and no combining characters are still able to be
> represented as they are today, again with the empty secondary array.
I presume that since the problem we're trying to solve here is that certain
On Fri, Jul 21, 2017 at 1:20 PM, Steve D'Aprano
wrote:
> On Fri, 21 Jul 2017 04:05 am, Marko Rauhamaa wrote:
>
>> If any string code point is 1114112 or greater
>
> By definition, no Unicode code point can ever have an ordinal value greater
> than 0x10FFFF =
On Fri, 21 Jul 2017 04:05 am, Marko Rauhamaa wrote:
> If any string code point is 1114112 or greater
By definition, no Unicode code point can ever have an ordinal value greater than
0x10FFFF = 1114111.
So I don't know what you're talking about, but it isn't Unicode. If you want to
invent your
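The upper bound is easy to confirm from Python itself. A small sketch, assuming CPython 3.3 or later (where `sys.maxunicode` is always 0x10FFFF); the constant name `MAX_CODE_POINT` is only for illustration.

```python
import sys

MAX_CODE_POINT = 0x10FFFF  # highest code point Unicode defines

print(MAX_CODE_POINT)                    # decimal value of the limit
print(sys.maxunicode == MAX_CODE_POINT)  # the interpreter agrees

# 1114112 (0x110000) is one past the limit, so chr() refuses it.
try:
    chr(MAX_CODE_POINT + 1)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```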
Chris Angelico :
> Actually, the implementation I detailed was far SIMPLER than I thought
> it would be; I started writing that post trying to prove that it was
> impossible, but it turns out it isn't actually impossible. Just highly
> impractical.
The existing str
On Fri, Jul 21, 2017 at 2:10 AM, Random832 wrote:
> On Thu, Jul 20, 2017, at 01:15, Steven D'Aprano wrote:
>> I haven't really been paying attention to Marko's suggestion in detail,
>> but if we're talking about a whole new data type, how about a list of
>> nodes, where
On Fri, Jul 21, 2017 at 2:46 AM, Rhodri James wrote:
> On 20/07/17 16:18, Rustom Mody wrote:
>>
>> So coming to the point:
>> It's not whether Einstein or Mencken¹ is right but rather that Mencken
>> applies to 1 whereas Einstein applies to 3
>>
>> And (IMHO) text should
On 20/07/17 16:18, Rustom Mody wrote:
So coming to the point:
It's not whether Einstein or Mencken¹ is right but rather that Mencken applies
to 1 whereas Einstein applies to 3
And (IMHO) text should be squarely classed in 3 not 1
The gmas of this world have made shopping lists, written (and
On Thu, Jul 20, 2017, at 01:15, Steven D'Aprano wrote:
> I haven't really been paying attention to Marko's suggestion in detail,
> but if we're talking about a whole new data type, how about a list of
> nodes, where each node's data is a decomposed string object guaranteed to
> be either:
How
On Thursday, July 20, 2017 at 3:21:52 AM UTC+5:30, Rick Johnson wrote:
> On Tuesday, July 18, 2017 at 10:07:41 PM UTC-5, Steve D'Aprano wrote:
> > On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote:
>
> [...]
>
> > > Einstein: If you can't explain something to a six-year-
> > > old, you really
On Thu, 20 Jul 2017 12:40:08 +1000, Chris Angelico wrote:
> On Thu, Jul 20, 2017 at 12:12 PM, Steve D'Aprano
> wrote:
>> On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote:
>>
>>> Chris Angelico wrote:
>> [snip overly complex and complicated string implementation]
>>
On Thu, Jul 20, 2017 at 12:12 PM, Steve D'Aprano
wrote:
> On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote:
>
>> Chris Angelico wrote:
> [snip overly complex and complicated string implementation]
>
An accurate description, but in my own defense, I had misunderstood
On Thu, 20 Jul 2017 08:12 am, Gregory Ewing wrote:
> Chris Angelico wrote:
[snip overly complex and complicated string implementation]
> +1. We should totally do this just to troll the RUE!
You're an evil, wicked man, and I love it.
--
Steve
“Cheer up,” they said, “things could be worse.”
On Thu, 20 Jul 2017 01:30 am, Random832 wrote:
> On Tue, Jul 18, 2017, at 22:49, Steve D'Aprano wrote:
>> > What about Emoji?
>> > U+1F469 WOMAN is two columns wide on its own.
>> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
>> > U+200D ZERO WIDTH JOINER is zero columns wide on its
On Thu, 20 Jul 2017 04:34 am, Mikhail V wrote:
> It is also pretty obvious that these Caps make it harder to read in general.
> (more obvious than excessive diacritics, like in French)
No it isn't.
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough,
Random832 writes:
> On Tue, Jul 18, 2017, at 19:21, Gregory Ewing wrote:
> > Random832 wrote:
> > > What about Emoji?
> > > U+1F469 WOMAN is two columns wide on its own.
> > > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
>
> Emoji comes from Japanese 絵文字 -
On Wednesday, July 19, 2017 at 5:29:23 AM UTC-5, Rhodri James wrote:
> when Acorn were developing their version of extended ASCII
> in the late 80s, they asked three different University
> lecturers in Welsh what extra characters they needed, and
> got three different answers.
And who would have
On Wednesday, July 19, 2017 at 1:57:47 AM UTC-5, Steven D'Aprano wrote:
> On Wed, 19 Jul 2017 17:51:49 +1200, Gregory Ewing wrote:
>
> > Chris Angelico wrote:
> >> Once you NFC or NFD normalize both strings, identical strings will
> >> generally have identical codepoints... You should then be
On Tuesday, July 18, 2017 at 10:37:18 PM UTC-5, Steve D'Aprano wrote:
> On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote:
>
> > Ok, in this narrow context I can also agree.
> > But in slightly wider context that phrase may sound almost like:
> > "neither geometrical shape is better than the other as
On Tuesday, July 18, 2017 at 7:35:13 PM UTC-5, Mikhail V wrote:
> ChrisA wrote:
> >On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote:
> >> On 2017-07-18, Steve D'Aprano wrote:
> > > > _Neither system is right or wrong, or better than the
> > > > other._
> > >
> > >
On Tuesday, July 18, 2017 at 10:24:54 PM UTC-5, Steve D'Aprano wrote:
> On Wed, 19 Jul 2017 10:08 am, Ben Finney wrote:
>
> > Gregory Ewing writes:
> >
> > > The term "emoji" is becoming rather strained these days.
> > > The idea of "woman" and "personal computer"
Chris Angelico wrote:
* Strings with all codepoints < 256 are represented as they currently
are (one byte per char). There are no combining characters in the
first 256 codepoints anyway.
* Strings with all codepoints < 65536 and no combining characters,
ditto (two bytes per char).
* Strings with
Grant Edwards wrote:
Maybe it was a mistaken spelling of 'fortuned'?
Most likely. Interestingly, several sites claimed to be able to
tell me things about it. One of them tried to find poetry
related to it (didn't find any, though).
Another one offered to show me how to pronounce it, and it
On Tuesday, July 18, 2017 at 10:07:41 PM UTC-5, Steve D'Aprano wrote:
> On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote:
[...]
> > Einstein: If you can't explain something to a six-year-
> > old, you really don't understand it yourself.
> >
>
> [...]
>
> Think about it: it simply is nonsense. If
On 7/19/2017 4:28 AM, Steven D'Aprano wrote:
On Tue, 18 Jul 2017 10:11:39 -0400, Random832 wrote:
On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote:
Consider, for example, a Python source code
editor where you want to limit the length of the line based on the
number of characters more
Steven D'Aprano wrote:
>On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote:
>> Ok, in this narrow context I can also agree.
>> But in slightly wider context that phrase may sound almost like:
>> "neither geometrical shape is better than the other as a basis
>> for a wheel. If you have polygonal
On 2017-07-19 09:29, Marko Rauhamaa wrote:
Gregory Ewing :
Marko Rauhamaa wrote:
* a final "v" receives a superfluous "e" ("love")
It's not superfluous there, it's preventing "love" from looking like
it should rhyme with "of".
I'm pretty sure that wasn't the
On 19/07/17 04:19, Rustom Mody wrote:
> On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote:
>> Chris Angelico :
>>
>>> Let me give you one concrete example: the letter "ö". In English, it
>>> is (very occasionally) used to indicate diaeresis, where a pair of
>>> letters is
On Thu, Jul 20, 2017 at 1:45 AM, Marko Rauhamaa wrote:
> So let's assume we will expand str to accommodate the requirements of
> grapheme clusters.
>
> All existing code would still produce only traditional strings. The only
> way to introduce the new "super code points" is by
Chris Angelico :
> Now, this is a performance question, and it's not unreasonable to talk
> about semantics first and let performance wait for later. But when you
> consider how many ASCII-only strings Python uses internally (the names
> of basically every global function and
On Tue, Jul 18, 2017, at 22:49, Steve D'Aprano wrote:
> > What about Emoji?
> > U+1F469 WOMAN is two columns wide on its own.
> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
> > U+200D ZERO WIDTH JOINER is zero columns wide on its own.
>
>
> What about them? In a monospaced font,
On Tue, Jul 18, 2017, at 19:21, Gregory Ewing wrote:
> Random832 wrote:
> > What about Emoji?
> > U+1F469 WOMAN is two columns wide on its own.
> > U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
>
> The term "emoji" is becoming rather strained these days.
> The idea of "woman" and
On Wed, Jul 19, 2017 at 11:42 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> Perhaps we don't have the same understanding of "constant time". Or
>> are you saying that you actually store and represent this as those
>> arbitrary-precision integers? Every
Grant Edwards :
> On 2017-07-19, Gregory Ewing wrote:
>> Grant Edwards wrote:
>>>vacuum, continuum, squush, fortuuned
>>
>> Fortuuned? Where did you find that?
>
> It was in the scowl-7.1 wordlist I had laying around:
>
>
On 2017-07-19, Gregory Ewing wrote:
> Grant Edwards wrote:
>>vacuum, continuum, squush, fortuuned
>
> Fortuuned? Where did you find that?
It was in the scowl-7.1 wordlist I had laying around:
http://wordlist.aspell.net/
However, the scowl website now claims
Chris Angelico :
> Perhaps we don't have the same understanding of "constant time". Or
> are you saying that you actually store and represent this as those
> arbitrary-precision integers? Every character of every string has to
> be a multiprecision integer?
Yes, although feel
On Wed, Jul 19, 2017 at 10:13 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote:
>>> Here's a proposal:
>>>
>>>* introduce a built-in (predefined) class Text
>>>
>>>* conceptually,
Chris Angelico :
> On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote:
>> Here's a proposal:
>>
>>* introduce a built-in (predefined) class Text
>>
>>* conceptually, a Text object is a sequence of "real" characters
>>
>>* you can access each
On Wed, Jul 19, 2017 at 7:53 PM, Marko Rauhamaa wrote:
> Here's a proposal:
>
>* introduce a built-in (predefined) class Text
>
>* conceptually, a Text object is a sequence of "real" characters
>
>* you can access each "real" character by its position in O(1)
>
>
On 19/07/17 09:17, Steven D'Aprano wrote:
On Tue, 18 Jul 2017 16:37:37 +0100, Rhodri James wrote:
(For the record, one of my grandmothers would have been baffled by this
conversation, and the other one would have had definite opinions on
whether accents were distinct characters or not,
Chris Angelico :
> To be quite honest, I wouldn't care about that possibility. If I could
> design regex semantics purely from an idealistic POV, I would say that
> [xyzã], regardless of its encoding, will match any of the four
> characters "x", "y", "z", "ã".
>
> Earlier I
On Tue, 18 Jul 2017 10:11:39 -0400, Random832 wrote:
> On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote:
>> Consider, for example, a Python source code
>> editor where you want to limit the length of the line based on the
>> number of characters more typically than based on the number of
On Tue, 18 Jul 2017 16:37:37 +0100, Rhodri James wrote:
> (For the record, one of my grandmothers would have been baffled by this
> conversation, and the other one would have had definite opinions on
> whether accents were distinct characters or not, followed by a
> digression into whether "ŵ"
Gregory Ewing :
> Marko Rauhamaa wrote:
>> * a final "v" receives a superfluous "e" ("love")
>
> It's not superfluous there, it's preventing "love" from looking like
> it should rhyme with "of".
I'm pretty sure that wasn't the original motivation. If I had to guess,
Gregory Ewing :
> Marko Rauhamaa wrote:
>>> * the final consonant of a single-syllable word is doubled only if the
>>> consonant is "k", "l" or "s" ("kick", "kill", "kiss")
>>
>> ... or "f" ("stiff") or "z" ("buzz")
>
> or sometimes "r" ("burr"), or "t" ("butt").
Marko Rauhamaa wrote:
For all we know, someone somewhere might be cooking up a language that
depends on "q̈".
It makes perfectly good sense to me. It's the second derivative
of q with respect to time.
--
Greg
--
https://mail.python.org/mailman/listinfo/python-list
Marko Rauhamaa wrote:
* "v" is never doubled ("shovel")
Except for all the words that Grant listed before.
* a final "v" receives a superfluous "e" ("love")
It's not superfluous there, it's preventing "love" from looking like
it should rhyme with "of". (Of course you just have to know
On Wed, Jul 19, 2017 at 4:49 PM, Steven D'Aprano wrote:
> The *really* tricky part is if you receive a string from the user
> intended as a regular expression. If they provide
>
> [xyzã]
>
> as part of a regex, and you receive ã in denormalized form
>
> U+0061 LATIN SMALL
Marko Rauhamaa wrote:
* the final consonant of a single-syllable word is doubled only if the
consonant is "k", "l" or "s" ("kick", "kill", "kiss")
... or "f" ("stiff") or "z" ("buzz")
or sometimes "r" ("burr"), or "t" ("butt").
--
Greg
Grant Edwards wrote:
vacuum, continuum, squush, fortuuned
Fortuuned? Where did you find that?
Google gives me a bizarre set of results, none of which
appear to be an English dictionary definition.
--
Greg
On Wed, 19 Jul 2017 17:51:49 +1200, Gregory Ewing wrote:
> Chris Angelico wrote:
>> Once you NFC or NFD normalize both strings, identical strings will
>> generally have identical codepoints... You should then be able to use
>> normal regular expressions to match correctly.
>
> Except that if you
Chris Angelico wrote:
Once you NFC or NFD normalize both strings, identical strings will
generally have identical codepoints... You should then be able to use
normal regular expressions to match correctly.
Except that if you want to match a set of characters,
you can't reliably use [...], you
On Mon, 17 Jul 2017 04:12 am, Ben Finney wrote:
> Steven D'Aprano writes:
>
>> On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote:
>>
>> > And yet the ASCII and Unicode standard says code point 0x0A (U+000A
>> > LINE FEED) is a character, by definition.
>> [...]
>> > > Is
On Wed, 19 Jul 2017 10:34 am, Mikhail V wrote:
> Ok, in this narrow context I can also agree.
> But in slightly wider context that phrase may sound almost like:
> "neither geometrical shape is better than the other as a basis
> for a wheel. If you have polygonal wheels, they are still called
On Wed, 19 Jul 2017 10:08 am, Ben Finney wrote:
> Gregory Ewing writes:
>
>> The term "emoji" is becoming rather strained these days.
>> The idea of "woman" and "personal computer" being emotions
>> is an interesting one...
>
> I think of “emoji” as “not actually a
On Wed, 19 Jul 2017 12:10 am, Rustom Mody wrote:
> On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
>> On 17/07/17 05:10, Rustom Mody wrote:
>> > Hint1: Ask your grandmother whether unicode's notion of character makes
>> > sense. Ask 10 gmas from 10 language-L's
>> > Hint2:
On Wed, 19 Jul 2017 12:29 am, Random832 wrote:
> On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote:
>> In a *well-designed* *bug-free* monospaced font, all code points should
>> be either zero-width or one column wide. Or two columns, if the font
>> supports East Asian fullwidth characters.
>
On Tue, 18 Jul 2017 11:59 pm, Chris Angelico wrote:
>> (I don't think any native English words use a double-V or double-U, but the
>> possibility exists.)
>
> vacuum.
Nice. Also continuum and residuum.
For double V, we have savvy, skivvy, flivver (an old slang term for cars).
--
Steve
On Wednesday, July 19, 2017 at 3:00:21 AM UTC+5:30, Marko Rauhamaa wrote:
> Chris Angelico :
>
> > Let me give you one concrete example: the letter "ö". In English, it
> > is (very occasionally) used to indicate diaeresis, where a pair of
> > letters is not a double letter - for example,
On Wed, 19 Jul 2017 12:09 am, Random832 wrote:
> On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
>> What do you mean about regular expressions? You can use REs with
>> normalized strings. And if you have any valid definition of "real
>> character", you can use it equally on an
On Wed, Jul 19, 2017 at 10:34 AM, Mikhail V wrote:
> Ok, in this narrow context I can also agree.
> But in slightly wider context that phrase may sound almost like:
> "neither geometrical shape is better than the other as a basis
> for a wheel. If you have polygonal wheels,
ChrisA wrote:
>On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote:
>> On 2017-07-18, Steve D'Aprano wrote:
>>
>>> That's neither better nor worse than the system used by English and French,
>>> where letters with diacritics are not distinct letters, but guides to
>>>
Gregory Ewing writes:
> The term "emoji" is becoming rather strained these days.
> The idea of "woman" and "personal computer" being emotions
> is an interesting one...
I think of “emoji” as “not actually a character in any system anyone
would use for writing
Marko Rauhamaa wrote:
>What did you think of my concrete examples, then? (Say, finding
>"Alvárez" with the regular expression "Alv[aá]rez".)
I think that should match both "Alvarez" and "Alvárez" ...?
But firstly, I feel like I need to _guess_ what ideas you
are presenting. Unless I open up Vim
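To make the "Alv[aá]rez" case concrete, here is a rough sketch of what goes wrong with a decomposed name and how NFC-normalizing both sides helps for this particular pattern. `nfc_fullmatch` is a hypothetical helper written for this example, not anything from the `re` module, and normalizing a pattern string wholesale is an assumption that happens to be safe here rather than a general recipe.

```python
import re
import unicodedata

def nfc_fullmatch(pattern, text):
    """Hypothetical helper: NFC-normalize pattern and text, then full-match."""
    pattern = unicodedata.normalize("NFC", pattern)
    text = unicodedata.normalize("NFC", text)
    return re.fullmatch(pattern, text) is not None

pattern = "Alv[a\u00e1]rez"        # character class holds 'a' and precomposed U+00E1
decomposed_name = "Alva\u0301rez"  # 'a' followed by U+0301 COMBINING ACUTE ACCENT

# Without normalization the class matches the bare 'a', then the combining
# accent sits where the regex expects 'r', so the match fails.
print(re.fullmatch(pattern, decomposed_name))

# After NFC the accent is folded into U+00E1 and the class matches it.
print(nfc_fullmatch(pattern, decomposed_name))
print(nfc_fullmatch(pattern, "Alvarez"))
```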
Random832 wrote:
What about Emoji?
U+1F469 WOMAN is two columns wide on its own.
U+1F4BB PERSONAL COMPUTER is two columns wide on its own.
The term "emoji" is becoming rather strained these days.
The idea of "woman" and "personal computer" being emotions
is an interesting one...
--
Greg
Steve D'Aprano wrote:
(I don't think any native English words use a double-V or double-U, but the
possibility exists.)
vacuum
savvy
(Vacuum is arguably Latin, but we've been using it for long
enough that it's at least as English as most of the other
words we use.)
--
Greg
På Tue, 18 Jul 2017 11:27:03 -0400
Dennis Lee Bieber skrev:
> Probably would have to go to words predating the Roman occupation
> (which probably means a dialect closer to Welsh or other Gaelic).
> Everything later is an import (anglo-saxon being germanic tribes
Chris Angelico :
> Let me give you one concrete example: the letter "ö". In English, it
> is (very occasionally) used to indicate diaeresis, where a pair of
> letters is not a double letter - for example, "coöperate". (You can
> also hyphenate, "co-operate".) In German, it is
On Wed, Jul 19, 2017 at 6:05 AM, Mikhail V wrote:
> On 2017-07-18, Steve D'Aprano wrote:
>
>> That's neither better nor worse than the system used by English and French,
>> where letters with diacritics are not distinct letters, but guides to
On 2017-07-18, Steve D'Aprano wrote:
> That's neither better nor worse than the system used by English and French,
> where letters with diacritics are not distinct letters, but guides to
> pronunciation.
>_Neither system is right or wrong, or better than the
On Wed, Jul 19, 2017 at 4:56 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>> What I *think* you're asking for is for square brackets in a regex to
>> count combining characters with their preceding base character.
>
> Yes. My example tries to match a single
Chris Angelico :
> On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa wrote:
>> Chris Angelico :
>>
>>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
Yes. Also, not every letter can be normalized to a single
On Wed, Jul 19, 2017 at 4:31 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
>>> Yes. Also, not every letter can be normalized to a single codepoint so
>>> NFC is not a way out. For
Chris Angelico :
> On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
>> Yes. Also, not every letter can be normalized to a single codepoint so
>> NFC is not a way out. For example,
>>
>> re.match("^[q̈]$", "q̈")
>>
>> returns None regardless of
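The quoted example can be reproduced with explicit escapes, which avoids depending on how a mail client renders the combining mark. A minimal sketch, assuming "q̈" is spelled as 'q' followed by U+0308 COMBINING DIAERESIS (Unicode has no precomposed form for it):

```python
import re
import unicodedata

q_diaeresis = "q\u0308"  # 'q' + COMBINING DIAERESIS, two code points

# NFC has nothing to compose this pair into, so the length stays at 2.
nfc = unicodedata.normalize("NFC", q_diaeresis)
print(len(nfc))

# Inside [...] the two code points are two separate alternatives, so the
# class consumes exactly one of them and '$' then fails on the leftover.
result = re.match("^[" + q_diaeresis + "]$", q_diaeresis)
print(result)
```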
On Wed, Jul 19, 2017 at 3:01 AM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> what you're more likely to want is "match the letter á", and you don't
>> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
>> where Unicode normalization comes in.
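A short sketch of the two spellings of "á" mentioned above, using the standard `unicodedata` module. The variable names are just for illustration.

```python
import unicodedata

decomposed = "a\u0301"   # U+0061 LATIN SMALL LETTER A + U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE

# As raw code point sequences the two strings differ.
print(decomposed == precomposed)

# NFC composes, NFD decomposes; after either one the spellings agree.
as_nfc = unicodedata.normalize("NFC", decomposed)
as_nfd = unicodedata.normalize("NFD", precomposed)
print(as_nfc == precomposed)
print(as_nfd == decomposed)
```

Which form to pick matters less than picking one and applying it to both sides of a comparison.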
On 2017-07-18, Anders Wegge Keller wrote:
> På Tue, 18 Jul 2017 23:59:33 +1000
> Chris Angelico skrev:
>> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
>
>
>>> (I don't think any native English words use a double-V or double-U, but
>>> the possibility
On 18/07/17 17:03, Marko Rauhamaa wrote:
Random832:
As for double-v, a quick search through /usr/share/dict/words reveals
"civvies", "divvy", "revved/revving", "savvy" and "skivvy", and
various conjugations thereof. All following, more or less, the rule of
using a
Marko Rauhamaa :
> * the final consonant of a single-syllable word is doubled only if the
>consonant is "k", "l" or "s" ("kick", "kill", "kiss")
... or "f" ("stiff") or "z" ("buzz")
Marko
Chris Angelico :
> what you're more likely to want is "match the letter á", and you don't
> care whether it's represented as U+0061 U+0301 or as U+00E1. That's
> where Unicode normalization comes in.
Yes. Also, not every letter can be normalized to a single codepoint so
NFC is
On Wed, Jul 19, 2017 at 1:40 AM, Rhodri James wrote:
> On 18/07/17 16:27, Dennis Lee Bieber wrote:
>>
>> On Tue, 18 Jul 2017 10:38:48 -0400, Random832
>> declaimed the following:
>>
>>> Define "native" then. My interpretation of "native English
On Wed, Jul 19, 2017 at 12:09 AM, Random832 wrote:
> On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
>> What do you mean about regular expressions? You can use REs with
>> normalized strings. And if you have any valid definition of "real
>> character", you can use it
Random832 :
> As for double-v, a quick search through /usr/share/dict/words reveals
> "civvies", "divvy", "revved/revving", "savvy" and "skivvy", and
> various conjugations thereof. All following, more or less, the rule of
> using a double consonant after a short vowel in
On 18/07/17 16:27, Dennis Lee Bieber wrote:
On Tue, 18 Jul 2017 10:38:48 -0400, Random832
declaimed the following:
Define "native" then. My interpretation of "native English words" is
"anything you wouldn't have to put in italics to use in a sentence".
Which would also
On 18/07/17 15:10, Rustom Mody wrote:
On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
On 17/07/17 05:10, Rustom Mody wrote:
Hint1: Ask your grandmother whether unicode's notion of character makes sense.
Ask 10 gmas from 10 language-L's
Hint2: When in doubt gma usually is
On 2017-07-18, Steve D'Aprano wrote:
> (I don't think any native English words use a double-V or double-U, but the
> possibility exists.)
double-v:
flivver, navvy, bivvy, bevvy, trivvet, divvy, skivvy, skivvies,
etc. and various gerund and past tense verbs:
On Tue, Jul 18, 2017, at 10:23, Anders Wegge Keller wrote:
> På Tue, 18 Jul 2017 23:59:33 +1000
> Chris Angelico skrev:
> > On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
> >> (I don't think any native English words use a double-V or double-U, but
> >> the possibility exists.)
On Sun, Jul 16, 2017, at 01:37, Steven D'Aprano wrote:
> In a *well-designed* *bug-free* monospaced font, all code points should
> be either zero-width or one column wide. Or two columns, if the font
> supports East Asian fullwidth characters.
What about Emoji?
U+1F469 WOMAN is two columns wide
På Tue, 18 Jul 2017 23:59:33 +1000
Chris Angelico skrev:
> On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
>> (I don't think any native English words use a double-V or double-U, but
>> the possibility exists.)
> vacuum.
That's latin.
--
//Wegge
On Monday, July 17, 2017 at 10:14:00 PM UTC+5:30, Rhodri James wrote:
> On 17/07/17 05:10, Rustom Mody wrote:
> > Hint1: Ask your grandmother whether unicode's notion of character makes
> > sense.
> > Ask 10 gmas from 10 language-L's
> > Hint2: When in doubt gma usually is right
>
> "For every
On Fri, Jul 14, 2017, at 04:15, Marko Rauhamaa wrote:
> Consider, for example, a Python source code
> editor where you want to limit the length of the line based on the
> number of characters more typically than based on the number of pixels.
Even there you need to go based on the width in
On Fri, Jul 14, 2017, at 08:33, Chris Angelico wrote:
> What do you mean about regular expressions? You can use REs with
> normalized strings. And if you have any valid definition of "real
> character", you can use it equally on an NFC-normalized or
> NFD-normalized string than any other. They're
On Tue, Jul 18, 2017 at 11:11 PM, Steve D'Aprano
wrote:
> On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:
>
>> And just in case still its not clear: this is not
>> solved by adding dirt around the letter: if there is
>> enough significance of the phoneme distinction
On Tue, 18 Jul 2017 08:01 am, Mikhail V wrote:
> And just in case still its not clear: this is not
> solved by adding dirt around the letter: if there is
> enough significance of the phoneme distinction then
> one should add a distinct letter for a syntax in question.
It isn't "dirt", any more
Mikhail V :
> And just in case still its not clear: this is not solved by adding
> dirt around the letter: if there is enough significance of the phoneme
> distinction then one should add a distinct letter for a syntax in
> question.
The letters of Finnish are:
Steve D'Aprano wrote:
I don't think that it is even a given that "atomic units of language" exist. To
quote a Hindi speaker earlier in this thread, की is a letter, and yet it can be
decomposed into की = क + ई, so it isn't "atomic". If letters aren't atomic,
then what are?
They're like
ChrisA wrote:
>Yep! Nobody would take any notice of the fact that you just put dots
>on all those letters. It's not like it's going to make any difference
>to anything. We're not dealing with matters of life and death here.
>Oh wait.
On 17/07/17 05:10, Rustom Mody wrote:
Hint1: Ask your grandmother whether unicode's notion of character makes sense.
Ask 10 gmas from 10 language-L's
Hint2: When in doubt gma usually is right
"For every complex problem there is an answer that is clear, simple and
wrong." (H.L. Mencken).
On Tue, Jul 18, 2017 at 1:36 AM, Steve D'Aprano
wrote:
> On Mon, 17 Jul 2017 02:10 pm, Rustom Mody wrote:
>> Hint1: Ask your grandmother whether unicode's notion of character makes
>> sense.
>
> What on earth makes you think that my grandmother is a valid judge of
On Mon, 17 Jul 2017 02:10 pm, Rustom Mody wrote:
>> Please don't feed the trolls.
>
> Its usually called 'joke' Steven! Did the word fall out of your dictionary
> in the last upgrade?
> Rick was no more trolling than Marko
Funny you say that. I often think Marko is trolling, but if he is, he