Hello again everyone,
Though I initially took the shoo-away, there have been some comments
made since then that I feel compelled to rebut. To avoid spamming the
list, I’ve combined my responses into a single message.
Before that, I will say, again, for the record: I know this NOOP idea
is unlikel
Shawn Steele wrote:
> Even more complicated is that, as pointed out by others, it's pretty
> much impossible to say "these n codepoints should be ignored and have
> no meaning" because some process would try to use codepoints 1-3 for
> some private meaning. Another would use codepoint 1 for their
level protocols. Whether those be word breaking, sentence parsing,
formatting, buffer sizing or whatever.
-Shawn
-----Original Message-----
From: Unicode On Behalf Of Richard Wordingham via
Unicode
Sent: Wednesday, July 3, 2019 4:20 PM
To: unicode@unicode.org
Subject: Re: Unicode "no-op"
On Wed, 3 Jul 2019 17:51:29 -0400
"Mark E. Shoulson via Unicode" wrote:
> I think the idea being considered at the outset was not so complex as
> these (and indeed, the point of the character was to avoid making
> these kinds of decisions).
Shawn Steele appeared to be claiming that there was no
What you're asking for, then, is completely possible and achievable—but
not in the Unicode Standard. It's out of scope for Unicode, it sounds
like. You've said you realize it won't happen in Unicode, but it still
can happen. Go forth and implement it, then: make your higher-level
protocol an
, Richard Wordingham via Unicode wrote:
On Sat, 22 Jun 2019 23:56:50 +
Shawn Steele via Unicode wrote:
+ the list. For some reason the list's reply header is confusing.
From: Shawn Steele
Sent: Saturday, June 22, 2019 4:55 PM
To: Sławomir Osipiuk
Subject: RE: Unicode "no-op"
Philippe Verdy [mailto:verd...@wanadoo.fr]
*Sent:* Wednesday, July 03, 2019 04:49
*To:* Sławomir Osipiuk
*Cc:* unicode Unicode Discussion
*Subject:* Re: Unicode "no-op" Character?
Your goal is **impossible** to reach with Unicode. Assume such
character is "added" to the UCS, the
On 7/3/2019 10:47 AM, Sławomir Osipiuk via Unicode wrote:
Is my idea impossible, useless, or contradictory? Not at all.
What you are proposing is in the realm of higher-level protocols.
You could develop such a protocol, and then write processes that honored
it, or try to convince others to
On Wed, Jul 3, 2019 at 8:47 AM Sławomir Osipiuk via Unicode <
unicode@unicode.org> wrote:
> Security gateways filter it out completely, as a matter of best practice
> and security-in-depth.
>
>
>
> A process, let’s call it Process W, adds a bunch of U+000F to a string it
> received, or built, or a
, useless, or contradictory? Not at all.
From: Mark Davis ☕️ [mailto:m...@macchiato.com]
Sent: Wednesday, July 03, 2019 13:33
To: Sławomir Osipiuk
Cc: verdy_p; unicode Unicode Discussion
Subject: Re: Unicode "no-op" Character?
Your goal is not achievable. We can't wave a
ng gets overridden, not overloaded. That’s what makes it
> special.
>
>
>
> I don’t expect to see any of this in official Unicode. But I take
> exception to the idea that I’m suggesting something impossible.
>
>
>
>
>
> *From:* Philippe Verdy [mailto:verd...@wanado
take exception to
the idea that I’m suggesting something impossible.
From: Philippe Verdy [mailto:verd...@wanadoo.fr]
Sent: Wednesday, July 03, 2019 04:49
To: Sławomir Osipiuk
Cc: unicode Unicode Discussion
Subject: Re: Unicode "no-op" Character?
Your goal is **impossible** to
s used for arbitrary length integers or other variable length structures where terminator characters like 0x00 may be part of the data.
Sent: Wednesday, July 3, 2019 at 10:49
From: "Philippe Verdy via Unicode"
To: "Sławomir Osipiuk"
Cc: "unicode Unicode Di
Also consider that C0 controls (like STX and ETX) can already be used for
packetizing, but immediately comes the need for escaping (DLE has been used
for that goal, just before the character to preserve in the stream content,
notably before DLE itself, or STX and ETX).
There's then no need at all o
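The STX/ETX framing with DLE escaping described above can be sketched roughly as follows. This is only an illustration, assuming a byte-oriented stream; the function names frame and unframe are made up for the example and do not come from any standard.

```python
# C0 control codes used for framing (values from ASCII / ISO 646).
STX, ETX, DLE = 0x02, 0x03, 0x10


def frame(payload: bytes) -> bytes:
    """Wrap payload in STX ... ETX, putting DLE before any STX, ETX or DLE in it."""
    out = bytearray([STX])
    for b in payload:
        if b in (STX, ETX, DLE):
            out.append(DLE)  # escape byte goes just before the byte to preserve
        out.append(b)
    out.append(ETX)
    return bytes(out)


def unframe(packet: bytes) -> bytes:
    """Reverse frame(): drop STX/ETX and remove the DLE escapes."""
    if len(packet) < 2 or packet[0] != STX or packet[-1] != ETX:
        raise ValueError("missing STX/ETX framing")
    body = packet[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        b = body[i]
        if b == DLE:
            i += 1
            if i >= len(body):
                raise ValueError("dangling DLE at end of packet")
            out.append(body[i])  # escaped byte is taken literally
        elif b in (STX, ETX):
            raise ValueError("unescaped control byte inside packet")
        else:
            out.append(b)
        i += 1
    return bytes(out)
```

For example, unframe(frame(b"a\x02b")) gives back b"a\x02b", because the embedded STX travels as the pair DLE STX.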
On Wed, Jul 3, 2019 at 06:09, Sławomir Osipiuk
wrote:
> I don’t think you understood me at all. I can packetize a string with any
> character that is guaranteed not to appear in the text.
>
Your goal is **impossible** to reach with Unicode. Assume such character is
"added" to the UCS, then it
a tool like that would make some tasks much faster and
simpler. Your proposed solution doesn’t.
From: Philippe Verdy [mailto:verd...@wanadoo.fr]
Sent: Saturday, June 29, 2019 15:47
To: Sławomir Osipiuk
Cc: Shawn Steele; unicode Unicode Discussion
Subject: Re: Unicode "no-op" Charact
t of the noop character is
> the absence of a character).
>
>
>
> As should be obvious, I’m not recommending this as good practice.
>
>
>
>
>
> *From:* Shawn Steele [mailto:shawn.ste...@microsoft.com]
> *Sent:* Saturday, June 22, 2019 19:57
> *To:* Sławomir Osipiuk;
All right. Thanks to everyone who offered suggestions. I think the final
choice will depend on the specific application, if I ever face this puzzle
again.
If nothing else, this discussion has helped me formulate what exactly it is
I'm imagining, which is actually a bit different than what I star
On Mon, Jun 24, 2019 at 5:35 PM David Starner via Unicode <
unicode@unicode.org> wrote:
> On Sun, Jun 23, 2019 at 10:41 PM Shawn Steele via Unicode
> wrote:
>
> IMO, since it's unlikely that anyone expects
> that they can transmit a NUL through an arbitrary channel, unlike a
> random private use
On Sun, Jun 23, 2019 at 10:41 PM Shawn Steele via Unicode
wrote:
> Which leads us to the key. The desire is for a character that has no public
> meaning, but has some sort of private meaning. In other words it has a
> private use. Oddly enough, there is a group of characters intended for
> p
self, but I can still dream.
-----Original Message-----
From: Shawn Steele [mailto:shawn.ste...@microsoft.com]
Sent: Monday, June 24, 2019 01:39
To: Sławomir Osipiuk; unicode@unicode.org
Cc: 'Richard Wordingham'
Subject: RE: Unicode "no-op" Character?
But... it's not
Sent: Saturday, June 22, 2019 6:10 PM
To: unicode@unicode.org
Cc: 'Richard Wordingham'
Subject: RE: Unicode "no-op" Character?
That's the key to the no-op idea. The no-op character could not ever be assumed
to survive interchange with another process. It'd be canoni
: 'Richard Wordingham'
Subject: RE: Unicode "no-op" Character?
The string should always be sanitized before being checked for exploits
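A small sketch of why the order matters, assuming U+000F stands in for the hypothetical no-op (it is the code point used as an example elsewhere in this thread). The names sanitize and looks_dangerous, and the naive "<script" check, are invented for the illustration and are not a real filter.

```python
NOOP = "\u000F"  # assumption: placeholder for the hypothetical discardable character


def sanitize(text: str) -> str:
    """Drop every occurrence of the discardable character."""
    return text.replace(NOOP, "")


def looks_dangerous(text: str) -> bool:
    """Deliberately naive check, only here to show the ordering problem."""
    return "<script" in text.lower()


hidden = "<scr" + NOOP + "ipt>"

# Checking first misses the payload, because the no-op splits the keyword.
checked_first = looks_dangerous(hidden)

# Sanitizing first removes the no-op, so the same check now fires.
sanitized_first = looks_dangerous(sanitize(hidden))
```

Here checked_first is False and sanitized_first is True, which is the whole argument: a check that runs before the discardables are removed can be bypassed by inserting them.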
ly, it's too late now.
-----Original Message-----
From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Richard
Wordingham via Unicode
Sent: Sunday, June 23, 2019 04:37
To: unicode@unicode.org
Subject: Re: Unicode "no-op" Character?
Discardables are a security risk, as
On Sat, 22 Jun 2019 21:10:08 -0400
Sławomir Osipiuk via Unicode wrote:
> In fact, that might be the best description: It's not just an
> "ignorable", it's a "discardable". Unicode doesn't have that, does it?
No, though the byte order mark at the start of a file comes close.
Discardables are a se
On Sat, 22 Jun 2019 23:56:50 +
Shawn Steele via Unicode wrote:
> + the list. For some reason the list's reply header is confusing.
>
> From: Shawn Steele
> Sent: Saturday, June 22, 2019 4:55 PM
> To: Sławomir Osipiuk
> Subject: RE: Unicode "no-op" Cha
:unicode-boun...@unicode.org] On Behalf Of Richard
Wordingham via Unicode
Sent: Saturday, June 22, 2019 20:59
To: unicode@unicode.org
Cc: Shawn Steele
Subject: Re: Unicode "no-op" Character?
If they're conveying an invisible message, one would have to strip out
original ZWNBSP/WJ/
On Sat, 22 Jun 2019 23:56:11 +
Shawn Steele via Unicode wrote:
> Assuming you were using any of those characters as "markup", how
> would you know when they were intentionally in the string and not
> part of your marking system?
If they're conveying an invisible message, one would have to st
ice.
From: Shawn Steele [mailto:shawn.ste...@microsoft.com]
Sent: Saturday, June 22, 2019 19:57
To: Sławomir Osipiuk; unicode@unicode.org
Subject: RE: Unicode "no-op" Character?
+ the list. For some reason the list's reply header is confusing.
From: Shawn Steele
Sent: Saturda
o: unicode@unicode.org
Subject: Re: Unicode "no-op" Character?
On Sat, 22 Jun 2019 17:50:49 -0400
Sławomir Osipiuk via Unicode wrote:
> If faced with the same problem today, I’d probably just go with U+FEFF
> (really only need a single char, not a whole delimited substring) or
+ the list. For some reason the list's reply header is confusing.
From: Shawn Steele
Sent: Saturday, June 22, 2019 4:55 PM
To: Sławomir Osipiuk
Subject: RE: Unicode "no-op" Character?
The original comment about putting it between the base character and the
combining diacritic
On Sat, 22 Jun 2019 17:50:49 -0400
Sławomir Osipiuk via Unicode wrote:
> If faced with the same problem today, I’d
> probably just go with U+FEFF (really only need a single char, not a
> whole delimited substring) or a different C0 control (maybe SI/LS0)
> and clean up the string if it needs to b
convincing enough case for it.
From: J Decker [mailto:d3c...@gmail.com]
Sent: Saturday, June 22, 2019 17:19
To: Sławomir Osipiuk
Cc: Unicode Discussion
Subject: Re: Unicode "no-op" Character?
But it doesn't appear anything actually 'supports' that.
On Sat, Jun 22, 2019 at 2:04 PM Sławomir Osipiuk via Unicode <
unicode@unicode.org> wrote:
> I see there is no such character, which I pretty much expected after
> Google didn’t help.
>
>
>
> The original problem I had was solved long ago but the recent article
> about watermarking reminded me of
I see there is no such character, which I pretty much expected after Google
didn't help.
The original problem I had was solved long ago but the recent article about
watermarking reminded me of it, and my question was mostly out of curiosity.
The task wasn't, strictly speaking, about "padding",
Sławomir Osipiuk wrote:
> Does Unicode include a character that does nothing at all? I'm talking
> about something that can be used for padding data without affecting
> interpretation of other characters, including combining chars and
> ligatures.
I join Shawn Steele in wondering what your "data
Perhaps a codepoint from a private use area and another processing step to
add/remove them would work for you?
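A rough sketch of that suggestion, assuming U+E000 (the first code point of the BMP Private Use Area) as the marker. The choice of U+E000 and the names pad and strip_markers are arbitrary and only for illustration.

```python
MARKER = "\uE000"  # assumption: any private-use code point agreed on privately would do


def pad(text: str, every: int) -> str:
    """Insert MARKER after every `every` characters of text."""
    if every < 1:
        raise ValueError("every must be at least 1")
    if MARKER in text:
        # The objection raised in this thread: if the input already uses this
        # code point for something else, the markers can no longer be told apart.
        raise ValueError("input already contains the marker code point")
    return "".join(
        ch + MARKER if (i + 1) % every == 0 else ch
        for i, ch in enumerate(text)
    )


def strip_markers(text: str) -> str:
    """The extra processing step: remove every MARKER before further use."""
    return text.replace(MARKER, "")
```

With these definitions, strip_markers(pad("abcd", 2)) returns "abcd". The explicit check in pad is the price of using a private-use code point: nothing guarantees that another process has not picked the same one.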
On Sat, Jun 22, 2019, 1:39 AM Mark Davis ☕️ via Unicode
wrote:
> There is nothing like what you are describing. Examples:
>
> 1. Display — There are a few of the Default Ignorables tha
There is nothing like what you are describing. Examples:
1. Display — There are a few of the Default Ignorables that are always
treated as invisible, and have little effect on other characters. However,
even those will generally interfere with the display of sequences (be
between 'q' and
On Saturday, June 22, 2019 at 02:14, Sławomir Osipiuk via Unicode wrote:
Does Unicode include a character that does nothing at all? I'm
talking about something that can be used for padding data without
affecting interpretation of other characters, including combining
chars and ligatures. I.e. a char
Sounds like a great use for ZWNBSP (zero width non-breaking space) 0xFEFF
(Also used as BOM)
or that doesn't break; maybe 'ZERO WIDTH SPACE' (U+200B)
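For anyone who wants to look at these candidates, here is a quick sketch using Python's standard unicodedata module. U+2060 WORD JOINER is added to the list as an assumption, since it is the usual replacement for the non-BOM use of U+FEFF. The second half illustrates that a zero-width character is still a character: placed between a base letter and a combining mark, it stops the two from composing under NFC.

```python
import unicodedata

# Candidates: U+FEFF and U+200B from the message above, U+2060 added for comparison.
candidates = (0xFEFF, 0x200B, 0x2060)

for cp in candidates:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)} category={unicodedata.category(ch)}")

# "e" followed directly by COMBINING ACUTE ACCENT composes to a single code point.
plain = unicodedata.normalize("NFC", "e\u0301")

# With ZERO WIDTH SPACE in between, the accent no longer composes with the "e".
split = unicodedata.normalize("NFC", "e\u200B\u0301")

print(len(plain), len(split))
```

All three candidates report general category Cf (format), and the two normalized strings have different lengths, so none of them is a true "does nothing at all" character.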
On Fri, Jun 21, 2019 at 9:48 PM Sławomir Osipiuk via Unicode <
unicode@unicode.org> wrote:
> Does Unicode include a character that does nothing at
I'm curious what you'd use it for?
From: Unicode On Behalf Of Slawomir Osipiuk via
Unicode
Sent: Friday, June 21, 2019 5:14 PM
To: unicode@unicode.org
Subject: Unicode "no-op" Character?
Does Unicode include a character that does nothing at all? I'm talking about
something that can be used for