Re: [Pharo-users] STON encoding of slashes
> On 18 Jan 2017, at 16:38, Sven Van Caekenberghe wrote:
>
> So my conclusion would be (while writing), always escape $\ and not $/, in
> pure STON mode (the default), escape $' and not $", in JSON mode, escape $"
> and not $'.

I implemented these changes in writing behaviour:

===
Name: STON-Core-SvenVanCaekenberghe.81
Author: SvenVanCaekenberghe
Time: 31 January 2017, 11:43:15.520367 pm
UUID: d83172d8-f01e-4e63-9382-515399ffa7bc
Ancestors: STON-Core-SvenVanCaekenberghe.80

Change the encoding of characters while writing so that in default STON mode only the following named character escapes are used: \b \t \n \f \' and \\ while in JSON mode \' is replaced by \" - this means that / is normally not escaped.

Add STONWriter>>#escape:with: as API

Adjust 2 unit tests to reflect this change

Update time tag of STONWriter class>>#initialize
===
Name: STON-Tests-SvenVanCaekenberghe.71
Author: SvenVanCaekenberghe
Time: 31 January 2017, 11:43:43.88292 pm
UUID: 65045513-6f48-43b7-a112-89dabf34a8f8
Ancestors: STON-Tests-SvenVanCaekenberghe.70

Change the encoding of characters while writing so that in default STON mode only the following named character escapes are used: \b \t \n \f \' and \\ while in JSON mode \' is replaced by \" - this means that / is normally not escaped.

Add STONWriter>>#escape:with: as API

Adjust 2 unit tests to reflect this change

Update time tag of STONWriter class>>#initialize
===

Sven
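For readers outside Pharo, the writing rules in the commit message can be sketched in Python. This is only an illustration, not STON's implementation: `escape_string`, `json_mode` and `NAMED_ESCAPES` are made-up names. The table follows the commit's list as written, so a carriage return falls through to the \u form here; whether STON itself does that is not something this message settles. The \u fallback for other control characters is taken from the rule proposed earlier in the thread.

```python
# Hypothetical sketch of the writing rules from the commit message.
# Not STON's actual code; every name here is invented for illustration.
NAMED_ESCAPES = {
    "\b": "\\b",
    "\t": "\\t",
    "\n": "\\n",
    "\f": "\\f",
    "\\": "\\\\",
}


def escape_string(text, json_mode=False):
    """Return text as a quoted literal: '...' in STON mode, "..." in JSON mode."""
    quote = '"' if json_mode else "'"
    out = [quote]
    for ch in text:
        if ch == quote:
            # Only the active string delimiter is escaped.
            out.append("\\" + quote)
        elif ch in NAMED_ESCAPES:
            out.append(NAMED_ESCAPES[ch])
        elif ord(ch) < 32:
            # Assumption: control characters without a listed name use \u.
            out.append("\\u%04X" % ord(ch))
        else:
            # Everything else, including "/", is written as-is.
            out.append(ch)
    out.append(quote)
    return "".join(out)
```

Calling `escape_string("foo/bar")` leaves the slash alone in both modes; only the surrounding delimiter and the escaped quote character change with `json_mode`.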
Re: [Pharo-users] STON encoding of slashes
On Wed, Jan 18, 2017 at 04:38:15PM +0100, Sven Van Caekenberghe wrote:
> Being a superset means that you get a simple JSON parser (and even limited
> writer) for free once you install STON (or once it is part of the Pharo
> image, as it is now). It also means that we can fall back to the JSON spec as
> a guide in discussion like this one. It also helps people understand what
> STON is, by analogy but also differences with JSON.
>
> So my conclusion would be (while writing), always escape $\ and not $/, in
> pure STON mode (the default), escape $' and not $", in JSON mode, escape $"
> and not $'. Those would be the changes. Agreed ?

Yes for JSON, I guess yes for STON.

(For me STON is effectively an opaque representation (because of its DFS strategy), so I don't care too much about how it is stored (as long as it is not binary).)

Thanks!
Peter
Re: [Pharo-users] STON encoding of slashes
> On 18 Jan 2017, at 16:20, Peter Uhnak wrote:
>
> On Wed, Jan 18, 2017 at 03:38:17PM +0100, Sven Van Caekenberghe wrote:
>> So talking only about the encoding/writing phase, the conclusion would be
>
> 10. Generators
>
>    A JSON generator produces JSON text. The resulting text MUST
>    strictly conform to the JSON grammar.
>
> I guess the ABNF table is the best reference; but we want to generate the
> most compact form available (=that still conforms to the syntax).
>
>> - not to escape $/
>
> For convenience's sake compact is better
> * I guess the speciality of \/ is somehow related to JavaScript strings
> interpreting both '/' and '\/' into '/'?
>
>> - escape everything with code points [0,31], using named escapes if they
>> exist, else \u
>
> yes, the named ones have Pharo equivalent too
>
>     s := STON fromString: '"\b\f\n\r\t"'.
>     s asArray "{Character backspace. Character newPage. Character lf. Character cr. Character tab}"
>
>> - escape $\ itself
>
> yes
>
>> That leaves the question about $' and $".
>>
>> $' is used in STON as string delimiter, so it has to be escaped.
>>
>> - escape $'
>
> We already had discussion about $' and JSON
> http://forum.world.st/STON-doesn-t-produce-valid-JSON-it-shouldn-t-escape-quation-mark-td4923777.html

Yes, I know, that change was OK.

> It must not be escaped in JSON, but that raises question about STON being
> superset of JSON; is such thing achievable given this disparity?
>
>> Right now, $" is also escaped. Should that remain the case, or only in JSON
>> compatibility mode (where $" is used as string delimiter) ?
>>
>> - do not escape $"
>>
>> In JSON mode, escape $" and not $' then ?
>
> That's how it should be in JSON.
>
>> When parsing, all named and other escapes are always accepted, as they are
>> now.
>
> 9. Parsers
>
>    A JSON parser MUST accept all texts that conform to the JSON grammar.
>    A JSON parser MAY accept non-JSON forms or extensions.
>
> My question about STON and JSON: What is the benefit of STON being superset
> of JSON? To me it feels like an arbitrary restriction for STON; would STON
> benefit from dropping this requirement and instead only worry about good
> smalltalk object representation? (And leave JSON to NeoJSON or something.)

Being a superset means that you get a simple JSON parser (and even limited writer) for free once you install STON (or once it is part of the Pharo image, as it is now). It also means that we can fall back to the JSON spec as a guide in discussion like this one. It also helps people understand what STON is, by analogy but also differences with JSON.

So my conclusion would be (while writing), always escape $\ and not $/, in pure STON mode (the default), escape $' and not $", in JSON mode, escape $" and not $'.

Those would be the changes. Agreed ?

> Peter
Re: [Pharo-users] STON encoding of slashes
On Wed, Jan 18, 2017 at 03:38:17PM +0100, Sven Van Caekenberghe wrote:
> So talking only about the encoding/writing phase, the conclusion would be

10. Generators

   A JSON generator produces JSON text. The resulting text MUST
   strictly conform to the JSON grammar.

I guess the ABNF table is the best reference; but we want to generate the most compact form available (=that still conforms to the syntax).

> - not to escape $/

For convenience's sake compact is better
* I guess the speciality of \/ is somehow related to JavaScript strings interpreting both '/' and '\/' into '/'?

> - escape everything with code points [0,31], using named escapes if they
> exist, else \u

yes, the named ones have Pharo equivalent too

    s := STON fromString: '"\b\f\n\r\t"'.
    s asArray "{Character backspace. Character newPage. Character lf. Character cr. Character tab}"

> - escape $\ itself

yes

> That leaves the question about $' and $".
>
> $' is used in STON as string delimiter, so it has to be escaped.
>
> - escape $'

We already had discussion about $' and JSON
http://forum.world.st/STON-doesn-t-produce-valid-JSON-it-shouldn-t-escape-quation-mark-td4923777.html

It must not be escaped in JSON, but that raises question about STON being superset of JSON; is such thing achievable given this disparity?

> Right now, $" is also escaped. Should that remain the case, or only in JSON
> compatibility mode (where $" is used as string delimiter) ?
>
> - do not escape $"
>
> In JSON mode, escape $" and not $' then ?

That's how it should be in JSON.

> When parsing, all named and other escapes are always accepted, as they are
> now.

9. Parsers

   A JSON parser MUST accept all texts that conform to the JSON grammar.
   A JSON parser MAY accept non-JSON forms or extensions.

My question about STON and JSON: What is the benefit of STON being superset of JSON? To me it feels like an arbitrary restriction for STON; would STON benefit from dropping this requirement and instead only worry about good smalltalk object representation? (And leave JSON to NeoJSON or something.)

Peter
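The named escapes and the \u fallback discussed in this message can be checked against any JSON library. The lines below use Python's standard `json` module as a stand-in, since it follows the same RFC; nothing here says how STON behaves, and the variable names are only for illustration.

```python
import json

# The five named escapes from the Pharo snippet, decoded by a JSON parser.
decoded = json.loads('"\\b\\f\\n\\r\\t"')
# Code points for backspace, form feed, line feed, carriage return, tab.
code_points = [ord(ch) for ch in decoded]

# Writing a control character that has a named escape uses the name.
named = json.dumps("\t")

# Writing a control character without a named escape falls back to \u.
unnamed = json.dumps("\x01")
```

Here `code_points` comes out as 8, 12, 10, 13 and 9, which lines up with the `Character backspace ... Character tab` list in the Pharo comment.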
Re: [Pharo-users] STON encoding of slashes
So talking only about the encoding/writing phase, the conclusion would be

- not to escape $/
- escape everything with code points [0,31], using named escapes if they exist, else \u
- escape $\ itself

That leaves the question about $' and $".

$' is used in STON as string delimiter, so it has to be escaped.

- escape $'

Right now, $" is also escaped. Should that remain the case, or only in JSON compatibility mode (where $" is used as string delimiter) ?

- do not escape $"

In JSON mode, escape $" and not $' then ?

When parsing, all named and other escapes are always accepted, as they are now.
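The reading side of this proposal ("all named and other escapes are always accepted") can be sketched as a small decoder. This is a hypothetical Python illustration, not STON's parser: `unescape` and `NAMED` are invented names, the input is the inside of a string literal without its delimiters, and surrogate pairs are deliberately not handled.

```python
# Hypothetical reader-side sketch; not STON's parser, names are invented.
NAMED = {
    "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t",
    "/": "/", "\\": "\\", "'": "'", '"': '"',
}
HEX_DIGITS = "0123456789abcdefABCDEF"


def unescape(body):
    """Decode the inside of a string literal, accepting every escape in any mode."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        code = body[i + 1]
        if code == "u":
            # \u must be followed by exactly four hexadecimal digits.
            digits = body[i + 2:i + 6]
            if len(digits) != 4 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("malformed \\u escape")
            out.append(chr(int(digits, 16)))
            i += 6
        elif code in NAMED:
            out.append(NAMED[code])
            i += 2
        else:
            raise ValueError("unknown escape: \\" + code)
    return "".join(out)
```

Because the reader accepts `\/`, `\'` and `\"` regardless of mode, text written before a change in the writer's escaping stays readable afterwards.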
Re: [Pharo-users] STON encoding of slashes
On Wed, Jan 18, 2017 at 11:11:06AM +0100, Christophe Demarey wrote:
>
> > On 18 Jan 2017, at 09:51, Sven Van Caekenberghe wrote:
> >
> > Hi Christophe,
> >
> >> STON toString: 'g...@github.com:foo/bar.git' =>
> >> ''g...@github.com:foo\/bar.git''
> >> It used to be ''g...@github.com:foo/bar.git''.
>
> > In other words, it was an implementation error (omission). Note that JSON
> > also has this escape.

Yes and no for JSON.

Only " and \ have to be escaped. Escaping anything else will give it special meaning with the exception of / which will just produce the same thing, because it is a special snowflake. :)

quoted from https://tools.ietf.org/html/rfc7159#section-7:

- Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
- Alternatively, there are two-character sequence escape representations of some popular characters.

"/" (U+002F) doesn't fall into control character range, but the alternative section does permit escaping it.

In other words, JSON strings "\/", and "/", and "\u002f" are equivalent.

But JSON itself doesn't require you to escape "/" (just like you are not required to escape "hi" into "\u0068\u0069", although you can).

Note that other systems do not escape / by default:

(Pharo)
    NeoJSONWriter toString: 'g...@github.com:foo/bar.git' => "g...@github.com:foo/bar.git"

(JavaScript)
    JSON.stringify('g...@github.com:foo/bar.git') => "g...@github.com:foo/bar.git"

(Ruby)
    require 'json'
    puts 'g...@github.com:foo/bar.git'.to_json => "g...@github.com:foo/bar.git"

Peter
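The equivalence claimed in this message is easy to confirm with one more JSON library. The snippet uses Python's standard `json` module purely as an additional data point next to the Pharo, JavaScript and Ruby lines; the path `foo/bar.git` is a shortened stand-in for the address in the thread.

```python
import json

# Three JSON texts that spell the same one-character string.
spellings = ['"/"', '"\\/"', '"\\u002f"']
values = [json.loads(text) for text in spellings]

# Same point as the "hi" remark: \u escapes are permitted, never required.
hi = json.loads('"\\u0068\\u0069"')

# Python's generator also leaves "/" unescaped when writing.
written = json.dumps("foo/bar.git")
```

All three entries of `values` are the single character `/`, and `written` keeps the slash as-is between double quotes.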
[Pharo-users] STON encoding of slashes
Hi,

I just noticed that STON encoding of forward slashes changed.

    STON toString: 'g...@github.com:foo/bar.git' => ''g...@github.com:foo\/bar.git''

It used to be ''g...@github.com:foo/bar.git''.

Is it on purpose?

Thanks,
Christophe