Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
Peter Bex <peter@xs4all.nl> writes:

>> The attached patch supplements Peter's to handle these cases.
>
> I've signed this off, and added a few test cases for this as well.
> Find this version attached.

Tested, signed off and pushed. Thanks!

Moritz
___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers
Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
On Sun, May 26, 2013 at 09:30:39PM -0700, Evan Hanson wrote:
> On 2013/05/26 11:09P, Peter Bex wrote:
> > Here are two patches for adding R7RS named character and string
> > escape sequences.
>
> I don't believe the second patch is sufficient for adding intraline
> whitespace escapes; it only handles LF-terminated lines, while R7RS
> specifies that CR, CRLF and LF line endings should all be collapsed.

Ah, very good. I indeed overlooked that one. Thanks!

> The attached patch supplements Peter's to handle these cases.

I've signed this off, and added a few test cases for this as well.
Find this version attached.

Cheers,
Peter
-- 
http://www.more-magic.net

From ffae85a6998808b35392ecdb4d061e4885d39b12 Mon Sep 17 00:00:00 2001
From: Evan Hanson <ev...@foldling.org>
Date: Mon, 27 May 2013 16:18:53 +1200
Subject: [PATCH] handle CR CRLF-terminated lines when collapsing intraline whitespace

Signed-off-by: Peter Bex <peter@xs4all.nl>
---
 library.scm          | 9 ++++++++-
 tests/r7rs-tests.scm | 6 ++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/library.scm b/library.scm
index 68165d9..d8264da 100644
--- a/library.scm
+++ b/library.scm
@@ -2521,12 +2521,19 @@ EOF
                      (loop (##sys#read-char-0 port) (r-cons-codepoint n lst)) )))
                 ((#\\ #\' #\" #\|)
                  (loop (##sys#read-char-0 port) (cons c lst)))
-                ((#\newline #\space #\tab)
+                ((#\newline #\return #\space #\tab)
                  ;; Read escaped intraline ws* nl intraline ws*
                  (let eat-ws ((c c) (nl? #f))
                    (case c
                      ((#\space #\tab)
                       (eat-ws (##sys#read-char-0 port) nl?))
+                     ((#\return)
+                      (if nl?
+                          (loop c lst)
+                          (let ((nc (##sys#read-char-0 port)))
+                            (if (eq? nc #\newline) ; collapse \r\n
+                                (eat-ws (##sys#read-char-0 port) #t)
+                                (eat-ws nc #t)))))
                      ((#\newline)
                       (if nl?
                           (loop c lst)
diff --git a/tests/r7rs-tests.scm b/tests/r7rs-tests.scm
index c0f6ebd..89f2e7d 100644
--- a/tests/r7rs-tests.scm
+++ b/tests/r7rs-tests.scm
@@ -111,11 +111,17 @@
 ;; *ONE* line ending following a backslash escape, along with any
 ;; preceding or trailing intraline whitespace is collapsed and ignored.
 (test #\E escaped-char (string-append (string #\newline)END))
+;; This also works with CR instead of LF...
+(test #\E escaped-char (string-append (string #\return)END))
+;; And CRLF, too
+(test #\E escaped-char (string-append (string #\return) (string #\newline) END))
 (test #\E escaped-char (string-append (string #\newline) END))
 (test #\E escaped-char (string-append (string #\newline) END))
 (test #\E escaped-char (string-append (string #\newline)END))
 ;; But not more than one!
 (test #\newline escaped-char (string-append (string #\newline) (string #\newline) END))
+;; CR and LF both counted
+(test #\newline escaped-char (string-append (string #\return) (string #\newline) END))
 ;; Tabs count as intraline whitespace too
 (test #\E escaped-char (string-append (string #\tab) (string #\newline) (string #\tab)END))
 ;; Edge case
-- 
1.7.12
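For readers who do not want to trace the named-let in library.scm, here is a rough Python model of the rule this patch is after: a backslash, optional spaces/tabs, exactly one line ending (LF, CR, or CRLF counted as one), then optional spaces/tabs all disappear. The function name is made up for illustration, this is not CHICKEN's reader, and any other escape is simply passed through untouched (the real reader decodes those, and warns on a backslash followed by whitespace with no newline).

```python
def collapse_line_continuations(body: str) -> str:
    """Drop each `\\ ws* line-ending ws*` run from a string-literal body.

    Hypothetical helper: every other escape is passed through untouched.
    """
    out = []
    i, n = 0, len(body)
    while i < n:
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        j = i + 1
        while j < n and body[j] in " \t":      # leading intraline whitespace
            j += 1
        if j < n and body[j] in "\r\n":
            # Consume exactly one line ending; CR LF counts as a single one.
            j += 2 if body[j:j + 2] == "\r\n" else 1
            while j < n and body[j] in " \t":  # trailing intraline whitespace
                j += 1
            i = j
        else:
            # Not a line continuation: keep the backslash and the escaped char.
            out.append(body[i:i + 2])
            i += 2
    return "".join(out)
```

A second line ending after the collapsed one is left in the result, which is the "but not more than one" case the tests in the patch check for.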
Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
On Mon, May 27, 2013 at 6:09 AM, Peter Bex <peter@xs4all.nl> wrote:

> For the backslash escapes inside strings, we already had everything
> that R7RS dictates, except for two things: hex escapes and
> \<intraline whitespace>*<newline><intraline whitespace>* syntax.
> I think the former is useful, but it's rather tricky so I want to take
> my time implementing it. The latter is silly and rather pointless,
> but I've implemented it for completeness.
>
> Note that Chibi's behavior seems to be incorrect with respect to the
> spec. The string: \ x should be read like \n\nx, I think. Chibi drops
> all the whitespace and newlines, instead of just the whitespace
> surrounding the *first* newline. I've made Chicken follow what I
> think is what the spec says.

Yes, the spec is correct here, thanks. I'll fix Chibi and add an R7RS
test case.

[Agreed about it being silly and pointless.]

> There's also escaped strings without newlines like \foo.
> It's unspecified what an implementation should do with that, like
> any other escaped character (like, for example, \p).

This is intentionally unspecified - you can do what you like here.
A pedantic impl would signal an error.

-- 
Alex
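To make the difference between the two readings concrete, here is a small Python sketch using two regular expressions. FIRST_ONLY follows the reading agreed on in this thread (only the whitespace around the first line ending goes away); GREEDY models the behaviour Peter describes for Chibi at the time, and is not taken from Chibi's source. Both names and the sample input are made up for illustration.

```python
import re

# \ ws* <one line ending> ws*  -- the reading agreed on in this thread
FIRST_ONLY = re.compile(r"\\[ \t]*(?:\r\n|\r|\n)[ \t]*")

# \ ws* <line ending> then any further whitespace, newlines included
# (an assumption modelling the behaviour described for Chibi)
GREEDY = re.compile(r"\\[ \t]*[\r\n][ \t\r\n]*")

# backslash, space, then three newlines, then x
sample = "\\ \n\n\nx"

first_only = FIRST_ONLY.sub("", sample)
greedy = GREEDY.sub("", sample)

print(repr(first_only))  # two newlines survive in front of x
print(repr(greedy))      # everything up to x is swallowed
```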
Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
Alex Shinn scripsit:

> > It's unspecified what an implementation should do with that, like
> > any other escaped character (like, for example, \p).
>
> This is intentionally unspecified - you can do what you like here.
> A pedantic impl would signal an error.

Indeed. If things like \p were specified (to mean p, for example), we
wouldn't be able to add a meaning to them in R8RS.

-- 
John Cowan    co...@ccil.org    http://ccil.org/~cowan
The present impossibility of giving a scientific explanation is no proof
that there is no scientific explanation. The unexplained is not to be
identified with the unexplainable, and the strange and extraordinary
nature of a fact is not a justification for attributing it to powers
above nature. --The Catholic Encyclopedia, s.v. telepathy (1913)
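The two policies mentioned here (lenient with a warning, or pedantic with an error) can be sketched in a few lines of Python. The table of known escapes is deliberately incomplete and the function name and `strict` flag are invented for this example; nothing here is an API of CHICKEN or Chibi.

```python
import warnings

# A deliberately small table of escapes with a defined meaning.
KNOWN_ESCAPES = {"n": "\n", "t": "\t", "a": "\a", "\\": "\\", '"': '"'}

def decode_escape(ch: str, strict: bool = False) -> str:
    """Return the character denoted by backslash + ch.

    For an escape with no defined meaning, either raise (the "pedantic"
    choice) or warn and fall back to the character itself.
    """
    if ch in KNOWN_ESCAPES:
        return KNOWN_ESCAPES[ch]
    message = f"undefined escape sequence \\{ch}"
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    return ch
```

Either way the undefined sequence stays free for a later standard to give it a meaning, since no program can portably rely on what it does today.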
[Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
Hi all,

Here are two patches for adding R7RS named character and string
escape sequences. Turns out the r7rs-tasks wiki page is incorrect;
we already had \a and #\alarm support. The only two standard named
characters that needed to be added were #\null (where we had #\nul)
and #\escape (where we had #\esc), and the strings only need hex
escapes and escaped indentation support.

The new character names are now the preferred way to print #\x00 and
#\x1b for WRITE and friends. This means that CHICKEN's s-expression
output can be READ by R7RS Schemes. The downside is that older
CHICKENs can't read s-expressions output by newer ones. IMHO this does
not require a Change Request because it does not strictly break
backwards compatibility: newer chickens can still read s-expressions
output by older chickens, and any program will continue working as it
has before.

For the backslash escapes inside strings, we already had everything
that R7RS dictates, except for two things: hex escapes and
\<intraline whitespace>*<newline><intraline whitespace>* syntax.
I think the former is useful, but it's rather tricky so I want to take
my time implementing it. The latter is silly and rather pointless,
but I've implemented it for completeness.

Note that Chibi's behavior seems to be incorrect with respect to the
spec. The string: \ x should be read like \n\nx, I think. Chibi drops
all the whitespace and newlines, instead of just the whitespace
surrounding the *first* newline. I've made Chicken follow what I
think is what the spec says.

There's also escaped strings without newlines like \foo.
It's unspecified what an implementation should do with that, like
any other escaped character (like, for example, \p). Chibi also
simply collapses the whitespace away, and I've decided to follow this
behaviour, but to make our handling of this more consistent with how
we handle other undefined escape sequences, it also shows a
read-warning. I did not add tests for this because it's unspecified
and nobody should be relying on this.

Cheers,
Peter
-- 
http://www.more-magic.net

From fe97ba69a041c34c2d98a24baca03617f26c1df4 Mon Sep 17 00:00:00 2001
From: Peter Bex <peter@xs4all.nl>
Date: Sun, 26 May 2013 19:38:17 +0200
Subject: [PATCH 1/2] Added #\null and #\escape character literal names, for
 R7RS compatibility.

WRITE now uses this instead of the old #\nul and #\esc, so other impls
can READ s-expression written by CHICKEN. Unfortunately, the flip
side is that older CHICKENs can't READ data written by newer CHICKENs.
---
 NEWS                 |  2 ++
 library.scm          |  4 ++++
 tests/r7rs-tests.scm | 23 +++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/NEWS b/NEWS
index 07a8a5a..f2d28fc 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,8 @@
     and #!rest in type-declarations (suggested by Joerg Wittenberger).
   - Vectors, SRFI-4 number vectors and blobs are now self-evaluating for
     R7RS compatibility. Being literal constants, they are implicitly quoted.
+  - For R7RS compatibility, named character literals #\escape and #\null are
+    supported as aliases for #\esc and #\nul. WRITE will output R7RS names.
 
 - Compiler
   - the inline declaration does not force inlining anymore as recursive
diff --git a/library.scm b/library.scm
index f42ddd7..cd33b09 100644
--- a/library.scm
+++ b/library.scm
@@ -1527,6 +1527,8 @@ EOF
            (and-let* ([a (assq x names-to-chars)])
              (##sys#slot a 1) ) ] ) ) ) ) )
 
+;; TODO: Use the character names here in the next release? Or just
+;; use the numbers everywhere, for clarity?
 (char-name 'space #\space)
 (char-name 'tab #\tab)
 (char-name 'linefeed #\linefeed)
@@ -1534,8 +1536,10 @@ EOF
 (char-name 'vtab (integer-char 11))
 (char-name 'delete (integer-char 127))
 (char-name 'esc (integer-char 27))
+(char-name 'escape (integer-char 27))
 (char-name 'alarm (integer-char 7))
 (char-name 'nul (integer-char 0))
+(char-name 'null (integer-char 0))
 (char-name 'return #\return)
 (char-name 'page (integer-char 12))
 (char-name 'backspace (integer-char 8))
diff --git a/tests/r7rs-tests.scm b/tests/r7rs-tests.scm
index bc21294..368f9f4 100644
--- a/tests/r7rs-tests.scm
+++ b/tests/r7rs-tests.scm
@@ -71,6 +71,29 @@
+(SECTION 6 6)
+
+
+(define (integer-named-char x)
+  (with-output-to-string (lambda () (write (integer-char x)))))
+
+(test #\\alarm integer-named-char #x07)
+(test #\\backspace integer-named-char #x08)
+(test #\\delete integer-named-char #x7f)
+(test #\\escape integer-named-char #x1b)
+(test #\\newline integer-named-char #x0a)
+(test #\\null integer-named-char #x00)
+(test #\\return integer-named-char #x0d)
+(test #\\space integer-named-char #x20)
+(test #\\tab integer-named-char #x09)
+
+
+
+;; NOT YET (is ambiguous with existing \xNN syntax in Chicken)
+#;(test #\tab escaped-char x9;)
+#;(test #\tab escaped-char x09;)
+
+
 (SECTION 6 8)
 
 ;; Symbols are implicitly quoted inside self-evaluating vectors.
-- 
1.8.2.3

From
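The compatibility trade-off in the first patch (new readers accept both names, old readers choke on the new output) can be modelled with a toy name table in Python. The class below is hypothetical and only imitates the effect of the char-name registrations; in particular, "the most recently registered name is the one WRITE prints" is an assumption that happens to match the tests in the patch, not something taken from library.scm.

```python
class CharNames:
    """Toy table of character names, loosely modelled on char-name."""

    def __init__(self):
        self.by_name = {}   # name -> code point, used when reading
        self.by_code = {}   # code point -> name, used when writing

    def register(self, name, code):
        self.by_name[name] = code
        # Assumption: a later registration replaces the printed name.
        self.by_code[code] = name

    def write(self, code):
        return "#\\" + self.by_code.get(code, chr(code))

    def read(self, literal):
        body = literal[2:]  # strip the leading #\
        if body in self.by_name:
            return self.by_name[body]
        if len(body) == 1:
            return ord(body)
        raise ValueError("unknown character name: " + body)


old = CharNames()                      # before the patch
for name, code in [("esc", 27), ("nul", 0)]:
    old.register(name, code)

new = CharNames()                      # after the patch
for name, code in [("esc", 27), ("escape", 27), ("nul", 0), ("null", 0)]:
    new.register(name, code)
```

With these tables, new.read(old.write(27)) succeeds, while old.read(new.write(27)) raises, which is the one-directional break described in the commit message.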
Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)
On 2013/05/26 11:09P, Peter Bex wrote:
> Here are two patches for adding R7RS named character and string
> escape sequences.

I don't believe the second patch is sufficient for adding intraline
whitespace escapes; it only handles LF-terminated lines, while R7RS
specifies that CR, CRLF and LF line endings should all be collapsed.

The attached patch supplements Peter's to handle these cases.

Evan

From f5b06523300b5b8880460d7cb3f935c300d85074 Mon Sep 17 00:00:00 2001
From: Evan Hanson <ev...@foldling.org>
Date: Mon, 27 May 2013 16:18:53 +1200
Subject: [PATCH] handle CR CRLF-terminated lines when collapsing intraline whitespace

---
 library.scm | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/library.scm b/library.scm
index 68165d9..d8264da 100644
--- a/library.scm
+++ b/library.scm
@@ -2521,12 +2521,19 @@ EOF
                      (loop (##sys#read-char-0 port) (r-cons-codepoint n lst)) )))
                 ((#\\ #\' #\" #\|)
                  (loop (##sys#read-char-0 port) (cons c lst)))
-                ((#\newline #\space #\tab)
+                ((#\newline #\return #\space #\tab)
                  ;; Read escaped intraline ws* nl intraline ws*
                  (let eat-ws ((c c) (nl? #f))
                    (case c
                      ((#\space #\tab)
                       (eat-ws (##sys#read-char-0 port) nl?))
+                     ((#\return)
+                      (if nl?
+                          (loop c lst)
+                          (let ((nc (##sys#read-char-0 port)))
+                            (if (eq? nc #\newline) ; collapse \r\n
+                                (eat-ws (##sys#read-char-0 port) #t)
+                                (eat-ws nc #t)))))
                      ((#\newline)
                       (if nl?
                           (loop c lst)
-- 
1.7.10.4