Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-30 Thread Moritz Heidkamp
Peter Bex peter@xs4all.nl writes:
> The attached patch supplements Peter's to handle these cases.
>
> I've signed this off, and added a few test cases for this as well.
> Find this version attached.

Tested, signed off and pushed.

Thanks!
Moritz

___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-27 Thread Peter Bex
On Sun, May 26, 2013 at 09:30:39PM -0700, Evan Hanson wrote:
> On 2013/05/26 11:09P, Peter Bex wrote:
> > Here are two patches for adding R7RS named characters and string escape
> > sequences.
>
> I don't believe the second patch is sufficient for adding intraline
> whitespace escapes; it only handles LF-terminated lines, while R7RS
> specifies that CR, CRLF and LF line endings should all be collapsed.

Ah, very good.  I indeed overlooked that one.  Thanks!

> The attached patch supplements Peter's to handle these cases.

I've signed this off, and added a few test cases for this as well.
Find this version attached.

Cheers,
Peter
-- 
http://www.more-magic.net
From ffae85a6998808b35392ecdb4d061e4885d39b12 Mon Sep 17 00:00:00 2001
From: Evan Hanson ev...@foldling.org
Date: Mon, 27 May 2013 16:18:53 +1200
Subject: [PATCH] handle CR & CRLF-terminated lines when collapsing intraline
 whitespace

Signed-off-by: Peter Bex peter@xs4all.nl
---
 library.scm          | 9 ++++++++-
 tests/r7rs-tests.scm | 6 ++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/library.scm b/library.scm
index 68165d9..d8264da 100644
--- a/library.scm
+++ b/library.scm
@@ -2521,12 +2521,19 @@ EOF
                  (loop (##sys#read-char-0 port) (r-cons-codepoint n lst)) )))
             ((#\\ #\' #\" #\|)
              (loop (##sys#read-char-0 port) (cons c lst)))
-            ((#\newline #\space #\tab)
+            ((#\newline #\return #\space #\tab)
              ;; Read "escaped" <intraline ws>* <nl> <intraline ws>*
              (let eat-ws ((c c) (nl? #f))
                (case c
                  ((#\space #\tab)
                   (eat-ws (##sys#read-char-0 port) nl?))
+                 ((#\return)
+                  (if nl?
+                      (loop c lst)
+                      (let ((nc (##sys#read-char-0 port)))
+                        (if (eq? nc #\newline) ; collapse \r\n
+                            (eat-ws (##sys#read-char-0 port) #t)
+                            (eat-ws nc #t)))))
                  ((#\newline)
                   (if nl?
                       (loop c lst)
diff --git a/tests/r7rs-tests.scm b/tests/r7rs-tests.scm
index c0f6ebd..89f2e7d 100644
--- a/tests/r7rs-tests.scm
+++ b/tests/r7rs-tests.scm
@@ -111,11 +111,17 @@
 ;; *ONE* line ending following a backslash escape, along with any
 ;; preceding or trailing intraline whitespace is collapsed and ignored.
 (test #\E escaped-char (string-append (string #\newline)END))
+;; This also works with CR instead of LF...
+(test #\E escaped-char (string-append (string #\return)END))
+;; And CRLF, too
+(test #\E escaped-char (string-append (string #\return) (string #\newline)
END))
 (test #\E escaped-char (string-append  (string #\newline) END))
 (test #\E escaped-char (string-append  (string #\newline) END))
 (test #\E escaped-char (string-append   (string #\newline)END))
 ;; But not more than one!
 (test #\newline escaped-char (string-append   (string #\newline)  
(string #\newline)  END))
+;; CR and LF both counted
+(test #\newline escaped-char (string-append   (string #\return)  
(string #\newline)  END))
 ;; Tabs count as intraline whitespace too
 (test #\E escaped-char (string-append (string #\tab) (string #\newline) 
(string #\tab)END))
 ;; Edge case
-- 
1.7.12



Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-27 Thread Alex Shinn
On Mon, May 27, 2013 at 6:09 AM, Peter Bex peter@xs4all.nl wrote:


> For the backslash escapes inside strings, we already had everything
> that R7RS dictates, except for two things: hex escapes and
> \<intraline whitespace>*<newline><intraline whitespace>* syntax.
> I think the former is useful, but it's rather tricky so I want to
> take my time implementing it.  The latter is silly and rather pointless,
> but I've implemented it for completeness.
>
> Note that Chibi's behavior seems to be incorrect wrt the spec.
> The string: "\
>
>
> x" should be read like "\n\nx", I think.  Chibi drops all the
> whitespace and newlines, instead of just the whitespace surrounding
> the *first* newline.  I've made Chicken follow what I think is what
> the spec says.


Yes, the spec is correct here, thanks.  I'll fix Chibi and add an R7RS
test case.  [Agreed about it being silly and pointless.]

> There are also escaped strings without newlines, like "\   foo".
> It's unspecified what an implementation should do with that, like
> any other escaped character (like, for example, \p).


This is intentionally unspecified - you can do what you like here.
A pedantic impl would signal an error.

-- 
Alex


Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-27 Thread John Cowan
Alex Shinn scripsit:

> > It's unspecified what an implementation should do with that, like
> > any other escaped character (like, for example, \p).
>
> This is intentionally unspecified - you can do what you like here.
> A pedantic impl would signal an error.

Indeed.  If things like \p were specified (to mean p, for example),
we wouldn't be able to add a meaning to them in R8RS.

-- 
John Cowan          co...@ccil.org          http://ccil.org/~cowan
The present impossibility of giving a scientific explanation is no proof
that there is no scientific explanation. The unexplained is not to be
identified with the unexplainable, and the strange and extraordinary
nature of a fact is not a justification for attributing it to powers
above nature.  --The Catholic Encyclopedia, s.v. telepathy (1913)



[Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-26 Thread Peter Bex
Hi all,

Here are two patches for adding R7RS named characters and string escape
sequences.  Turns out the r7rs-tasks wiki page is incorrect; we already
had \a and #\alarm support.  The only two standard named characters
that needed to be added were #\null (where we had #\nul) and #\escape
(where we had #\esc), and the strings only need hex escapes and escaped
indentation support.

The new character names are now the preferred way to print #\x00 and
#\x1b for WRITE and friends.  This means that CHICKEN's s-expression
output can be READ by R7RS Schemes.  Only, older CHICKENs can't read
s-expressions output by newer ones.  IMHO this does not require a
Change Request because it does not strictly break backwards
compatibility: newer chickens can still read s-expressions output by
older chickens, and any program will continue working as it has before.
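
As a rough illustration of what the R7RS names denote (a sketch, not code
from the patch): the table below pairs each standard character name with
its code point and checks that the reader agrees.  It assumes a reader
that already accepts the R7RS names, including #\null and #\escape; the
names `named-chars` and `all-names-match?` are made up for this example.

```scheme
;; Illustrative table: R7RS character names paired with their code points.
;; Assumes the reader accepts the R7RS names (#\null, #\escape, ...).
(define named-chars
  (list (cons #\alarm #x07)
        (cons #\backspace #x08)
        (cons #\delete #x7f)
        (cons #\escape #x1b)
        (cons #\newline #x0a)
        (cons #\null #x00)
        (cons #\return #x0d)
        (cons #\space #x20)
        (cons #\tab #x09)))

;; #t when every named character has the code point listed next to it.
(define (all-names-match?)
  (let loop ((entries named-chars))
    (or (null? entries)
        (and (= (char->integer (caar entries)) (cdar entries))
             (loop (cdr entries))))))

(display (all-names-match?))
(newline)
```

A reader without the new names would reject this file at #\escape or
#\null, which is the incompatibility with older CHICKENs described above.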

For the backslash escapes inside strings, we already had everything
that R7RS dictates, except for two things: hex escapes and 
\<intraline whitespace>*<newline><intraline whitespace>* syntax.
I think the former is useful, but it's rather tricky so I want to
take my time implementing it.  The latter is silly and rather pointless,
but I've implemented it for completeness.

Note that Chibi's behavior seems to be incorrect wrt the spec.
The string: "\
   
 
x" should be read like "\n\nx", I think.  Chibi drops all the
whitespace and newlines, instead of just the whitespace surrounding
the *first* newline.  I've made Chicken follow what I think is what
the spec says.
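
A minimal sketch of that reading, written as a standalone model rather
than the reader code in the patch: the hypothetical helper
`skip-escaped-newline` takes the characters that follow the backslash and
drops intraline whitespace, one LF, and the whitespace after it.  Any
further newline is left in place.  Only LF is treated as a line ending in
this sketch.

```scheme
;; Hypothetical helpers for illustration; not part of CHICKEN.
;; CHARS is the list of characters following the backslash.
(define (skip-escaped-newline chars)
  ;; Drop leading spaces and tabs (intraline whitespace).
  (define (skip-ws cs)
    (if (and (pair? cs) (memv (car cs) '(#\space #\tab)))
        (skip-ws (cdr cs))
        cs))
  (let ((after-ws (skip-ws chars)))
    (if (and (pair? after-ws) (char=? (car after-ws) #\newline))
        (skip-ws (cdr after-ws))   ; drop the first newline and trailing ws
        after-ws)))                ; no newline: only the whitespace is gone

;; Convenience wrapper working on strings.
(define (rest-after-escape str)
  (list->string (skip-escaped-newline (string->list str))))

;; Two newlines after the backslash: only the first one is collapsed, so
;; the result is a newline followed by x.
(write (rest-after-escape (string #\space #\newline #\space #\newline #\x)))
(newline)
```

Dropping everything up to the x instead would correspond to the Chibi
behaviour described above.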

There are also escaped strings without newlines, like "\   foo".
It's unspecified what an implementation should do with that, like
any other escaped character (like, for example, \p).  Chibi also
simply collapses the whitespace away, and I've decided to follow
this behaviour, but to make our handling of this more consistent with
how we handle other undefined escape sequences, it also shows a
read-warning.  I did not add tests for this because it's unspecified
and nobody should be relying on this.

Cheers,
Peter
-- 
http://www.more-magic.net
From fe97ba69a041c34c2d98a24baca03617f26c1df4 Mon Sep 17 00:00:00 2001
From: Peter Bex peter@xs4all.nl
Date: Sun, 26 May 2013 19:38:17 +0200
Subject: [PATCH 1/2] Added #\null and #\escape character literal names, for
 R7RS compatibility.

WRITE now uses this instead of the old #\nul and #\esc, so other impls
can READ s-expressions written by CHICKEN.  Unfortunately, the flip side
is that older CHICKENs can't READ data written by newer CHICKENs.
---
 NEWS                 |  2 ++
 library.scm          |  4 ++++
 tests/r7rs-tests.scm | 23 +++++++++++++++++++++++
 3 files changed, 29 insertions(+)

diff --git a/NEWS b/NEWS
index 07a8a5a..f2d28fc 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,8 @@
 and #!rest in type-declarations (suggested by Joerg Wittenberger).
   - Vectors, SRFI-4 number vectors and blobs are now self-evaluating for
  R7RS compatibility.  Being literal constants, they are implicitly quoted.
+  - For R7RS compatibility, named character literals #\escape and #\null are
+ supported as aliases for #\esc and #\nul.  WRITE will output R7RS names.
 
 - Compiler
   - the inline declaration does not force inlining anymore as recursive
diff --git a/library.scm b/library.scm
index f42ddd7..cd33b09 100644
--- a/library.scm
+++ b/library.scm
@@ -1527,6 +1527,8 @@ EOF
   (and-let* ([a (assq x names-to-chars)])
 (##sys#slot a 1) ) ] ) ) ) ) )
 
+;; TODO: Use the character names here in the next release?  Or just
+;; use the numbers everywhere, for clarity?
 (char-name 'space #\space)
 (char-name 'tab #\tab)
 (char-name 'linefeed #\linefeed)
@@ -1534,8 +1536,10 @@ EOF
 (char-name 'vtab (integer-char 11))
 (char-name 'delete (integer-char 127))
 (char-name 'esc (integer-char 27))
+(char-name 'escape (integer-char 27))
 (char-name 'alarm (integer-char 7))
 (char-name 'nul (integer-char 0))
+(char-name 'null (integer-char 0))
 (char-name 'return #\return)
 (char-name 'page (integer-char 12))
 (char-name 'backspace (integer-char 8))
diff --git a/tests/r7rs-tests.scm b/tests/r7rs-tests.scm
index bc21294..368f9f4 100644
--- a/tests/r7rs-tests.scm
+++ b/tests/r7rs-tests.scm
@@ -71,6 +71,29 @@
 
 
 
+(SECTION 6 6)
+
+
+(define (integer-named-char x)
+  (with-output-to-string (lambda () (write (integer-char x)))))
+
+(test "#\\alarm" integer-named-char #x07)
+(test "#\\backspace" integer-named-char #x08)
+(test "#\\delete" integer-named-char #x7f)
+(test "#\\escape" integer-named-char #x1b)
+(test "#\\newline" integer-named-char #x0a)
+(test "#\\null" integer-named-char #x00)
+(test "#\\return" integer-named-char #x0d)
+(test "#\\space" integer-named-char #x20)
+(test "#\\tab" integer-named-char #x09)
+
+
+
+;; NOT YET (is ambiguous with existing \xNN syntax in Chicken)
+#;(test #\tab escaped-char "x9;")
+#;(test #\tab escaped-char "x09;")
+
+
 (SECTION 6 8)
 
 ;; Symbols are implicitly quoted inside self-evaluating vectors.
-- 
1.8.2.3

From 

Re: [Chicken-hackers] [PATCH] Add support for R7RS named characters and string escapes (except hex escapes)

2013-05-26 Thread Evan Hanson
On 2013/05/26 11:09P, Peter Bex wrote:
> Here are two patches for adding R7RS named characters and string escape
> sequences.

I don't believe the second patch is sufficient for adding intraline
whitespace escapes; it only handles LF-terminated lines, while R7RS
specifies that CR, CRLF and LF line endings should all be collapsed.

The attached patch supplements Peter's to handle these cases.
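
The line-ending rule can be sketched on its own, as a model rather than
the reader code in the patch: the hypothetical helper `skip-line-ending`
drops at most one line ending from the front of a character list and
treats CRLF as a single ending.

```scheme
;; Hypothetical helpers for illustration; not the code from the patch.
;; Drops at most one line ending (LF, CRLF or a lone CR) from the front
;; of the character list CS.
(define (skip-line-ending cs)
  (cond ((null? cs) cs)
        ((char=? (car cs) #\newline) (cdr cs))            ; LF
        ((char=? (car cs) #\return)
         (if (and (pair? (cdr cs)) (char=? (cadr cs) #\newline))
             (cddr cs)                                    ; CRLF counts once
             (cdr cs)))                                   ; lone CR
        (else cs)))

;; Convenience wrapper working on strings.
(define (after-line-ending str)
  (list->string (skip-line-ending (string->list str))))

;; All three endings leave just the E behind.
(write (list (after-line-ending (string #\newline #\E))
             (after-line-ending (string #\return #\E))
             (after-line-ending (string #\return #\newline #\E))))
(newline)
```

A second CR or LF after the first ending is not consumed, which matches
the rule that only one line ending is collapsed.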

Evan
From f5b06523300b5b8880460d7cb3f935c300d85074 Mon Sep 17 00:00:00 2001
From: Evan Hanson ev...@foldling.org
Date: Mon, 27 May 2013 16:18:53 +1200
Subject: [PATCH] handle CR & CRLF-terminated lines when collapsing intraline
 whitespace

---
 library.scm |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/library.scm b/library.scm
index 68165d9..d8264da 100644
--- a/library.scm
+++ b/library.scm
@@ -2521,12 +2521,19 @@ EOF
                  (loop (##sys#read-char-0 port) (r-cons-codepoint n lst)) )))
             ((#\\ #\' #\" #\|)
              (loop (##sys#read-char-0 port) (cons c lst)))
-            ((#\newline #\space #\tab)
+            ((#\newline #\return #\space #\tab)
              ;; Read "escaped" <intraline ws>* <nl> <intraline ws>*
              (let eat-ws ((c c) (nl? #f))
                (case c
                  ((#\space #\tab)
                   (eat-ws (##sys#read-char-0 port) nl?))
+                 ((#\return)
+                  (if nl?
+                      (loop c lst)
+                      (let ((nc (##sys#read-char-0 port)))
+                        (if (eq? nc #\newline) ; collapse \r\n
+                            (eat-ws (##sys#read-char-0 port) #t)
+                            (eat-ws nc #t)))))
                  ((#\newline)
                   (if nl?
                       (loop c lst)
-- 
1.7.10.4
