lloda pushed a commit to branch main
in repository guile.
commit 12754161d522ac1d876693f5c717c7b8d46f11a8
Author: Rob Browning <[email protected]>
AuthorDate: Thu Apr 10 13:52:02 2025 -0500
Add initial upstream code for srfi-207: String-notated bytevectors
https://srfi.schemers.org/srfi-207/srfi-207.html
Add the key files from the upstream implementation, excepting the
Guile-specific top-level module to integrate it all. Omit
parse.scm because we're going to provide an alternative.
These files are the unmodified upstream code from the upstream
repository:
https://github.com/scheme-requests-for-implementation/srfi-207.git
as of this commit (errata-2-2-gd646cfa):
commit d646cfa6171912398e9325862ba1e72d75f301f9
Merge: 692530f 21a5c30
Author: Arthur A. Gleckler <[email protected]>
Date: Fri Feb 21 19:28:07 2025 -0800
Merge pull request #16 from Zipheir/master
Fix incorrect example.
For now add the html documentation as-is; we'll convert it to texinfo
later.
* doc/ref/srfi-207.html: Add new file.
* module/srfi/srfi-207/upstream/base64.scm: Add new file.
* module/srfi/srfi-207/upstream/bytestrings-impl.scm: Add new file.
* test-suite/tests/srfi-207.test: Add new file.
---
doc/ref/srfi-207.html | 417 +++++++++++++++++
module/srfi/srfi-207/upstream/base64.scm | 182 ++++++++
module/srfi/srfi-207/upstream/bytestrings-impl.scm | 499 +++++++++++++++++++++
test-suite/tests/srfi-207.test | 444 ++++++++++++++++++
4 files changed, 1542 insertions(+)
diff --git a/doc/ref/srfi-207.html b/doc/ref/srfi-207.html
new file mode 100644
index 000000000..886abffe6
--- /dev/null
+++ b/doc/ref/srfi-207.html
@@ -0,0 +1,417 @@
+<!DOCTYPE html>
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>SRFI 207: String-notated bytevectors</title>
+ <link href="/favicon.png" rel="icon" sizes="192x192" type="image/png">
+ <link rel="stylesheet" href="https://srfi.schemers.org/srfi.css"
type="text/css">
+ <style>pre.example { margin-left: 2em; }</style>
+ <meta name="viewport" content="width=device-width, initial-scale=1"></head>
+ <body>
+ <h1><a href="https://srfi.schemers.org/"><img class="srfi-logo"
src="https://srfi.schemers.org/srfi-logo.svg" alt="SRFI logo" /></a>207:
String-notated bytevectors</h1>
+
+<p>by
+ Daphne Preston-Kendal (external notation),
+ John Cowan (procedure design),
+ Wolfgang Corcoran-Mathe (implementation)
+</p>
+
+<h2 id="status">Status</h2>
+
+<p>This SRFI is currently in <em>final</em> status. Here is <a
href="https://srfi.schemers.org/srfi-process.html">an explanation</a> of each
status that a SRFI can hold. To provide input on this SRFI, please send email
to <code><a
href="mailto:srfi+minus+207+at+srfi+dotschemers+dot+org">srfi-207@<span
class="antispam">nospam</span>srfi.schemers.org</a></code>. To subscribe to
the list, follow <a
href="https://srfi.schemers.org/srfi-list-subscribe.html">these
instructions</a>. You can [...]
+<ul>
+ <li>Received: 2020-08-15</li>
+ <li>Draft #1 published: 2020-08-15</li>
+ <li>Draft #2 published: 2020-08-17</li>
+ <li>Draft #3 published: 2020-09-09</li>
+ <li>Draft #4 published: 2020-10-05</li>
+ <li>Draft #5 published: 2020-10-12</li>
+ <li>Draft #6 published: 2020-10-15</li>
+ <li>Draft #7 published: 2020-10-24</li>
+ <li>Finalized: 2020-10-29</li>
+ <li>Revised to fix errata:
+ <ul>
+ <li>2021-03-10 (Fix <a href="#errata-1">description</a>
+ of <code>make-bytestring!</code>.)</li>
+ <li>2025-02-06 (Fix <a href="#errata-2">explanation</a> of how
+ to compare bytevectors for equality.)</li></ul></li>
+</ul>
+
+<h2 id="abstract">Abstract</h2>
+
+<p>To ease the human reading and writing of Scheme code involving
+binary data that for mnemonic reasons corresponds
+as a whole or in part to ASCII-coded text, a notation
+for bytevectors is defined which allows printable ASCII characters
+to be used literally without being converted to their corresponding
+integer forms. In addition, this SRFI provides a set of procedures
+known as the bytestring library
+for constructing a bytevector from a sequence of integers,
+characters, strings, and/or bytevectors, and for manipulating
+bytevectors as if they were strings as far as possible.
+
+<h2 id="rationale">Rationale</h2>
+
+<p>Binary file formats are usually not self-describing, and if they are,
+the descriptive portion is itself binary, which makes it hard for human beings
+to interpret. To assist with this problem, it is common to have a
+human-readable section at the beginning of the file, or in some cases
+at the beginning of each distinct section of the file.
+For historical reasons and to avoid text encoding complications, it is usual
+for this human-readable section to be expressed as ASCII text.</p>
+
+<p>For example, ZIP files begin with the hex bytes <code>50 4B</code>
+which are the ASCII encoding for the characters "PK", the initials
+of Phil Katz, the inventor of ZIP format. As another example,
+the GIF image format begins with <code>47 49 46 38 39 61</code>,
+the ASCII encoding for "GIF89a", where "89a" is the format version.
+A third example is the PNG image format, where the file header
+begins <code>89 50 4E 47</code>. The first byte is intentionally
+non-ASCII, but the next three are "PNG". Furthermore, a PNG
+file is divided into chunks, each of which contains a 4-byte
+"chunk type" code. The letters in the chunk type are mnemonics
+for its purpose, such as "PLTE" for a palette, "bKGD" for a
+default background color, and "iTXt" for descriptive text in UTF-8.</p>
+
+<p>When bytevectors contain string data of this kind, it is much more
tractable for
+human programmers to deal with them in the form <code>#u8"recursion"</code>
+than in the form <code>#u8(114 101 99 117 114 115 105 111 110)</code>.
+This is true even when non-ASCII bytes are incorporated
+into the bytevector: the complete 8-bit PNG file header can be written as
+<code>#u8"\x89;PNG\r\n\x1A;\n"</code>
+instead of <code>#u8(0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A)</code>.</p>
+
+<p>In addition, this SRFI provides bytevectors with additional procedures that
closely resemble those provided for strings. For example, bytevectors can be
padded or trimmed, compared case-sensitively or case-insensitively, searched,
joined, and split.
+
+<p>In this specification it is assumed that bytevectors are as defined in
R7RS-small section 6.9. Implementations may also consider them equivalent to
R6RS bytevectors (R6RS 4.3.4) or <a
href="https://srfi.schemers.org/srfi-4/srfi-4.html">SRFI 4</a>
<code>u8vector</code>s, depending which kind of homogeneous vectors of unsigned
8-bit integers an implementation supports.
+
+<h2 id="specification">Specification</h2>
+
+<p>Most of the procedures of this SRFI begin with <code>bytestring-</code>
+in order to distinguish them from other bytevector procedures.
+This does not mean that they accept or return a separate bytestring type:
+bytestrings and bytevectors are exactly the same type.</p>
+
+<h3>External notation</h3>
+
+<p>The basic form of a string-notated bytevector is:
+
+<blockquote><code>#u8"</code> <var>content</var> <code>"</code></blockquote>
+
+<p>To avoid character encoding issues within string-notated bytevectors, only
printable ASCII characters (that is, Unicode codepoints in the range from
U+0020 to U+007E inclusive) are allowed to be used within the
<var>content</var> of a string-notated bytevector. All other characters must be
expressed through mnemonic or inline hex escapes, and <code>"</code> and
<code>\</code> must also be escaped as in normal Scheme strings.
+
+<p>Within the <var>content</var> of a string-notated bytevector:
+
+<ul>
+ <li>the sequence <code>\"</code> represents the integer 34;
+ <li>the sequence <code>\\</code> represents the integer 92;
+ <li>the following mnemonic sequences represent the corresponding integers:
+ <table>
+ <tr><th>Seq. <th>Integer
+ <tr><td><code>\a</code> <td>7
+ <tr><td><code>\b</code> <td>8
+ <tr><td><code>\t</code> <td>9
+ <tr><td><code>\n</code> <td>10
+ <tr><td><code>\r</code> <td>13
+ <tr><td><code>\|</code> <td>124
+ </table>
+ <li>the sequence <code>\x</code> followed by zero or more <code>0</code>
characters, followed by one or two hexadecimal digits, followed by
<code>;</code> represents the integer specified by the hexadecimal digits;
+ <li>the sequence <code>\</code> followed by zero or more intraline
whitespace characters, followed by a newline, followed by zero or more further
intraline whitespace characters, is ignored and corresponds to no entry in the
resulting bytevector;
+ <li>any other printable ASCII character represents the character number of
that character in the ASCII/Unicode code chart; and
+ <li>it is an error to use any other character or sequence beginning with
<code>\</code> within a string-notated bytevector.
+</ul>
+
+<p>Note: The <code>\|</code> sequence is provided so that
+string parsing, symbol parsing, and string-notated bytevector parsing
+can all use the same sequences.
+However, we give a complete definition of the valid lexical syntax
+in this SRFI rather than inheriting the native syntax of strings,
+so that it is clear that <code>#u8"ι"</code> and
+<code>#u8"\xE000;"</code> are invalid.</p>
+<p>When the Scheme reader encounters a string-notated bytevector, it produces
a datum as if that bytevector had been written out in full. That is,
<code>#u8"A"</code> is exactly equivalent to <code>#u8(65)</code>.
+
+<p>A Scheme implementation which supports string-notated bytevectors may not
by default use this notation when any of the <code>write</code> family of
procedures is called upon a bytevector or upon another datum containing a
bytevector. A future SRFI is expected to add a configurable version of the
<code>write</code> procedure which may enable the use of this notation in this
context.
+
+<h3>Formal syntax</h3>
+
+<p>The formal syntax of Scheme (defined in R7RS-small 7.1) is amended as
follows.
+
+<ul>
+<li><p>In the definition of ⟨token⟩, after ‘| ⟨string⟩’, insert ‘|
⟨string-notated bytevector⟩’.
+<li><p>After the definition of ⟨byte⟩ is inserted:
+ <blockquote>
+ <p>⟨string-notated bytevector⟩ → <code>#u8"</code> ⟨string-notated
bytevector element⟩* <code>"</code><br>
+ ⟨string-notated bytevector element⟩ → ⟨any printable ASCII character
other than <code>"</code> or <code>\</code>⟩<br>
+ <span style="margin-left:1em">| ⟨mnemonic escape⟩ | <code>\"</code> |
<code>\\</code></span><br>
+ <span style="margin-left:1em">| <code>\</code>⟨intraline whitespace⟩*⟨line
ending⟩⟨intraline whitespace⟩*</span><br>
+ <span style="margin-left:1em">| ⟨inline hex escape⟩</span>
+ </blockquote>
+</ul>
+
+<h3>Constructors</h3>
+
+<p><code>(bytestring</code> <var>arg</var> …<code>)</code></p>
+<p>Converts <var>args</var> into a sequence of small integers and returns them
as a bytevector as follows:</p>
+<ul>
+ <li>
+ <p>If <var>arg</var> is an exact integer in the range 0-255 inclusive,
it is added to the result.</p>
+ </li>
+ <li>
+ <p>If <var>arg</var> is an ASCII character (that is, its codepoint is
in the range 0-127 inclusive), it is converted to its codepoint and added to
the result.</p>
+ </li>
+ <li>
+ <p>If <var>arg</var> is a bytevector, its elements are added to the
result.</p>
+ </li>
+ <li>
+ <p>If <var>arg</var> is a string of ASCII characters, it is converted
to a sequence of codepoints which are added to the result.</p>
+ </li>
+</ul>
+<p>Otherwise, an error satisfying <code>bytestring-error?</code> is
signaled.</p>
+<p>Examples:</p>
+<pre class="example"><code>(bytestring "lo" #\r #x65 #u8(#x6d)) ⇒
#u8"lorem"
+(bytestring "η" #\space #u8(#x65 #x71 #x75 #x69 #x76)) ⇒</code>
<em>error</em>
+</pre>
+
+<p><code>(make-bytestring</code> <var>list</var><code>)</code></p>
+<p>If the elements of <var>list</var> are suitable arguments for
+<code>bytestring</code>, returns the bytevector that would be the
+result of applying <code>bytestring</code> to <var>list</var>.
+Otherwise, an error satisfying <code>bytestring-error?</code> is signaled.</p>
+
+<p id="errata-1"><code>(make-bytestring!</code> <var>bytevector at
list</var><code>)</code></p>
+<p>If the elements of <var>list</var> are suitable arguments for
+<code>bytestring</code>, writes the bytes of the bytevector that would be the
+result of calling <code>make-bytestring</code>
+into <var>bytevector</var> starting at index <var>at</var>.</p>
+<pre class="example"><code>(define bstring (make-bytevector 10 #x20))
+(make-bytestring! bstring 2 '(#\s #\c "he" #u8(#x6d #x65)))
+bstring ⇒ #u8" scheme "</code></pre>
+
+<h3>Conversion</h3>
+
+<p><code>(bytevector->hex-string</code> <var>bytevector</var><code>)</code><br>
+<code>(hex-string->bytevector</code> <var>string</var><code>)</code></p>
+<p>Converts between a bytevector and a string containing pairs of hexadecimal
digits.
+If <var>string</var> is not pairs of hexadecimal digits, an error satisfying
<code>bytestring-error?</code> is raised.</p>
+<pre class="example"><code>(bytevector->hex-string #u8"Ford") ⇒
"467f7264"
+(hex-string->bytevector "5a6170686f64") ⇒ #u8"Zaphod"</code></pre>
+
+<p><code>(bytevector->base64</code> <var>bytevector</var>
[<var>digits</var>]<code>)</code><br>
+<code>(base64->bytevector</code> <var>string</var>
[<var>digits</var>]<code>)</code></p>
+<p>Converts between a bytevector and its base-64 encoding as a string. The 64
digits are represented by the characters 0-9, A-Z, a-z, and the symbols + and
/. However, there are different variants of base-64 encoding which use
different representations of the 62nd and 63rd digit. If the optional argument
<var>digits</var> (a two-character string) is provided, those two characters
will be used as the 62nd and 63rd digit instead.
+Details can be found in
+<a href="https://tools.ietf.org/html/rfc4648">RFC 4648</a>.
+If <var>string</var> is not in base-64 format, an error satisfying
<code>bytestring-error?</code> is raised.
+However, characters that satisfy <code>char-whitespace?</code>
+are silently ignored.</p>
+<pre class="example"><code>(bytevector->base64 #u8(1 2 3 4 5 6)) ⇒
"AQIDBAUG"
+(bytevector->base64 #u8"Arthur Dent") ⇒ "QXJ0aHVyIERlbnQ="
+(base64->bytevector "+/ /+") ⇒ #u8(#xfb #xff #xfe)</code></pre>
+
+<p><code>(bytestring->list</code> <var>bytevector</var> [
<var>start</var> [ <var>end</var> ] ]<code>)</code></p>
+<p>Converts all or part of a bytevector
+into a list of the same length containing
+characters for elements in the range 32 to 127
+and exact integers for all other elements.</p>
+<pre class="example"><code>(bytestring->list #u8(#x41 #x42 1 2) 1 3) ⇒
(#\B 1)</code></pre>
+
+<p><code>(make-bytestring-generator</code> <var>arg</var>
…<code>)</code></p>
+<p>Returns a generator that when invoked will return consecutive bytes
+of the bytevector that <code>bytestring</code> would create when applied
+to <var>args</var>, but without creating any bytevectors.
+The <var>args</var> are validated before any bytes are generated;
+if they are ill-formed, an error satisfying
+<code>bytestring-error?</code> is raised.</p>
+<pre class="example"><code>(generator->list (make-bytestring-generator
"lorem"))
+ ⇒ (#x6c #x6f #x72 #x65 #x6d)</code></pre>
+<h3>Selection</h3>
+
+<p><code>(bytestring-pad</code> <var>bytevector len
char-or-u8</var><code>)</code><br>
+<code>(bytestring-pad-right</code> <var>bytevector len
char-or-u8</var><code>)</code></p>
+<p>Returns a newly allocated bytevector with the contents of
<var>bytevector</var> plus sufficient additional bytes at the beginning/end
containing <var>char-or-u8</var> (which can be either an ASCII character or an
exact integer in the range 0-255) such that the length of the result is at
least <var>len</var>.</p>
+<pre class="example"><code>(bytestring-pad #u8"Zaphod" 10 #\_) ⇒
#u8"____Zaphod"
+(bytestring-pad-right #u8(#x80 #x7f) 8 0) ⇒ #u8(#x80 #x7f 0 0 0 0 0
0)</code></pre>
+
+<p><code>(bytestring-trim</code> <var>bytevector
pred</var><code>)</code><br>
+<code>(bytestring-trim-right</code> <var>bytevector
pred</var><code>)</code><br>
+<code>(bytestring-trim-both</code> <var>bytevector
pred</var><code>)</code></p>
+<p>Returns a newly allocated bytevector with the contents of
<var>bytevector</var>, except that consecutive bytes at the beginning / the end
/ both the beginning and the end that satisfy <var>pred</var> are not
included.</p>
+<pre class="example"><code>(bytestring-trim #u8" Trillian" (lambda (b) (= b
#x20)))
+ ⇒ #u8"Trillian"
+(bytestring-trim-both #u8(0 0 #x80 #x7f 0 0 0) zero?) ⇒ #u8(#x80
#x7f)</code></pre>
+
+<h3>Replacement</h3>
+
+<p><code>(bytestring-replace</code> <var>bytevector1 bytevector2 start1
end1 [start2 end2]</var><code>)</code></p>
+<p>Returns a newly allocated bytevector with the contents of
<var>bytevector1</var>, except that the bytes indexed by <var>start1</var> and
<var>end1</var> are not included but are replaced by the bytes of
<var>bytevector2</var> indexed by <var>start2</var> and <var>end2</var>.</p>
+<pre class="example"><code>(bytestring-replace #u8"Vogon torture" #u8"poetry"
6 13)
+ ⇒ #u8"Vogon poetry"</code></pre>
+
+<h3>Comparison</h3>
+
+<p id="errata-2">To compare bytevectors for equality, use the
+procedure <code>bytevector=?</code> from
+the R6RS library <code>(rnrs bytevectors)</code> or
+<code>equal?</code> in R7RS.
+
+<p><code>(bytestring<?</code> <var>bytevector1
bytevector2</var><code>)</code><br>
+<code>(bytestring>?</code> <var>bytevector1
bytevector2</var><code>)</code><br>
+<code>(bytestring<=?</code> <var>bytevector1
bytevector2</var><code>)</code><br>
+<code>(bytestring>=?</code> <var>bytevector1
bytevector2</var><code>)</code></p>
+<p>Returns <code>#t</code> if <var>bytevector1</var> is less than / greater
than / less than or equal to / greater than or equal to <var>bytevector2</var>.
Comparisons are lexicographical: shorter bytevectors compare before longer
ones, all elements being equal.</p>
+<pre class="example"><code>(bytestring<? #u8"Heart Of Gold" #u8"Heart of
Gold") ⇒ #t
+(bytestring<=? #u8(#x81 #x95) #u8(#x80 #xa0)) ⇒ #f
+(bytestring>? #u8(1 2 3) #u8(1 2)) ⇒ #t
+</code></pre>
+
+<h3>Searching</h3>
+
+<p><code>(bytestring-index</code> <var>bytevector pred</var>
[<var>start</var> [<var>end</var>]]<code>)</code><br>
+<code>(bytestring-index-right</code> <var>bytevector pred</var>
[<var>start</var> [<var>end</var>]]<code>)</code></p>
+<p>Searches <var>bytevector</var> from <var>start</var> to <var>end</var> /
from <var>end</var> to <var>start</var> for the first byte that satisfies
<var>pred</var>, and returns the index into <var>bytevector</var> containing
that byte. In either direction, <var>start</var> is inclusive and
<var>end</var> is exclusive. If there are no such bytes, returns
<code>#f</code>.</p>
+<pre class="example"><code>(bytestring-index #u8(#x65 #x72 #x83 #x6f) (lambda
(b) (> b #x7f))) ⇒ 2
+(bytestring-index #u8"Beeblebrox" (lambda (b) (> b #x7f))) ⇒ #f
+(bytestring-index-right #u8"Zaphod" odd?) ⇒ 4
+</code></pre>
+
+<p><code>(bytestring-break</code> <var>bytevector
pred</var><code>)</code><br>
+<code>(bytestring-span</code> <var>bytevector pred</var><code>)</code></p>
+<p>Returns two values, a bytevector containing the maximal sequence of
characters (searching from the beginning of <var>bytevector</var> to the end)
that do not satisfy / do satisfy <var>pred</var>, and another bytevector
containing the remaining characters.</p>
+<pre class="example"><code>(bytestring-break #u8(#x50 #x4b 0 0 #x1 #x5) zero?)
+ ⇒ #u8(#x50 #x4b)
+ #u8(0 0 #x1 #x5)
+(bytestring-span #u8"ABCDefg" (lambda (b) (and (> b 40) (< b 91))))
+ ⇒ #u8"ABCD"
+ #u8"efg"
+</code></pre>
+
+<h3 id="joining-and-splitting">Joining and splitting</h3>
+
+<p><code>(bytestring-join</code> <var>bytevector-list delimiter</var>
[<var>grammar</var>]<code>)</code></p>
+<p>Pastes the bytevectors in <var>bytevector-list</var> together
+using the <var>delimiter</var>,
+which can be anything suitable as an argument to <code>bytestring</code>.
+The <var>grammar</var>
+argument is a symbol that determines how the delimiter is used, and
+defaults to <code>infix</code>. It is an error for grammar to be
+any symbol other than these four:</p>
+<ul>
+ <li><code>infix</code> means an infix or separator grammar: inserts the
delimiter between list elements. An empty list will produce an empty
bytevector.</li>
+ <li><code>strict-infix</code> means the same as <code>infix</code> if the
list is non-empty, but will signal an error satisfying
<code>bytestring-error?</code> if given an empty list.</li>
+ <li><code>suffix</code> means a suffix or terminator grammar: inserts the
delimiter after every list element.</li>
+ <li><code>prefix</code> means a prefix grammar: inserts the delimiter before
every list element.</li>
+</ul>
+<pre class="example"><code>(bytestring-join '(#u8"Heart" #u8"of" #u8"Gold")
#x20) ⇒ #u8"Heart of Gold"
+(bytestring-join '(#u8(#xef #xbb) #u8(#xbf)) 0 'prefix) ⇒ #u8(0 #xef #xbb
0 #xbf)
+(bytestring-join '() 0 'strict-infix) ⇒</code> <em>error</em></pre>
+
+<p><code>(bytestring-split</code> <var>bytevector delimiter</var>
[<var>grammar</var>]<code>)</code></p>
+<p>Divides the elements of <var>bytevector</var> and returns a list of newly
allocated bytevectors using the <var>delimiter</var> (an ASCII character or
exact integer in the range 0-255 inclusive). Delimiter bytes are not included
in the result bytevectors.</p>
+<p>The <var>grammar</var> argument is used to control how
<var>bytevector</var> is divided. It has the same default and meaning as in
<code>bytestring-join</code>, except that <code>infix</code> and
<code>strict-infix</code> mean the same thing. That is, if <var>grammar</var>
is <code>prefix</code> or <code>suffix</code>, then ignore any delimiter in the
first or last position of <var>bytevector</var> respectively.</p>
+<pre class="example"><code>(bytestring-split #u8"Beeblebrox" #x62) ⇒
(#u8"Bee" #u8"le" #u8"rox")
+(bytestring-split #u8(1 0 2 0) 0 'suffix) ⇒ (#u8(1) #u8(2))
+</code></pre>
+
+<h3>I/O</h3>
+
+<code>(read-textual-bytestring</code> <var>prefix</var> [
<var>port</var> ]<code>)</code>
+<p>Reads a string in the external format described in this SRFI
+from <var>port</var> and return it as a bytevector.
+If the <var>prefix</var> argument is false, this procedure assumes
+that "<code>#u8</code>" has already been read from <var>port</var>.
+If <var>port</var> is omitted, it defaults to the value of
<code>(current-input-port)</code>.
+If the characters read are not in the external format,
+an error satisfying <code>bytestring-error?</code> is raised.</p>
+<pre class="example"><code>(call-with-port (open-input-string
"#u8\"AB\\xad;\\xf0;\\x0d;CD\"")
+ (lambda (port)
+ (read-textual-bytestring #t port)))
+ ⇒ #u8(#x41 #x42 #xad #xf0 #x0d #x43 #x44)
+</code></pre>
+
+<p><code>(write-textual-bytestring</code> <var>bytevector</var> [
<var>port</var> ]<code>)</code></p>
+<p>Writes <var>bytevector</var> in the external format described in this SRFI
to <var>port</var>.
+Bytes representing non-graphical ASCII characters are unencoded:
+all other bytes are encoded with a single letter if possible,
+otherwise with a <code>\x</code> escape.
+If <var>port</var> is omitted, it defaults to the value of
<code>(current-output-port)</code>.</p>
+<pre class="example"><code>(call-with-port (open-output-string)
+ (lambda (port)
+ (write-textual-bytestring
+ #u8(#x9 #x41 #x72 #x74 #x68 #x75 #x72 #xa)
+ port)
+ (get-output-string port)))
+ ⇒ "#u8\"\\tArthur\\n\""
+</code></pre>
+
+<p><code>(write-binary-bytestring</code> <var>port arg</var>
…<code>)</code></p>
+<p>Outputs each <var>arg</var> to the binary output port <var>port</var>
+using the same interpretations as <code>bytestring</code>,
+but without creating any bytevectors.
+The <var>args</var> are validated before any bytes are written to
+<var>port</var>; if they are ill-formed, an error satisfying
+<code>bytestring-error?</code> is raised.</p>
+<pre class="example"><code>(call-with-port (open-output-bytevector)
+ (lambda (port)
+ (write-binary-bytestring port #\Z #x61 #x70 "hod")
+ (get-output-bytevector port)))
+ ⇒ #u8"Zaphod"
+</code></pre>
+
+<h3>Exception</h3>
+
+<p><code>(bytestring-error?</code> <var>obj</var><code>)</code></p>
+<p>Returns <code>#t</code> if <var>obj</var> is an object signaled by any of
the
+following procedures, in the circumstances described above:</p>
+<ul>
+ <li><code>bytestring</code></li>
+ <li><code>hex-string->bytestring</code></li>
+ <li><code>base64->bytestring</code></li>
+ <li><code>make-bytestring</code></li>
+ <li><code>make-bytestring!</code></li>
+ <li><code>bytestring-join</code></li>
+ <li><code>read-textual-bytestring</code></li>
+ <li><code>write-binary-bytestring</code></li>
+ <li><code>make-bytestring-generator</code></li>
+</ul>
+
+<h2 id="implementation">Implementation</h2>
+
+<p>There is a sample implementation of the procedures,
+but not the notation, in the repository of this SRFI.
+
+<h2 id="acknowledgements">Acknowledgements</h2>
+
+<p>Daphne Preston-Kendal devised the string notation for bytevectors; John
Cowan, the procedure library; Wolfgang Corcoran-Mathe, the sample
implementation of the procedures.
+
+<p>The notation is inspired by the notation used in Python since version 2.6
for <code>bytes</code> objects, which are fundamentally similar in purpose to
Scheme bytevectors, especially in R7RS. In addition, many of the procedures are
closely analogous to those of <a
href="https://srfi.schemers.org/srfi-152/srfi-152.html">SRFI 152</a>.
+
+<p>Thanks is also due to the participants in the SRFI mailing list. In
particular: Lassi Kortela corrected an embarrassing technical error; Marc
Nieper-Wißkirchen explained why the <code>write</code> procedure ought not to
be allowed to use this notation by default.
+
+<h2 id="copyright">Copyright</h2>
+<p>© 2020 Daphne Preston-Kendal, John Cowan, and Wolfgang Corcoran-Mathe.</p>
+
+<p>
+ Permission is hereby granted, free of charge, to any person
+ obtaining a copy of this software and associated documentation files
+ (the "Software"), to deal in the Software without restriction,
+ including without limitation the rights to use, copy, modify, merge,
+ publish, distribute, sublicense, and/or sell copies of the Software,
+ and to permit persons to whom the Software is furnished to do so,
+ subject to the following conditions:</p>
+
+<p>
+ The above copyright notice and this permission notice (including the
+ next paragraph) shall be included in all copies or substantial
+ portions of the Software.</p>
+<p>
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.</p>
+
+ <hr>
+ <address>Editor: <a
href="mailto:srfi-editors+at+srfi+dot+schemers+dot+org">Arthur A.
Gleckler</a></address></body></html>
diff --git a/module/srfi/srfi-207/upstream/base64.scm
b/module/srfi/srfi-207/upstream/base64.scm
new file mode 100644
index 000000000..71845c174
--- /dev/null
+++ b/module/srfi/srfi-207/upstream/base64.scm
@@ -0,0 +1,182 @@
+;;;; Reduced and heavily modified base64 library from chibi-scheme.
+;;;
+;;; Copyright (c) 2009-2018 Alex Shinn
+;;; All rights reserved.
+;;;
+;;; Redistribution and use in source and binary forms, with or without
+;;; modification, are permitted provided that the following conditions
+;;; are met:
+;;; 1. Redistributions of source code must retain the above copyright
+;;; notice, this list of conditions and the following disclaimer.
+;;; 2. Redistributions in binary form must reproduce the above copyright
+;;; notice, this list of conditions and the following disclaimer in the
+;;; documentation and/or other materials provided with the distribution.
+;;; 3. The name of the author may not be used to endorse or promote products
+;;; derived from this software without specific prior written permission.
+;;;
+;;; THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+;;; IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+;;; OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+;;; IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+;;; INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+;;; NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+;;; DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+;;; THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+;;; (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+;;; THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+;;;; Constants and tables
+
+(define outside-char 99) ; luft-balloons
+(define pad-char 101) ; dalmations
+
+(define (outside-char? x) (eqv? x outside-char))
+(define (pad-char? x) (eqv? x pad-char))
+
+(define (make-base64-decode-table digits)
+ (let ((extra-1 (char->integer (string-ref digits 0)))
+ (extra-2 (char->integer (string-ref digits 1))))
+ (vector-unfold
+ (lambda (i)
+ (cond ((and (>= i 48) (< i 58)) (+ i 4)) ; numbers
+ ((and (>= i 65) (< i 91)) (- i 65)) ; upper case letters
+ ((and (>= i 97) (< i 123)) (- i 71)) ; lower case letters
+ ((= i extra-1) 62)
+ ((= i extra-2) 63)
+ ((= i 61) pad-char) ; '='
+ (else outside-char)))
+ #x100)))
+
+(define (base64-decode-u8 table u8)
+ (vector-ref table u8))
+
+(define (make-base64-encode-table digits)
+ (vector-unfold
+ (lambda (i)
+ (cond ((< i 26) (+ i 65)) ; upper-case letters
+ ((< i 52) (+ i 71)) ; lower-case letters
+ ((< i 62) (- i 4)) ; numbers
+ ((= i 62) (char->integer (string-ref digits 0)))
+ ((= i 63) (char->integer (string-ref digits 1)))
+ (else (error "out of range"))))
+ 64))
+
+;;;; Decoding
+
+(define (decode-base64-string src digits)
+ (let ((table (make-base64-decode-table digits)))
+ (call-with-port
+ (open-output-bytevector)
+ (lambda (out)
+ (decode-base64-to-port src out table)
+ (get-output-bytevector out)))))
+
+;; Loop through src, writing decoded base64 data to port in chunks
+;; of up to three bytes.
+(define (decode-base64-to-port src port table)
+ (let ((len (string-length src)))
+ (let lp ((i 0) (b1 outside-char) (b2 outside-char) (b3 outside-char))
+ (if (= i len)
+ (decode-base64-trailing port b1 b2 b3)
+ (let* ((c (string-ref src i))
+ (b (base64-decode-u8 table (char->integer c))))
+ (cond ((pad-char? b) (decode-base64-trailing port b1 b2 b3))
+ ((char-whitespace? c) (lp (+ i 1) b1 b2 b3))
+ ((outside-char? b)
+ (bytestring-error "invalid character in base64 string"
+ c
+ src))
+ ((outside-char? b1) (lp (+ i 1) b b2 b3))
+ ((outside-char? b2) (lp (+ i 1) b1 b b3))
+ ((outside-char? b3) (lp (+ i 1) b1 b2 b))
+ (else
+ (write-u8 (bitwise-ior (arithmetic-shift b1 2)
+ (bit-field b2 4 6))
+ port)
+ (write-u8 (bitwise-ior
+ (arithmetic-shift (bit-field b2 0 4) 4)
+ (bit-field b3 2 6))
+ port)
+ (write-u8 (bitwise-ior
+ (arithmetic-shift (bit-field b3 0 2) 6)
+ b)
+ port)
+ (lp (+ i 1) outside-char outside-char outside-char))))))))
+
+;; Flush any trailing bits accumulated in the decode loop to the
+;; bytevector port `out', then return the finalized bytestring.
+(define (decode-base64-trailing out b1 b2 b3)
+ (cond ((outside-char? b1) #t)
+ ((outside-char? b2) (write-u8 (arithmetic-shift b1 2) out))
+ (else
+ (write-u8 (bitwise-ior (arithmetic-shift b1 2) (bit-field b2 4 6))
+ out)
+ (unless (outside-char? b3)
+ (write-u8 (bitwise-ior (arithmetic-shift (bit-field b2 0 4) 4)
+ (bit-field b3 2 6))
+ out)))))
+
+;;;; Encoding
+
+(define (base64-encode-bytevector bv digits)
+ (let* ((len (bytevector-length bv))
+ (quot (quotient len 3))
+ (rem (- len (* quot 3)))
+ (res-len (arithmetic-shift (+ quot (if (zero? rem) 0 1)) 2))
+ (res (make-bytevector res-len))
+ (table (make-base64-encode-table digits)))
+ (base64-encode-bytevector! bv 0 len res table)
+ res))
+
+(define (base64-encode-bytevector! bv start end res table)
+ (let ((limit (- end 2))
+ (enc (lambda (i) (vector-ref table i))))
+ (let lp ((i start) (j 0))
+ (if (>= i limit)
+ (case (- end i)
+ ((1)
+ (let ((b1 (bytevector-u8-ref bv i)))
+ (bytevector-u8-set! res j (enc (arithmetic-shift b1 -2)))
+ (bytevector-u8-set!
+ res
+ (+ j 1)
+ (enc (arithmetic-shift (bitwise-and #b11 b1) 4)))
+ (bytevector-u8-set! res (+ j 2) (char->integer #\=))
+ (bytevector-u8-set! res (+ j 3) (char->integer #\=))
+ (+ j 4)))
+ ((2)
+ (let ((b1 (bytevector-u8-ref bv i))
+ (b2 (bytevector-u8-ref bv (+ i 1))))
+ (bytevector-u8-set! res j (enc (arithmetic-shift b1 -2)))
+ (bytevector-u8-set!
+ res
+ (+ j 1)
+ (enc (bitwise-ior
+ (arithmetic-shift (bitwise-and #b11 b1) 4)
+ (bit-field b2 4 8))))
+ (bytevector-u8-set!
+ res
+ (+ j 2)
+ (enc (arithmetic-shift (bit-field b2 0 4) 2)))
+ (bytevector-u8-set! res (+ j 3) (char->integer #\=))
+ (+ j 4)))
+ (else
+ j))
+ (let ((b1 (bytevector-u8-ref bv i))
+ (b2 (bytevector-u8-ref bv (+ i 1)))
+ (b3 (bytevector-u8-ref bv (+ i 2))))
+ (bytevector-u8-set! res j (enc (arithmetic-shift b1 -2)))
+ (bytevector-u8-set!
+ res
+ (+ j 1)
+ (enc (bitwise-ior
+ (arithmetic-shift (bitwise-and #b11 b1) 4)
+ (bit-field b2 4 8))))
+ (bytevector-u8-set!
+ res
+ (+ j 2)
+ (enc (bitwise-ior
+ (arithmetic-shift (bit-field b2 0 4) 2)
+ (bit-field b3 6 8))))
+ (bytevector-u8-set! res (+ j 3) (enc (bitwise-and #b111111 b3)))
+ (lp (+ i 3) (+ j 4)))))))
diff --git a/module/srfi/srfi-207/upstream/bytestrings-impl.scm
b/module/srfi/srfi-207/upstream/bytestrings-impl.scm
new file mode 100644
index 000000000..d30424867
--- /dev/null
+++ b/module/srfi/srfi-207/upstream/bytestrings-impl.scm
@@ -0,0 +1,499 @@
+;;; Copyright (C) 2020 Wolfgang Corcoran-Mathe
+;;;
+;;; Permission is hereby granted, free of charge, to any person obtaining a
+;;; copy of this software and associated documentation files (the
+;;; "Software"), to deal in the Software without restriction, including
+;;; without limitation the rights to use, copy, modify, merge, publish,
+;;; distribute, sublicense, and/or sell copies of the Software, and to
+;;; permit persons to whom the Software is furnished to do so, subject to
+;;; the following conditions:
+;;;
+;;; The above copyright notice and this permission notice shall be included
+;;; in all copies or substantial portions of the Software.
+;;;
+;;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+;;; OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+;;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+;;; IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+;;; CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+;;; TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+;;; SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+;;;; Utility
+
+(define (exact-natural? x)
+ (and (exact-integer? x) (not (negative? x))))
+
+(define (u8-or-ascii-char? obj)
+ (or (and (char? obj) (char<=? obj #\delete))
+ (and (exact-natural? obj) (< obj 256))))
+
+(define (string-ascii? obj)
+ (and (string? obj)
+ (string-every (lambda (c) (char<=? c #\delete)) obj)
+ #t))
+
+(define (valid-bytestring-segment? obj)
+ (or (bytevector? obj)
+ (u8-or-ascii-char? obj)
+ (string-ascii? obj)))
+
+(define (%bytestring-null? bstring)
+ (zero? (bytevector-length bstring)))
+
+(define (%bytestring-last bstring)
+ (assume (not (%bytestring-null? bstring)) "empty bytestring")
+ (bytevector-u8-ref bstring (- (bytevector-length bstring) 1)))
+
+(define (negate pred)
+ (lambda (obj)
+ (not (pred obj))))
+
+;;;; Constructors
+
+(define (make-bytestring lis)
+ (assume (or (pair? lis) (null? lis)))
+ (call-with-port
+ (open-output-bytevector)
+ (lambda (out)
+ (for-each (lambda (seg) (%write-bytestring-segment seg out)) lis)
+ (get-output-bytevector out))))
+
+(define (make-bytestring! bvec at lis)
+ (assume (bytevector? bvec))
+ (assume (and (exact-natural? at)
+ (< at (bytevector-length bvec))))
+ (bytevector-copy! bvec at (make-bytestring lis)))
+
+(define (%write-bytestring-segment obj port)
+ ((cond ((and (exact-natural? obj) (< obj 256)) write-u8)
+ ((and (char? obj) (char<=? obj #\delete)) write-char-binary)
+ ((bytevector? obj) write-bytevector)
+ ((string-ascii? obj) write-string-binary)
+ (else
+ (bytestring-error "invalid bytestring element" obj)))
+ obj
+ port))
+
+;; If your Scheme allows binary ports to function as textual ports,
+;; get rid of this dance.
+(define (write-char-binary c port)
+ (write-u8 (char->integer c) port))
+
+(define (write-string-binary s port)
+ (string-for-each (lambda (c)
+ (write-char-binary c port))
+ s))
+
+(define (bytestring . args)
+ (if (null? args) (bytevector) (make-bytestring args)))
+
+;;;; Conversion
+
+;;; Hex string conversion
+
+;; Convert an unsigned integer n to a bytevector representing
+;; the base-256 big-endian form (the zero index holds the MSB).
+(define (integer->bytevector n)
+ (assume (and (integer? n) (not (negative? n))))
+ (if (zero? n)
+ (make-bytevector 1 0)
+ (u8-list->bytevector
+ (unfold-right zero?
+ (lambda (n) (truncate-remainder n 256))
+ (lambda (n) (truncate-quotient n 256))
+ n))))
+
+(define (integer->hex-string n)
+ (cond ((number->string n 16) =>
+ (lambda (res)
+ (if (even? (string-length res))
+ res
+ (string-append "0" res))))
+ (else (bytestring-error "not an integer" n))))
+
+(define (bytevector->hex-string bv)
+ (assume (bytevector? bv))
+ (string-concatenate
+ (list-tabulate (bytevector-length bv)
+ (lambda (i)
+ (integer->hex-string (bytevector-u8-ref bv i))))))
+
+(define (hex-string->bytevector hex-str)
+ (assume (string? hex-str))
+ (let ((len (string-length hex-str)))
+ (unless (even? len)
+ (bytestring-error "incomplete hexadecimal string" hex-str))
+ (u8vector-unfold
+ (lambda (_ i)
+ (let* ((end (+ i 2))
+ (s (substring hex-str i end))
+ (n (string->number s 16)))
+ (if n
+ (values n end)
+ (bytestring-error "invalid hexadecimal sequence" s))))
+ (truncate-quotient len 2)
+ 0)))
+
+(define bytevector->base64
+ (case-lambda
+ ((bvec) (bytevector->base64 bvec "+/"))
+ ((bvec digits)
+ (assume (bytevector? bvec))
+ (assume (string? digits))
+ (utf8->string (base64-encode-bytevector bvec digits)))))
+
+(define base64->bytevector
+ (case-lambda
+ ((base64-string) (base64->bytevector base64-string "+/"))
+ ((base64-string digits)
+ (assume (string? base64-string))
+ (assume (string? digits))
+ (decode-base64-string base64-string digits))))
+
+(define bytestring->list
+ (case-lambda
+ ((bstring) (bytestring->list bstring 0 (bytevector-length bstring)))
+ ((bstring start)
+ (bytestring->list bstring start (bytevector-length bstring)))
+ ((bstring start end)
+ (assume (bytevector? bstring))
+ (assume (and (exact-natural? start) (>= start 0))
+ "invalid start index"
+ start
+ bstring)
+ (assume (and (exact-natural? end) (<= end (bytevector-length bstring)))
+ "invalid end index"
+ end
+ bstring)
+ (assume (>= end start) "invalid indices" start end)
+ (unfold (lambda (i) (= i end))
+ (lambda (i)
+ (let ((b (bytevector-u8-ref bstring i)))
+ (if (and (>= b #x20) (< b #x7f))
+ (integer->char b)
+ b)))
+ (lambda (i) (+ i 1))
+ start))))
+
+;; Lazily generate the bytestring constructed from objs.
+(define (make-bytestring-generator . objs)
+ (list->generator (flatten-bytestring-segments objs)))
+
+;; Convert and flatten chars and strings, and flatten bytevectors
+;; to yield a flat list of bytes.
+(define (flatten-bytestring-segments objs)
+ (fold-right
+ (lambda (x res)
+ (cond ((and (exact-natural? x) (< x 256)) (cons x res))
+ ((and (char? x) (char<=? x #\delete))
+ (cons (char->integer x) res))
+ ((bytevector? x)
+ (append (bytevector->u8-list x) res))
+ ((string-ascii? x)
+ (append (map char->integer (string->list x)) res))
+ (else
+ (bytestring-error "invalid bytestring segment" x))))
+ '()
+ objs))
+
+;;;; Selection
+
+(define (%bytestring-pad-left-or-right bstring len char-or-u8 right)
+ (assume (bytevector? bstring))
+ (assume (exact-natural? len))
+ (assume (u8-or-ascii-char? char-or-u8))
+ (let ((pad-len (- len (bytevector-length bstring)))
+ (pad-byte (if (char? char-or-u8)
+ (char->integer char-or-u8)
+ char-or-u8)))
+ (if (<= pad-len 0)
+ (bytevector-copy bstring)
+ (let ((padded (make-bytevector len pad-byte)))
+ (bytevector-copy! padded (if right 0 pad-len) bstring)
+ padded))))
+
+(define (bytestring-pad bstring len char-or-u8)
+ (%bytestring-pad-left-or-right bstring len char-or-u8 #f))
+
+(define (bytestring-pad-right bstring len char-or-u8)
+ (%bytestring-pad-left-or-right bstring len char-or-u8 #t))
+
+(define (bytestring-trim bstring pred)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (let ((new-start (bytestring-index bstring (negate pred))))
+ (if new-start
+ (bytevector-copy bstring new-start)
+ (bytevector))))
+
+(define (bytestring-trim-right bstring pred)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (cond ((bytestring-index-right bstring (negate pred)) =>
+ (lambda (end-1)
+ (bytevector-copy bstring 0 (+ 1 end-1))))
+ (else (bytevector))))
+
+(define (bytestring-trim-both bstring pred)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (let ((neg-pred (negate pred)))
+ (cond ((bytestring-index bstring neg-pred) =>
+ (lambda (start)
+ (bytevector-copy bstring
+ start
+ (+ (bytestring-index-right bstring neg-pred)
+ 1))))
+ (else (bytevector)))))
+
+;;;; Replacement
+
+(define bytestring-replace
+ (case-lambda
+ ((bstring1 bstring2 start end)
+ (bytestring-replace bstring1
+ bstring2
+ start
+ end
+ 0
+ (bytevector-length bstring2)))
+ ((bstring1 bstring2 start1 end1 start2 end2)
+ (assume (bytevector? bstring1))
+ (assume (bytevector? bstring2))
+ (assume (and (exact-natural? start1) (>= start1 0) (<= start1 end1))
+ "invalid start index"
+ start1)
+ (assume (and (exact-natural? end1)
+ (<= end1 (bytevector-length bstring1)))
+ "invalid end index"
+ bstring1)
+ (assume (and (exact-natural? start2) (>= start2 0) (<= start2 end2))
+ "invalid start index"
+ start2)
+ (assume (and (exact-natural? end2)
+ (<= end2 (bytevector-length bstring2)))
+ "invalid end index"
+ bstring2)
+ (if (and (= start1 end1) (= start2 end2))
+ (bytevector-copy bstring1) ; replace no bits with no bits
+ (let* ((b1-len (bytevector-length bstring1))
+ (sub-len (- end2 start2))
+ (new-len (+ sub-len (- b1-len (- end1 start1))))
+ (bs-new (make-bytevector new-len)))
+ (bytevector-copy! bs-new 0 bstring1 0 start1)
+ (bytevector-copy! bs-new start1 bstring2 start2 end2)
+ (bytevector-copy! bs-new (+ start1 sub-len) bstring1 end1 b1-len)
+ bs-new)))))
+
+;;;; Comparison
+
+(define (%bytestring-prefix-length bstring1 bstring2)
+ (let ((end (min (bytevector-length bstring1)
+ (bytevector-length bstring2))))
+ (if (eqv? bstring1 bstring2) ; fast path
+ end
+ (let lp ((i 0))
+ (if (or (>= i end)
+ (not (= (bytevector-u8-ref bstring1 i)
+ (bytevector-u8-ref bstring2 i))))
+ i
+ (lp (+ i 1)))))))
+
+;;; Primitive bytevector comparison functions.
+
+(define (%bytestring-compare bstring1 bstring2 res< res= res>)
+ (let ((len1 (bytevector-length bstring1))
+ (len2 (bytevector-length bstring2)))
+ (let ((match (%bytestring-prefix-length bstring1 bstring2)))
+ (if (= match len1)
+ (if (= match len2) res= res<)
+ (if (= match len2)
+ res>
+ (if (< (bytevector-u8-ref bstring1 match)
+ (bytevector-u8-ref bstring2 match))
+ res<
+ res>))))))
+
+(define (bytestring<? bstring1 bstring2)
+ (assume (bytevector? bstring1))
+ (assume (bytevector? bstring2))
+ (and (not (eqv? bstring1 bstring2))
+ (%bytestring-compare bstring1 bstring2 #t #f #f)))
+
+(define (bytestring>? bstring1 bstring2)
+ (assume (bytevector? bstring1))
+ (assume (bytevector? bstring2))
+ (and (not (eqv? bstring1 bstring2))
+ (%bytestring-compare bstring1 bstring2 #f #f #t)))
+
+(define (bytestring<=? bstring1 bstring2)
+ (assume (bytevector? bstring1))
+ (assume (bytevector? bstring2))
+ (or (eqv? bstring1 bstring2)
+ (%bytestring-compare bstring1 bstring2 #t #t #f)))
+
+(define (bytestring>=? bstring1 bstring2)
+ (assume (bytevector? bstring1))
+ (assume (bytevector? bstring2))
+ (or (eqv? bstring1 bstring2)
+ (%bytestring-compare bstring1 bstring2 #f #t #t)))
+
+;;;; Searching
+
+(define bytestring-index
+ (case-lambda
+ ((bstring pred) (bytestring-index bstring pred 0))
+ ((bstring pred start)
+ (bytestring-index bstring pred start (bytevector-length bstring)))
+ ((bstring pred start end)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (assume (exact-natural? start))
+ (assume (exact-natural? end))
+ (let lp ((i start))
+ (and (< i end)
+ (if (pred (bytevector-u8-ref bstring i))
+ i
+ (lp (+ i 1))))))))
+
+(define bytestring-index-right
+ (case-lambda
+ ((bstring pred) (bytestring-index-right bstring pred 0))
+ ((bstring pred start)
+ (bytestring-index-right bstring pred start (bytevector-length bstring)))
+ ((bstring pred start end)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (assume (exact-natural? start))
+ (assume (exact-natural? end))
+ (let lp ((i (- end 1)))
+ (and (>= i start)
+ (if (pred (bytevector-u8-ref bstring i))
+ i
+ (lp (- i 1))))))))
+
+(define (bytestring-break bstring pred)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (cond ((bytestring-index bstring pred) =>
+ (lambda (len)
+ (values (bytevector-copy bstring 0 len)
+ (bytevector-copy bstring len))))
+ (else (values (bytevector-copy bstring) (bytevector)))))
+
+(define (bytestring-span bstring pred)
+ (assume (bytevector? bstring))
+ (assume (procedure? pred))
+ (cond ((bytestring-index bstring (negate pred)) =>
+ (lambda (len)
+ (values (bytevector-copy bstring 0 len)
+ (bytevector-copy bstring len))))
+ (else (values (bytevector-copy bstring) (bytevector)))))
+
+;;;; Joining & Splitting
+
+(define (%bytestring-join-nonempty bstrings delimiter grammar)
+ (call-with-port
+ (open-output-bytevector)
+ (lambda (out)
+ (when (eqv? grammar 'prefix) (write-bytevector delimiter out))
+ (write-bytevector (car bstrings) out)
+ (for-each (lambda (bstr)
+ (write-bytevector delimiter out)
+ (write-bytevector bstr out))
+ (cdr bstrings))
+ (when (eqv? grammar 'suffix) (write-bytevector delimiter out))
+ (get-output-bytevector out))))
+
+(define bytestring-join
+ (case-lambda
+ ((bstrings delimiter) (bytestring-join bstrings delimiter 'infix))
+ ((bstrings delimiter grammar)
+ (assume (or (pair? bstrings) (null? bstrings)))
+ (unless (memv grammar '(infix strict-infix prefix suffix))
+ (bytestring-error "invalid grammar" grammar))
+ (let ((delim-bstring (bytestring delimiter)))
+ (if (pair? bstrings)
+ (%bytestring-join-nonempty bstrings delim-bstring grammar)
+ (if (eqv? grammar 'strict-infix)
+ (bytestring-error "empty list with strict-infix grammar")
+ (bytevector)))))))
+
+(define (%find-right bstring byte end)
+ (bytestring-index-right bstring (lambda (b) (= b byte)) 0 end))
+
+(define (%bytestring-infix-split bstring delimiter)
+ (let lp ((token-end (bytevector-length bstring)) (split '()))
+ (cond ((< token-end 0) split)
+ ((%find-right bstring delimiter token-end) =>
+ (lambda (token-start-1)
+ (lp token-start-1
+ (cons (bytevector-copy bstring (+ 1 token-start-1)
+ token-end)
+ split))))
+ (else (cons (bytevector-copy bstring 0 token-end) split)))))
+
+(define (%trim-byte bstring byte)
+ (bytestring-trim bstring (lambda (b) (= b byte))))
+
+(define (%trim-right-byte bstring byte)
+ (bytestring-trim-right bstring (lambda (b) (= b byte))))
+
+(define (%bytestring-split/trim-outliers bstring delimiter grammar)
+ (let ((trimmed (case grammar
+ ((infix strict-infix) bstring)
+ ((prefix) (%trim-byte bstring delimiter))
+ ((suffix) (%trim-right-byte bstring delimiter)))))
+ (%bytestring-infix-split trimmed delimiter)))
+
+(define bytestring-split
+ (case-lambda
+ ((bstring delimiter) (bytestring-split bstring delimiter 'infix))
+ ((bstring delimiter grammar)
+ (assume (bytevector? bstring))
+ (assume (u8-or-ascii-char? delimiter))
+ (unless (memv grammar '(infix strict-infix prefix suffix))
+ (bytestring-error "invalid grammar" grammar))
+ (if (%bytestring-null? bstring)
+ '()
+ (%bytestring-split/trim-outliers
+ bstring
+ (if (char? delimiter) (char->integer delimiter) delimiter)
+ grammar)))))
+
+;;;; I/O
+
+(define backslash-codepoints
+ '((7 . #\a) (8 . #\b) (9 . #\t) (10 . #\n) (13 . #\r)
+ (34 . #\") (92 . #\\) (124 . #\|)))
+
+(define write-textual-bytestring
+ (case-lambda
+ ((bstring)
+ (write-textual-bytestring bstring (current-output-port)))
+ ((bstring port)
+ (parameterize ((current-output-port port))
+ (write-string "#u8\"")
+ (u8vector-for-each
+ (lambda (b)
+ (cond ((assv b backslash-codepoints) =>
+ (lambda (p)
+ (write-char #\\)
+ (write-char (cdr p))))
+ ((and (>= b #x20) (<= b #x7e))
+ (write-char (integer->char b)))
+ (else
+ (write-string "\\x")
+ (write-string (number->string b 16))
+ (write-char #\;))))
+ bstring)
+ (write-char #\")))))
+
+(define (write-binary-bytestring port . args)
+ (assume (binary-port? port))
+ (for-each (lambda (arg)
+ (unless (valid-bytestring-segment? arg)
+ (bytestring-error "invalid bytestring element" arg)))
+ args)
+ (for-each (lambda (seg) (%write-bytestring-segment seg port)) args))
diff --git a/test-suite/tests/srfi-207.test b/test-suite/tests/srfi-207.test
new file mode 100644
index 000000000..b5c55cbf7
--- /dev/null
+++ b/test-suite/tests/srfi-207.test
@@ -0,0 +1,444 @@
+;;; Copyright (C) 2020 Wolfgang Corcoran-Mathe
+;;;
+;;; Permission is hereby granted, free of charge, to any person obtaining a
+;;; copy of this software and associated documentation files (the
+;;; "Software"), to deal in the Software without restriction, including
+;;; without limitation the rights to use, copy, modify, merge, publish,
+;;; distribute, sublicense, and/or sell copies of the Software, and to
+;;; permit persons to whom the Software is furnished to do so, subject to
+;;; the following conditions:
+;;;
+;;; The above copyright notice and this permission notice shall be included
+;;; in all copies or substantial portions of the Software.
+;;;
+;;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+;;; OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+;;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+;;; IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+;;; CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+;;; TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+;;; SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+(import (scheme base))
+(import (scheme write))
+(import (srfi 207))
+(import (only (srfi 1) list-tabulate every))
+
+(cond-expand
+ ((library (srfi 158))
+ (import (only (srfi 158) generator->list)))
+ (else
+ (begin
+ (define (generator->list gen)
+ (let rec ((x (gen)))
+ (if (eof-object? x)
+ '()
+ (cons x (rec (gen)))))))))
+
+(cond-expand
+ ((library (srfi 78))
+ (import (srfi 78)))
+ (else
+ (begin
+ (define *tests-failed* 0)
+ (define-syntax check
+ (syntax-rules (=>)
+ ((check expr => expected)
+ (if (equal? expr expected)
+ (begin
+ (display 'expr)
+ (display " => ")
+ (display expected)
+ (display " ; correct")
+ (newline))
+ (begin
+ (set! *tests-failed* (+ *tests-failed* 1))
+ (display "FAILED: for ")
+ (display 'expr)
+ (display " expected ")
+ (display expected)
+ (display " but got ")
+ (display expr)
+ (newline))))))
+ (define (check-report)
+ (if (zero? *tests-failed*)
+ (begin
+ (display "All tests passed.")
+ (newline))
+ (begin
+ (display "TESTS FAILED: ")
+ (display *tests-failed*)
+ (newline)))))))
+
+;;;; Utility
+
+(define (print-header message)
+ (newline)
+ (display (string-append ";;; " message))
+ (newline))
+
+(define-syntax constantly
+ (syntax-rules ()
+ ((_ obj) (lambda _ obj))))
+
+(define always (constantly #t))
+(define never (constantly #f))
+
+;; Returns a list of the values produced by expr.
+(define-syntax values~>list
+ (syntax-rules ()
+ ((_ expr)
+ (call-with-values (lambda () expr) list))))
+
+;; If expr causes an exception to be raised, return 'bytestring-error
+;; if the raised object satisfies bytestring-error?, and #f otherwise.
+(define-syntax catch-bytestring-error
+ (syntax-rules ()
+ ((_ expr)
+ (guard (condition ((bytestring-error? condition) 'bytestring-error)
+ (else #f))
+ expr))))
+
+;; Testing shorthand for write-binary-bytestring.
+(define (%bytestring/IO . args)
+ (call-with-port (open-output-bytevector)
+ (lambda (port)
+ (apply write-binary-bytestring port args)
+ (get-output-bytevector port))))
+
+;; Testing shorthands for SNB I/O. Coverage library fans, eat your
+;; hearts out.
+(define (parse-SNB/prefix s)
+ (call-with-port (open-input-string s)
+ (lambda (p)
+ (read-textual-bytestring #t p))))
+
+(define (parse-SNB s)
+ (call-with-port (open-input-string s)
+ (lambda (p)
+ (read-textual-bytestring #f p))))
+
+(define (%bytestring->SNB bstring)
+ (call-with-port (open-output-string)
+ (lambda (port)
+ (write-textual-bytestring bstring port)
+ (get-output-string port))))
+
+
+(define test-bstring (bytestring "lorem"))
+
+(define homer
+ (bytestring "The Man, O Muse, informe, who many a way / \
+ Wound in his wisedome to his wished stay;"))
+
+(define homer64
+ "VGhlIE1hbiwgTyBNdXNlLCBpbmZvcm1lLCB3aG8gbWFueSBhIHdheSAvIFdvd\
+ W5kIGluIGhpcyB3aXNlZG9tZSB0byBoaXMgd2lzaGVkIHN0YXk7")
+
+(define homer64-w
+ "VGhlIE1hb iwgTyBNdXNlL CBpbmZvcm1lL\nCB3aG8gbWF\tueSBhIH\rdheSAvIFdvd\
+ W5kIGluI GhpcyB 3aXNlZ\t\t\nG9tZSB0b yBoaXMgd\t2lzaGVkIHN0YXk7")
+
+;;;; Constructors
+
+(define (check-constructor)
+ (print-header "Running constructor tests...")
+ (check (bytestring "lo" #\r #x65 #u8(#x6d)) => test-bstring)
+ (check (bytestring) => (bytevector))
+
+ (check (catch-bytestring-error (bytestring #x100)) => 'bytestring-error)
+ (check (catch-bytestring-error (bytestring "λ")) => 'bytestring-error))
+
+(define (check-conversion)
+ (print-header "Running conversion tests...")
+
+ (check (bytevector->hex-string test-bstring) => "6c6f72656d")
+ (check (hex-string->bytevector "6c6f72656d") => test-bstring)
+ (check (catch-bytestring-error
+ (hex-string->bytevector "c6f72656d"))
+ => 'bytestring-error)
+ (check (catch-bytestring-error
+ (hex-string->bytevector "6czf72656d"))
+ => 'bytestring-error)
+ (check (equal? (hex-string->bytevector (bytevector->hex-string homer))
+ homer)
+ => #t)
+
+ (check (hex-string->bytevector (bytevector->hex-string #u8())) => #u8())
+
+ (check (bytevector->base64 test-bstring) => "bG9yZW0=")
+ (check (bytevector->base64 #u8(#xff #xef #xff)) => "/+//")
+ (check (bytevector->base64 #u8(#xff #xef #xff) "*@") => "@*@@")
+ (check (equal? (bytevector->base64 homer) homer64) => #t)
+ (check (bytevector->base64 #u8(1)) => "AQ==")
+ (check (bytevector->base64 #u8()) => "")
+ (check (base64->bytevector "bG9yZW0=") => test-bstring)
+ (check (base64->bytevector "/+//") => #u8(#xff #xef #xff))
+ (check (base64->bytevector "@*@@" "*@") => #u8(#xff #xef #xff))
+ (check (equal? (base64->bytevector homer64) homer) => #t)
+ (check (equal? (base64->bytevector homer64-w) homer) => #t)
+ (check (base64->bytevector "AQ==") => #u8(1))
+ (check (base64->bytevector "") => #u8())
+ (check (base64->bytevector "\n\n\n==\t\r\n") => #u8())
+ (check (catch-bytestring-error
+ (base64->bytevector "bG9@frob")) => 'bytestring-error)
+
+ (check (bytestring->list #u8()) => '())
+ (check (bytestring->list (bytestring 70 82 0 66)) => '(#\F #\R 0 #\B))
+ (check (bytestring->list (bytestring "\a\t\t\n" 200)) => '(7 9 9 10 200))
+ (check (make-bytestring (bytestring->list test-bstring)) => test-bstring)
+ (check (make-bytestring (bytestring->list test-bstring 2))
+ => (bytestring "rem"))
+ (check (make-bytestring (bytestring->list test-bstring 1 3))
+ => (bytestring "or"))
+
+ (let ((bvec (make-bytevector 5)))
+ (check (begin
+ (make-bytestring! bvec 0 '(#x6c #x6f #x72 #x65 #x6d))
+ bvec)
+ => test-bstring))
+ (let ((bvec (make-bytevector 9 #x20)))
+ (check (begin (make-bytestring! bvec 2 '("lo" #\r #x65 #u8(#x6d)))
+ bvec)
+ => (bytestring " lorem ")))
+ (check (catch-bytestring-error (make-bytestring '("λ")))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (make-bytestring '(#x100)))
+ => 'bytestring-error)
+
+ (let ((s (list-tabulate (bytevector-length test-bstring)
+ (lambda (i)
+ (bytevector-u8-ref test-bstring i)))))
+ (check (let ((g (make-bytestring-generator "lo" #\r #x65 #u8(#x6d))))
+ (generator->list g))
+ => s))
+ (check (catch-bytestring-error (make-bytestring-generator "λ" #\m #\u))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (make-bytestring-generator 89 90 300))
+ => 'bytestring-error)
+)
+
+(define (check-selection)
+ (print-header "Running selection tests...")
+
+ (check (bytestring-pad test-bstring (bytevector-length test-bstring) #x7a)
+ => test-bstring)
+ (check (utf8->string (bytestring-pad test-bstring 8 #x7a))
+ => "zzzlorem")
+ (check (equal? (bytestring-pad test-bstring 8 #\z)
+ (bytestring-pad test-bstring 8 (char->integer #\z)))
+ => #t)
+ (check (bytestring-pad-right test-bstring
+ (bytevector-length test-bstring)
+ #x7a)
+ => test-bstring)
+ (check (utf8->string (bytestring-pad-right test-bstring 8 #x7a))
+ => "loremzzz")
+ (check (equal? (bytestring-pad-right test-bstring 8 #\z)
+ (bytestring-pad-right test-bstring 8 (char->integer #\z)))
+ => #t)
+
+ (check (bytestring-trim test-bstring always) => #u8())
+ (check (bytestring-trim test-bstring never) => test-bstring)
+ (check (bytestring-trim test-bstring (lambda (u8) (< u8 #x70)))
+ => #u8(#x72 #x65 #x6d))
+ (check (bytestring-trim-right test-bstring always) => #u8())
+ (check (bytestring-trim-right test-bstring never) => test-bstring)
+ (check (bytestring-trim-right test-bstring (lambda (u8) (< u8 #x70)))
+ => #u8(#x6c #x6f #x72))
+ (check (bytestring-trim-both test-bstring always) => #u8())
+ (check (bytestring-trim-both test-bstring never) => test-bstring)
+ (check (bytestring-trim-both test-bstring (lambda (u8) (< u8 #x70)))
+ => #u8(#x72)))
+
+(define (check-replacement)
+ (print-header "Running bytestring-replace tests...")
+
+ (check (bytestring-replace test-bstring (bytestring "mists") 1 5 1 5)
+ => (bytestring "lists"))
+ (check (bytestring-replace test-bstring (bytestring "faded") 2 5 1 5)
+ => (bytestring "loaded"))
+ (check (bytestring-replace (make-bytevector 5)
+ test-bstring
+ 0
+ (bytevector-length test-bstring))
+ => test-bstring)
+
+ (let ((bv1 (bytestring "food")) (bv2 (bytestring "od fo")))
+ (check (bytestring-replace bv1 bv2 2 2 0 5) => (bytestring "food food")))
+ (let ((bv1 (bytestring "food food")))
+ (check (bytestring-replace bv1 (bytevector) 2 7 0 0)
+ => (bytestring "food")))
+)
+
+(define (check-comparison)
+ (define short-bstring (bytestring "lore"))
+ (define long-bstring (bytestring "lorem "))
+ (define mixed-case-bstring (bytestring "loreM"))
+ (print-header "Runnng comparison tests...")
+
+ (check (bytestring<? test-bstring test-bstring) => #f)
+ (check (bytestring<? short-bstring test-bstring) => #t)
+ (check (bytestring<? mixed-case-bstring test-bstring) => #t)
+ (check (bytestring>? test-bstring test-bstring) => #f)
+ (check (bytestring>? test-bstring short-bstring) => #t)
+ (check (bytestring>? test-bstring mixed-case-bstring) => #t)
+ (check (bytestring<=? test-bstring test-bstring) => #t)
+ (check (bytestring<=? short-bstring test-bstring) => #t)
+ (check (bytestring<=? mixed-case-bstring test-bstring) => #t)
+ (check (bytestring<=? test-bstring mixed-case-bstring) => #f)
+ (check (bytestring<=? long-bstring test-bstring) => #f)
+ (check (bytestring>=? test-bstring test-bstring) => #t)
+ (check (bytestring>=? test-bstring short-bstring) => #t)
+ (check (bytestring>=? test-bstring mixed-case-bstring) => #t)
+ (check (bytestring>=? mixed-case-bstring test-bstring) => #f)
+ (check (bytestring>=? short-bstring test-bstring) => #f)
+)
+
+(define (check-searching)
+ (define (eq-r? b) (= b #x72))
+ (define (lt-r? b) (< b #x72))
+ (print-header "Running search tests...")
+
+ (check (bytestring-index test-bstring always) => 0)
+ (check (bytestring-index test-bstring never) => #f)
+ (check (bytestring-index test-bstring always 3) => 3)
+ (check (bytestring-index test-bstring eq-r?) => 2)
+
+ (check (bytestring-index-right test-bstring always) => 4)
+ (check (bytestring-index-right test-bstring never) => #f)
+ (check (bytestring-index-right test-bstring always 3) => 4)
+ (check (bytestring-index-right test-bstring eq-r?) => 2)
+
+ (check (values~>list (bytestring-span test-bstring always))
+ => (list test-bstring (bytevector)))
+ (check (values~>list (bytestring-span test-bstring never))
+ => (list (bytevector) test-bstring))
+ (check (values~>list (bytestring-span test-bstring lt-r?))
+ => (list (bytestring "lo") (bytestring "rem")))
+
+ (check (values~>list (bytestring-break test-bstring always))
+ => (list (bytevector) test-bstring))
+ (check (values~>list (bytestring-break test-bstring never))
+ => (list test-bstring (bytevector)))
+ (check (values~>list (bytestring-break test-bstring eq-r?))
+ => (list (bytestring "lo") (bytestring "rem"))))
+
+(define (check-join-and-split)
+ (define test-segments '(#u8(1) #u8(2) #u8(3)))
+ (print-header "Running joining and splitting tests...")
+
+ (check (bytestring-join test-segments #u8(0)) => #u8(1 0 2 0 3))
+ (check (bytestring-join test-segments #u8(0) 'prefix) => #u8(0 1 0 2 0 3))
+ (check (bytestring-join test-segments #u8(0) 'suffix) => #u8(1 0 2 0 3 0))
+ (check (bytestring-join '() #u8(0)) => #u8())
+ (check (bytestring-join test-segments #\space) => #u8(1 32 2 32 3))
+ (check (bytestring-join test-segments 0) => #u8(1 0 2 0 3))
+ (check (bytestring-join test-segments "AB")
+ => #u8(1 65 66 2 65 66 3))
+ (check (bytestring-join test-segments #u8(7 8)) => #u8(1 7 8 2 7 8 3))
+ (check (catch-bytestring-error
+ (bytestring-join test-segments 300)) => 'bytestring-error)
+ (check (catch-bytestring-error
+ (bytestring-join test-segments "λ")) => 'bytestring-error)
+ (check (catch-bytestring-error
+ (bytestring-join '() #u8(0) 'strict-infix)) => 'bytestring-error)
+ (check (catch-bytestring-error
+ (bytestring-join '() #u8(0) 'foofix)) => 'bytestring-error)
+
+ (check (bytestring-split #u8(1 0 2 0 3) 0 'infix) => test-segments)
+ (check (bytestring-split #u8(0 1 0 2 0 3) 0 'prefix) => test-segments)
+ (check (bytestring-split #u8(1 0 2 0 3 0) 0 'suffix) => test-segments)
+ (check (bytestring-split #u8(0 0) 0) => '(#u8() #u8() #u8()))
+ (check (bytestring-split #u8() 0) => '())
+ (check (catch-bytestring-error
+ (bytestring-split #u8() 0 'foofix)) => 'bytestring-error))
+
+(define (check-io)
+ (print-header "Running I/O tests...")
+
+ (check (%bytestring/IO "lo" #\r #x65 #u8(#x6d)) => test-bstring)
+ (check (%bytestring/IO) => #u8())
+ (check (catch-bytestring-error (%bytestring/IO #x100)) => 'bytestring-error)
+ (check (catch-bytestring-error (%bytestring/IO "λ")) => 'bytestring-error)
+
+ ;;; read-textual-bytestring
+
+ (check (parse-SNB/prefix "#u8\"\"") => #u8())
+ (check (parse-SNB/prefix "#u8\"lorem\"") => test-bstring)
+ (check (parse-SNB/prefix "#u8\"\\xde;\\xad;\\xf0;\\x0d;\"")
+ => (bytevector #xde #xad #xf0 #x0d))
+ (check (parse-SNB/prefix "#u8\"\\\"\\\\\\a\\b\\t\\n\\r\\\|\"")
+ => (bytestring #\" #\\ #\alarm #\backspace #\tab #\newline #\return #\|))
+ (check (parse-SNB/prefix "#u8\"lor\\\n\te\\ \r\n\tm\"")
+ => test-bstring)
+ (check (parse-SNB "\"lorem\"") => test-bstring)
+
+ ;; Invalid SNB detection.
+ (check (catch-bytestring-error (parse-SNB/prefix "#u\"lorem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8lorem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"lorem"))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"lorem"))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"l\\orem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"l\\ orem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"l\\x6frem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"l\\x6z;rem\""))
+ => 'bytestring-error)
+ (check (catch-bytestring-error (parse-SNB/prefix "#u8\"α equivalence\""))
+ => 'bytestring-error)
+
+ ;;; write-textual-bytestring
+
+ (check (%bytestring->SNB #u8()) => "#u8\"\"")
+ (check (%bytestring->SNB test-bstring) => "#u8\"lorem\"")
+ (check (%bytestring->SNB (bytevector #xde #xad #xbe #xef))
+ => "#u8\"\\xde;\\xad;\\xbe;\\xef;\"")
+ (check (%bytestring->SNB
+ (bytestring #\" #\\ #\alarm #\backspace #\tab #\newline #\return
#\|))
+ => "#u8\"\\\"\\\\\\a\\b\\t\\n\\r\\\|\"")
+
+ (let ((test-bstrings
+ '(#u8(124 199 173 212 209 232 249 16 198 32 123 111 130 92 64 155)
+ #u8(50 133 193 27 177 105 10 186 61 149 177 105 96 70 223 190)
+ #u8(0 117 226 155 110 0 66 216 27 129 187 81 17 210 71 152)
+ #u8(123 31 159 25 100 135 246 47 249 137 243 241 45 241 240 221)
+ #u8(207 186 70 110 118 231 79 195 153 253 93 101 126 198 70 235)
+ #u8(138 176 92 152 208 107 28 236 198 254 111 37 241 116 191 206)
+ #u8(221 254 214 90 0 155 132 92 157 246 199 224 224 142 91 114)
+ #u8(228 216 233 80 142 15 158 54 5 85 174 101 111 75 126 209)
+ #u8(191 16 83 245 45 98 72 212 148 202 135 19 213 150 141 121)
+ #u8(41 169 182 96 47 184 16 116 196 251 243 93 81 162 175 140)
+ #u8(85 49 218 138 132 11 27 11 182 27 120 71 254 169 132 166)
+ #u8(89 216 175 23 97 10 237 112 208 195 112 80 198 154 241 254)
+ #u8(187 54 6 57 250 137 129 89 188 19 225 217 168 178 174 129)
+ #u8(88 164 89 40 175 194 108 56 12 124 109 96 148 149 119 109)
+ #u8(241 66 32 115 203 71 128 154 240 111 194 137 73 44 146 3)
+ #u8(177 185 177 233 18 14 178 106 110 109 222 147 111 157 216
208))))
+ (check
+ (every (lambda (bvec)
+ (equal? bvec (parse-SNB/prefix (%bytestring->SNB bvec))))
+ test-bstrings)
+ => #t))
+)
+
+(define (check-all)
+ (check-constructor)
+ (check-conversion)
+ (check-selection)
+ (check-replacement)
+ (check-comparison)
+ (check-searching)
+ (check-join-and-split)
+ (check-io)
+
+ (newline)
+ (check-report))
+
+(check-all)