Hi,

There are two bugs in the current uri-encode procedure in (web
uri).

Firstly, an octet less than 16 gets encoded as % HEXDIGIT instead of
% HEXDIGIT HEXDIGIT.

scheme@(guile-user)> (uri-encode "foo\nbar")
$30 = "foo%abar"
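
The intended behaviour can be sketched in isolation. This is only an
illustration: percent-encode-octet is a hypothetical helper, not a
procedure from (web uri), and it assumes its argument is an integer in
the range 0-255.

```scheme
;; Hypothetical helper (not part of (web uri)): format one octet as
;; "%" followed by exactly two hex digits.
(define (percent-encode-octet byte)
  (string-append "%"
                 ;; number->string drops the leading zero, so values
                 ;; below 16 need it added back by hand.
                 (if (< byte 16) "0" "")
                 (number->string byte 16)))

(percent-encode-octet 10)   ; newline => "%0a"
(percent-encode-octet 32)   ; space   => "%20"
```

Without the padding step, a decoder would read "%ab" in "foo%abar" as
the single octet #xab rather than a newline followed by "b".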

Secondly, if you have a string with no unreserved characters, nothing
gets encoded.

scheme@(guile-user)> (uri-encode "<>\\^")
$31 = "<>\\^"
scheme@(guile-user)> (uri-encode "<>\\^a")
$32 = "%3c%3e%5c%5ea"
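
The cause is easy to reproduce outside the module. In the sketch below,
unreserved is a stand-in I define here for the module's internal
unescaped-chars (assumed to be roughly the RFC 3986 unreserved set);
needs-escaped? mirrors the predicate introduced in the second patch.

```scheme
;; Stand-in for the module's internal char-set (an assumption, not the
;; actual definition): ASCII letters, digits and "-._~".
(define unreserved
  (char-set-union
   (string->char-set
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
   (string->char-set "-._~")))

;; True for any character that must be percent-encoded.
(define (needs-escaped? ch)
  (not (char-set-contains? unreserved ch)))

;; Old test: looks for a character that does NOT need escaping.
(string-index "<>\\^" unreserved)       ; => #f, so nothing is encoded
(string-index "<>\\^a" unreserved)      ; => 4, encoding happens by luck

;; New test: looks for a character that DOES need escaping.
(string-index "<>\\^" needs-escaped?)   ; => 0
(string-index "abc" needs-escaped?)     ; => #f, string returned as-is
```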

Patches attached. Cheers,

-- 
Ian Price -- shift-reset.com

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardry Compiled"

From 11f56bd6a4fdf1331ea30cd68b4d77e35215b4a5 Mon Sep 17 00:00:00 2001
From: Ian Price <[email protected]>
Date: Mon, 20 Aug 2012 23:03:38 +0100
Subject: [PATCH 1/2] Fix uri-encoding for octets 0-15

* module/web/uri.scm (uri-encode): All encoded octets should be of the
  form % HEXDIGIT HEXDIGIT.
* test-suite/tests/web-uri.test ("encode"): Add test.
---
 module/web/uri.scm            |    2 ++
 test-suite/tests/web-uri.test |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/module/web/uri.scm b/module/web/uri.scm
index 109118b..3816d02 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -377,6 +377,8 @@ the byte."
                     (if (< i len)
                         (let ((byte (bytevector-u8-ref bv i)))
                           (display #\% port)
+                          (when (< byte 16)
+                            (display #\0 port))
                           (display (number->string byte 16) port)
                           (lp (1+ i))))))))
           str)))
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 4621a19..a9ded46 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -258,4 +258,5 @@
     (equal? "foo bar" (uri-decode "foo+bar"))))
 
 (with-test-prefix "encode"
-  (pass-if (equal? "foo%20bar" (uri-encode "foo bar"))))
+  (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))
+  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar"))))
-- 
1.7.7.6

From ae4fa3f65c1d49822b5a284a065017673c81e65e Mon Sep 17 00:00:00 2001
From: Ian Price <[email protected]>
Date: Mon, 20 Aug 2012 23:12:23 +0100
Subject: [PATCH 2/2] Fix uri-encoding for strings with no unreserved chars

* module/web/uri.scm (uri-encode): Change test to check for unreserved
  chars instead of reserved chars.
* test-suite/tests/web-uri.test ("encode"): Add test.
---
 module/web/uri.scm            |    4 +++-
 test-suite/tests/web-uri.test |    3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/module/web/uri.scm b/module/web/uri.scm
index 3816d02..78614a5 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -364,7 +364,9 @@ Percent-encoding first writes out the given character to a bytevector
 within the given @var{encoding}, then encodes each byte as
 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
 the byte."
-  (if (string-index str unescaped-chars)
+  (define (needs-escaped? ch)
+    (not (char-set-contains? unescaped-chars ch)))
+  (if (string-index str needs-escaped?)
       (call-with-output-string*
        (lambda (port)
          (string-for-each
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index a9ded46..3f6e7e3 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -259,4 +259,5 @@
 
 (with-test-prefix "encode"
   (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))
-  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar"))))
+  (pass-if (equal? "foo%0a%00bar" (uri-encode "foo\n\x00bar")))
+  (pass-if (equal? "%3c%3e%5c%5e" (uri-encode "<>\\^"))))
-- 
1.7.7.6
