Re: [Chicken-users] test egg

2011-11-01 Thread Alex Shinn
On Tue, Nov 1, 2011 at 6:47 PM, Mario Domenech Goulart
 wrote:
> Hi Curtis,
>
> On Mon, 31 Oct 2011 20:07:57 -0700 Curtis Cooley  
> wrote:
>
>> I'm trying to get the test egg working, but I'm not getting very far.
>> I'm using chicken to learn scheme, so I'm really new at all this. I'm
>> running Linux Mint 11, but I've downloaded and compiled chicken 4.7
>> because Mint came with 4.2 and I could not even get the test egg to
>> load. Any help or pointers are much appreciated. I'm trying to take a
>> TDD pass through SICM.
>>
>> Here's the output from csi:
>>
>> #;1> (require-extension test)
>> ; loading /usr/local/lib/chicken/6/test.import.so ...
>> ; loading /usr/local/lib/chicken/6/regex.import.so ...
>> ; loading /usr/local/lib/chicken/6/irregex.import.so ...
>> ; loading /usr/local/lib/chicken/6/extras.import.so ...
>> ; loading /usr/local/lib/chicken/6/test.so ...
>> ; loading /usr/local/lib/chicken/6/regex.so ...
>> #;2> (test 4 (+ 2 2))
>>
>> Error: (cdr) bad argument type: #f
>
> It seems that the latest release of the test egg is broken.  You can try
> version 0.9.9.2, which is the last before the latest (alas, you'd need a
> copy of the svn repo to find that out, since the version history in the
> docs has not been updated).
>
> To install 0.9.9.2 you can run:
>
>    $ chicken-install test:0.9.9.2
>
> I hope that helps.

It's not completely broken, you just need to call

(test-begin)

before any tests (or use test-group).

I'll fix this as soon as I get home, but it's not
very meaningful to use tests without any group.
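
For reference, a minimal sketch of the two forms mentioned above, assuming
Chicken 4 with the test egg installed. The group names are made up for
illustration:

```scheme
(require-extension test)

;; test-group wraps its tests in a named group.
(test-group "arithmetic"
  (test 4 (+ 2 2))              ; unnamed test: expected value, then expression
  (test "product" 6 (* 2 3)))   ; named test

;; Explicit form: bracket the tests with test-begin / test-end.
(test-begin "more arithmetic")
(test 0 (- 2 2))
(test-end)
```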

-- 
Alex

___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] test egg

2011-11-01 Thread Alex Shinn
On Tue, Nov 1, 2011 at 7:24 PM, Christian Kellermann
 wrote:
> * Alex Shinn  [01 11:06]:
>> >> #;2> (test 4 (+ 2 2))
>> >>
>> >> Error: (cdr) bad argument type: #f
>>
>> I'll fix this as soon as I get home, but it's not
>> very meaningful to use tests without any group.
>
> Ah I guess the confusion comes from the example above which is the
> one from the docs...

Yes - you basically always want to use a group,
but it's nice to be able to run simple examples in
the repl.

Fixed now.

-- 
Alex



Re: [Chicken-users] Bugs in documentation and Mac OS X 10.6.8 executable generation broken

2011-12-05 Thread Alex Shinn
On Tue, Dec 6, 2011 at 4:05 AM, Christian Kellermann
 wrote:
> Watson,
>
> * Mario Domenech Goulart  [111205 18:27]:
>> >>> So what is the correct way to compile an SRFI-22 compliant script to
>> >>> an executable?
>
>> Maybe you are assuming a `main' procedure would be automatically called
>> when you run your compiled code?  If so, that's not going to happen, and
>> AFAIK, there's no compiler option to do that.  You can just explicitly
>> call your `main' procedure and run your scripts with "csi -s" instead of
>> "csi -ss".  The compiled code would just work this way.
>
> Out of curiosity: Did you just assume that chicken supports srfi-22?
> If you got the idea by reading some of chicken's docs then this is
> a bug and we need to clarify these bits, so the next person will
> not be led astray.

csi does support SRFI-22 via the -ss option - it's basically
the same as -s plus calling (main (command-line-arguments))
at the end.

For csc you can either use

  csc -postlogue '(main (command-line-arguments))' foo.scm

or add to the end of foo.scm:

  (cond-expand
    ((and chicken compiling) (main (command-line-arguments)))
    (else #f))

so that it supports SRFI-22 both interpreted and compiled
(untested).

Or you could just give up on SRFI-22 and use csi -s.

-- 
Alex



Re: [Chicken-users] unbound variable: sxml:sxml->xml

2011-12-07 Thread Alex Shinn
On Thu, Dec 8, 2011 at 2:17 AM, Vok Vojwo  wrote:
> 2011/12/7 Peter Bex :
>>
>> Actually, the main reason is that SSAX is a horrible mess which has
>> many completely unrelated procedures all mixed together.
>> There are several eggs that provide different sets of procedures
>> from the SSAX project.  Most eggs include all files from the SSAX
>> project to make it easy to update them, but they don't install them
>> all.
>>
>> The proper place to do this is in the sxml-serializer egg, and
>> we definitely should *not* be adding random procedures to sxpath.
>>
>> (the *REALLY* proper way would be to drop SSAX and create a sane
>> and consistent XML library from scratch, maybe reusing some algorithms
>> from SSAX, but that's a whole other story...)
>
> I can not see any reason why it should be necessary to split the Oleg
> code. Someone who needs sxpath also needs sxml. Splitting the code
> into different Chicken modules is pretty useless.
>
> I would like it if Oleg's SSAX code stays together in one big module. This
> is the easiest way to do it.

In general it's better to partition into multiple
modules where necessary.  People who want
everything can import everything, or even provide
a wrapper module which imports then re-exports
everything.

In this case there are already multiple conceptual
modules (ssax, sxpath, sxlst) which are not organized
well.  The procedure you want (sxml->xml) is not part
of SSAX, nor even written by Oleg - it's part
of Kirill Lisovsky's sxml-tools.

-- 
Alex



Re: [Chicken-users] Dynamic wind problem in with-input-from-file

2011-12-19 Thread Alex Shinn
Hi,

As I explained on the Chibi list, this is not technically a bug:

The specification of with-input-from-file (in R5RS and R7RS)
states:

 If an escape procedure is used to escape from the
 continuation of these procedures, their behavior
 is implementation dependent.

Hence Chibi (and Chicken) get confused because they
implement it without dynamic-wind, and when you jump
out current-input-port is still bound to /etc/motd, where
it tries to continue the repl from.

Although this is not a violation of the standard, I've
updated Chibi to use dynamic-wind and do the right
thing here.
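
To illustrate the difference, here is a minimal portable sketch. It is not
Chibi's or Chicken's actual code: `*current-port*` and `with-port` are
made-up stand-ins for current-input-port and with-input-from-file, so the
effect can be seen without touching real ports.

```scheme
;; Stand-in for the current input port (hypothetical, for illustration).
(define *current-port* 'console)

;; Rebind *current-port* for the dynamic extent of thunk.  Because the
;; restore step is the "after" thunk of dynamic-wind, it also runs when
;; thunk is left through an escape procedure.
(define (with-port port thunk)
  (let ((saved #f))
    (dynamic-wind
      (lambda () (set! saved *current-port*) (set! *current-port* port))
      thunk
      (lambda () (set! *current-port* saved)))))

;; Same shape as the reported test case: escape from inside the thunk.
(define (escape-from-with-port)
  (call-with-current-continuation
    (lambda (return)
      (with-port 'file
        (lambda () (return 'escaped))))))

(escape-from-with-port)   ; => escaped
*current-port*            ; => console, restored despite the escape
```

Without dynamic-wind (a plain set!, call thunk, set! back) the last
expression would still be `file` after the escape, which is the behavior
reported here.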

-- 
Alex

On Tue, Dec 20, 2011 at 12:40 AM,   wrote:
> I tried the following:
>
> (define (ignore . args)
>  (if #f #f))
>
> (define (x)
>  (call-with-current-continuation
>   (lambda (return)
>     (with-input-from-file "/etc/motd"
>       (lambda ()
>         (return (ignore)))))))
>
> (ignore (x))
>
> This makes csi read the contents of /etc/motd as source input:
>
> $ csi
>
> CHICKEN
> (c)2008-2011 The Chicken Team
> (c)2000-2007 Felix L. Winkelmann
> Version 4.7.0.3-st
> linux-unix-gnu-x86 [ manyargs dload ptables ]
> compiled 2011-12-09 on x (Linux)
>
> #;1> (define (ignore . args)
>  (if #f #f))
> #;2> (define (x)
>  (call-with-current-continuation
>   (lambda (return)
>     (with-input-from-file "/etc/motd"
>       (lambda ()
>         (return (ignore)))))))
> #;3> (ignore (x))
>
> Error: unbound variable:
> --
>
> Error: unbound variable: Red
>
> Error: unbound variable: Hat
>
> Error: unbound variable: Enterprise
>
> Error: unbound variable: Linux
>
> Error: unbound variable: Client
>
> Error: unbound variable: release
> 5.5
>



Re: [Chicken-users] Dynamic wind problem in with-input-from-file

2011-12-19 Thread Alex Shinn
On Tue, Dec 20, 2011 at 1:56 AM,   wrote:
>
> I see. The old RnRS problem. Everything is either implementation
> dependent or unspecified:
>
> $ grep -io "implementation-dependent\|unspecified" r7rs-draft-1.txt |wc -l
> 109

The return value of every side-effecting procedure
is "unspecified", and that's a very good thing.

In general, the things that are unspecified are
so simply because it doesn't make sense to
specify them.  This particular case I believe
is an exception - I think the authors simply
forgot to update the with-input-from-file spec
after adding dynamic-wind (which was new in
R5RS).  I've added a ticket for this:

  http://trac.sacrideo.us/wg/ticket/317

If there are any other cases of unspecified
behavior you actually think should be specified,
please bring it up on the R7RS list.

-- 
Alex



Re: [Chicken-users] compiling chicken on (crippled) embedded platforms

2012-02-13 Thread Alex Shinn
Hi Attila,

On Mon, Feb 13, 2012 at 10:32 PM, Attila Lendvai
 wrote:
>
> my problem with chibi is that once i've flashed the firmwares and the
> devices are sent out, then an FFI bug is a real headache. chicken
> seems to be more mature on this front.

Could you clarify, is there a specific FFI bug and if so can you report it?

Or are you just anticipating potential FFI bugs?

-- 
Alex



Re: [Chicken-users] Hash table equality pitfall

2012-03-01 Thread Alex Shinn
On Fri, Mar 2, 2012 at 3:01 AM, Peter Bex  wrote:
> On Thu, Mar 01, 2012 at 12:03:25PM -0500, John Cowan wrote:
>> In addition, the following Schemes support SRFI-69 with `equal?`
>> descending into hash tables:  Kawa, Chibi.
>
> What does this mean in practice?  Do they do a "dumb" comparison like
> Chicken does (ie, producing different results depending on the insertion
> order) or do they check whether the hash tables have exactly the same
> keys, each with identical corresponding values?

Chibi does a dumb comparison like Chicken does.

If you use the equal? from R7RS (scheme base) it also
handles cycles.

-- 
Alex



Re: [Chicken-users] New redirects from chickenscheme.org and chicken-scheme.org?

2012-04-23 Thread Alex Shinn
On Mon, Apr 23, 2012 at 11:35 PM, Kristian Lein-Mathisen
 wrote:
>
> I just purchased chickenscheme.org and chicken-scheme.org for 1 year. Is it
> ok if I have them redirect to call-cc.org?

You're supposed to ask "how much will you give me
to _not_ redirect them to racket-lang.org?"

-- 
Alex



Re: [Chicken-users] ANN: lazy-seq, a port of Clojure's lazy sequence API

2012-06-03 Thread Alex Shinn
On Sun, Jun 3, 2012 at 4:05 PM, John Cowan  wrote:
> Peter Danenberg scripsit:
>
>> Stream-cons, stream-lambda, &c. are so fucking verbose!
>
> Only two letters longer than lazy-length, lazy-map, lazy-head, lazy-tail, etc.
>
> Why not a macro with-lazy that rewrites car, cdr, lambda, cons, etc. within
> its body?

Unhygienic!  Heathen!

:)

Although you have the right idea.

First, the srfi-41 vs. lazy-seq comparison in the
blog post was an apples to oranges comparison
of a clumsy letrec vs a compact named let.  If we
rewrite the srfi-41 version in the same style as
the lazy-seq one, then we get:

(define multiples-of-three
  (let next ((n 3))
    (stream-cons n (next (+ n 3)))))

This is actually more compact - just _remove_
the lazy-seq reference, and s/cons/stream-cons/.
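
To see why the named let does not loop forever, here is a self-contained
sketch with a stand-in stream-cons built on delay/force. These definitions
are simplified assumptions for illustration and are not SRFI-41's own
implementation (which delays the whole pair, not just the tail);
`take-list` is a made-up helper.

```scheme
;; Stand-in stream-cons: syntax, so the tail expression is not evaluated
;; until it is forced.
(define-syntax stream-cons
  (syntax-rules ()
    ((_ head tail) (cons head (delay tail)))))

(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))

;; The call to next sits inside the delayed tail, so only one step of the
;; loop runs per forced element.
(define multiples-of-three
  (let next ((n 3))
    (stream-cons n (next (+ n 3)))))

;; Collect the first k elements into an ordinary list.
(define (take-list s k)
  (if (= k 0)
      '()
      (cons (stream-car s) (take-list (stream-cdr s) (- k 1)))))

(take-list multiples-of-three 4)   ; => (3 6 9 12)
```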

Now, if we have a whole program or library which
consistently uses lazy streams instead of lists,
we can import srfi-41 renaming all the stream-*
bindings by removing the stream- prefix (this is
where the drop-prefix you like comes in handy).
Then you have a normal Scheme library using
car/cdr/cons etc. which happens to be using
streams (and you could change the import if
needed to toggle between the two).

Introducing an extra syntactic wrapper just makes
this more complicated.

-- 
Alex



Re: [Chicken-users] ANN: lazy-seq, a port of Clojure's lazy sequence API

2012-06-03 Thread Alex Shinn
On Sun, Jun 3, 2012 at 5:31 PM, Peter Danenberg  wrote:
> Quoth Alex Shinn on Prickle-Prickle, the 8th of Confusion:
>> Now, if we have a whole program or library which consistently uses
>> lazy streams instead of lists, we can import srfi-41 renaming all
>> the stream-* bindings by removing the stream- prefix.
>
> Interesting idea; the attached program will, however, reliably cause
> Chicken to SIGSEGV.

You mis-matched your parens - it should be:

  (define (factorial n)
    (ref (scan * 1 (from 1)) n))
   ^^^

Presumably you compiled to get a segfault -
the interpreter gives an arity error.

-- 
Alex



Re: [Chicken-users] ANN: lazy-seq, a port of Clojure's lazy sequence API

2012-06-03 Thread Alex Shinn
On Sun, Jun 3, 2012 at 8:31 PM, Moritz Heidkamp
 wrote:
> Alex Shinn  writes:
>> First, the srfi-41 vs. lazy-seq comparison in the
>> blog post was an apples to oranges comparison
>> of a clumsy letrec vs a compact named let.  If we
>> rewrite the srfi-41 version in the same style as
>> the lazy-seq one, then we get:
>>
>> (define multiples-of-three
>>   (let next ((n 3))
>>     (stream-cons n (next (+ n 3)))))
>
> Ah, thanks for pointing this out -- I was really surprised that I
> couldn't find a simpler way to express this with SRFI 41. I'll update
> the blog article accordingly.
>
>
>> Now, if we have a whole program or library which
>> consistently uses lazy streams instead of lists,
>> we can import srfi-41 renaming all the stream-*
>> bindings by removing the stream- prefix (this is
>> where the drop-prefix you like comes in handy).
>> Then you have a normal Scheme library using
>> car/cdr/cons etc. which happens to be using
>> streams (and you could change the import if
>> needed to toggle between the two).
>
> Fair enough. I was actually considering providing a module which
> exports the functions without the `lazy-' prefix, perhaps with a `*'
> suffix so that it can be conveniently loaded alongside regular Scheme. The
> same could be done with SRFI 41, of course. What I don't quite
> understand is why SRFI 41 also defines stream-let, stream-lambda etc. Do
> you know of a good reason why one would want those?

For the specific case of stream-cons I believe they're
useless, because stream-cons is syntax and so works
well with normal let and lambda.  However if you had
another lazy computation not based on some constructor,
say a tree, you could use stream-lambda instead of
introducing a new stream-tree syntax.

For a better comparison to srfi-41, you could compare
using your lazy-seq against stream-lambda for some
infinite tree.  Internally, stream-lambda is implemented
with a utility stream-lazy which is basically the same
as lazy-seq.  Bewig decided not to expose this utility
with the following reasoning:

  Besides hiding lazy and making the types work out correctly,
  stream-lambda is obvious and easy-to-use for competent Scheme
  programmers, especially when augmented with the syntactic sugar
  of define-stream and named stream-let. The alternative of
  exposing stream-lazy would be less clear and harder to use.

It seems for you at least the opposite was the case,
and it would have been better to expose stream-lazy
after all :)

-- 
Alex



Re: [Chicken-users] Unbounded stack growth

2012-07-11 Thread Alex Shinn
On Thu, Jul 12, 2012 at 10:52 AM, Marc Feeley  wrote:
>
> On 2012-07-11, at 5:59 PM, Felix wrote:
>
>>>
>>> Performance should not trump safety and correctness.
>>
>> Absolutely right, yet everybody has a different perception of what
>> performance, safety and correctness means. Segfaulting on
>> _stack-overflow_ is not something that I'd call "incorrect" or "unsafe"
>> - I'd call it "inconvenient" and it may be the case that handling the
>> overflow gracefully isn't such a big deal at all. On the other hand,
>> an extremely deep recursion could in such a case (stack checks
>> everywhere) bring the machine to a halt due to excessive thrashing.  I
>> don't know whether I'd perhaps prefer the segfault in such a situation...
>
> The point is that, in a high-level language, a segfault represents a loss of 
> control of the language implementation.  On unix systems, it is usually 
> caused by a protected virtual memory page that has been touched, and a 
> segfault signal is generated.  Not very informative, but not a big safety 
> issue.  On other operating systems (e.g. in a Nintendo DS, a FPGA, or a 
> toaster), some unrelated zone of memory (say the heap containing Scheme 
> objects) has been corrupted silently and the program has started executing 
> random stuff, possibly burning your toast and setting fire to the house.
>
> The programmer already has control (with the csc -heap-limit N option) over 
> how much memory is available to the program to avoid thrashing caused by an 
> infinite recursion, or an allocation loop gone wild.  There should be no 
> difference in heap and stack allocation.  These are implementation terms.  It 
> is all just memory.  After all, the Cheney on the MTA approach was designed 
> to migrate the stack frames to the heap transparently, so the user should be 
> isolated from the concern of where the continuation is stored.

I disagree - I think a stack grown too large is likely indicative
of a programming error, or at the very least an inefficient
algorithm.  In the general case I want my programs to be
able to allocate as much heap as possible, but have a
separate limitation on the stack.

The default Chibi stack is only 1k.  It can grow, but again to
a limit of only 1M (by default).  This coincidentally causes
it to also return #t on the 200,000 input to even and raise
an "out of stack" error on 300,000 (with a ridiculously long
stack trace making me wonder if 1M is too large).
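
As an illustration of the "inefficient algorithm" point, the parity test
from the `even` example quoted below can be written with an accumulator, so
that the recursive call is in tail position and the stack does not grow.
This is only a sketch; `even-iter` is a made-up name and is not part of the
original program.

```scheme
;; Same result as the quoted (even i n): #t when n - i is even.
;; Each step flips the accumulator instead of leaving a pending `not'
;; on the stack.
(define (even-iter i n)
  (let loop ((i i) (result #t))
    (if (= i n)
        result
        (loop (+ i 1) (not result)))))

(even-iter 0 200000)   ; => #t
(even-iter 0 3)        ; => #f
```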

-- 
Alex

>
> What is unfortunate with this bug is that it goes against the programmer's 
> intuition.  If you slightly modify the code to this :
>
> ;; File: even.scm
>
> (define (even i n)
>   (if (= 0 (modulo i 10)) (print i))
>   (if (= i n)
>       #t
>       (not (even (+ i 1) n))))
>
> (print (even 0 (string->number (car (command-line-arguments)))))
>
> So that it is easy to see how deep in the recursion the program has gone, 
> then you get this output :
>
> % ./even 90
> 0
> 10
> 20
> 30
> 40
> 50
> 60
> 70
> 80
> 90
> Segmentation fault: 11
> % ./even 30
> 0
> 10
> 20
> 30
> Segmentation fault: 11
>
> This shows that the stack overflow is occurring during the unwinding phase of 
> the recursion.  This is quite unintuitive for me.  Try explaining to a 
> beginner that the unwinding of the recursion is causing memory allocation!  
> Moreover, if you slightly modify the code so that there is a call to the 
> my-not function instead of the builtin not, then there are no problems 
> (because the call to my-not at each step of the unwinding is going to 
> gracefully handle the growing C stack and garbage collect it):
>
> ;; File: even.scm (slightly modified)
>
> (define (my-not x) (not x))
>
> (define (even i n)
>   (if (= 0 (modulo i 10)) (print i))
>   (if (= i n)
>       #t
>       (my-not (even (+ i 1) n))))
>
> (print (even 0 (string->number (car (command-line-arguments)))))
>
> % ./even 90
> 0
> 10
> 20
> 30
> 40
> 50
> 60
> 70
> 80
> 90
> #t
>
> The decision for not testing the stack limit on returns is based on a 
> performance concern.  Adding this test at return points would allow the above 
> code to work correctly, for very large values of n.
>
> By the way, I'm surprised that the very similar looking code for rev-iota :
>
> (define (rev-iota n)
>   (if (= n 0)
>       '()
>       (cons n (rev-iota (- n 1)))))
>
> does not trigger the bug.  Is "cons" treated differently from "not"?
>
> Marc
>
>



Re: [Chicken-users] Unbounded stack growth

2012-07-11 Thread Alex Shinn
On Thu, Jul 12, 2012 at 11:55 AM, Matthew Flatt  wrote:
> At Thu, 12 Jul 2012 11:25:44 +0900, Alex Shinn wrote:
>> I disagree - I think a stack grown too large is likely indicative
>> of a programming error, or at the very least an inefficient
>> algorithm.  In the general case I want my programs to be
>> able to allocate as much heap as possible, but have a
>> separate limitation on the stack.
>
> Amen. Just because a computation is naturally expressed as a recursion
> does not mean that you should write it that way.

:)

Unfortunately we don't live in a perfect world,
and our computers have nasty limitations.  Even
if they didn't, infinite loops are easy to write, and
the halting problem thwarts our attempts to
detect these.

Racket will happily allocate all available memory
on this problem and thrash for a while before
aborting with an out-of-memory error, and if you're
lucky no other victims have fallen to the whimsical
Linux OOM killer.

Chibi will fairly early on raise a (continuable) out
of stack exception.  I suppose it might be nicer
if the stack limitations were fully configurable and
the default repl prompted you if you wanted to
continue with a larger limit in these cases. I may
add that, but in the meantime I prefer the current
behavior to unbounded growth.

Halting problem aside, some attempts at runtime
detection of loops, or scaling stack space with
the size of inputs might be nice.

-- 
Alex



Re: [Chicken-users] Unbounded stack growth

2012-07-12 Thread Alex Shinn
On Thu, Jul 12, 2012 at 4:39 PM, Alex Queiroz  wrote:
> Hallo,
>
> On Thu, Jul 12, 2012 at 4:25 AM, Alex Shinn  wrote:
>>
>> I disagree - I think a stack grown too large is likely indicative
>> of a programming error, or at the very least an inefficient
>> algorithm.  In the general case I want my programs to be
>> able to allocate as much heap as possible, but have a
>> separate limitation on the stack.
>>
>
> Programming errors or inefficient algorithms should crash C programs,
> not Scheme programs.

Wow, if you've got a magical Scheme compiler that
can read my mind and fix all my bugs for me I'll switch
right now! :)

-- 
Alex



Re: [Chicken-users] Unbounded stack growth

2012-07-12 Thread Alex Shinn
On Thu, Jul 12, 2012 at 4:51 PM, Alex Queiroz  wrote:
> On Thu, Jul 12, 2012 at 9:47 AM, Alex Shinn  wrote:
>>
>> Wow, if you've got a magical Scheme compiler that
>> can read my mind and fix all my bugs for me I'll switch
>> right now! :)
>>
>
>  Are you really saying that it is ok for a Scheme program to crash
> with a segmentation fault because of programming errors, and not just
> because of compiler bugs?

No, and I never said nor implied that.

I think the continuum here is, all else being equal:

  raise continuable exception > abort with meaningful message > segfault

though often all else is not equal.

For the specific case of handling programs which
use unbounded stack, most implementations just
blow up, and the question is how much heap they
allocate in the process.  Are they optimistic and
think "it can't be much longer now" as they allocate
that last 100MB, or do they bail out a little earlier?
Whether you set a fixed limit or just let it use up
all available memory, there is still a limit.  Setting
a separate limit does leave you with some heap
space to try to recover with, though, and is friendlier
to other processes.

But "now we're just negotiating the price."

-- 
Alex



Re: [Chicken-users] Neophyte in scheme: string-split not quite what I want

2012-07-20 Thread Alex Shinn
On Fri, Jul 20, 2012 at 8:56 PM, Дмитрий  wrote:
>
>   As for the character classes, they can be generated quite easily from the
> UnicodeData.txt[1] file. We can get a general category[2] from this file
> by something like (string->symbol (caddr (string-split line ";"))); then we just
> need to map the categories into appropriate character classes (e.g. Lu
> belongs to upper, alpha, alphanum, graph), etc. and merge characters of
> the same category if they have adjacent codes.
>   It's quite easy to do. If I'm not lazy I'll do this this weekend.

Full unicode character classes and case handling
are already in the utf8 egg.

These are not yet integrated with irregex because
irregex is written to be portable across any Scheme,
and so it uses its own char-set implementation.  When
R7RS is released I'll re-package irregex accordingly.

Unfortunately, while the utf8 char-sets are very
compact, the DFA conversion of large, sparse Unicode
char-sets is quite large.  I'd like eventually to make
a non-backtracking NFA regex matcher which only
compiles to DFA when you really need the speed.

In the meantime, a fast lookup table for the
script of a character would be nice, and this could
be used to tokenize a string of mixed-language text.
I thought I had this and can't seem to find it anywhere...
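
One simple shape for such a lookup table, sketched under assumptions: a
sorted vector of code point ranges searched by binary search. The ranges
below are a tiny hand-picked sample (mostly whole Unicode blocks rather
than exact script assignments), and `script-ranges` and
`code-point-script` are made-up names. A real table would be generated
from the Unicode data files.

```scheme
;; Each entry is (first-code-point last-code-point script), sorted by
;; first-code-point and non-overlapping.  Sample data only.
(define script-ranges
  (vector '(#x0041 #x005A latin)       ; A-Z
          '(#x0061 #x007A latin)       ; a-z
          '(#x0400 #x04FF cyrillic)    ; Cyrillic block
          '(#x3040 #x309F hiragana)    ; Hiragana block
          '(#xAC00 #xD7A3 hangul)))    ; Hangul syllables

;; Binary search over the ranges; takes an integer code point and
;; returns the script symbol, or 'unknown if no range contains it.
(define (code-point-script cp)
  (let loop ((lo 0) (hi (- (vector-length script-ranges) 1)))
    (if (> lo hi)
        'unknown
        (let* ((mid (quotient (+ lo hi) 2))
               (range (vector-ref script-ranges mid)))
          (cond ((< cp (car range)) (loop lo (- mid 1)))
                ((> cp (cadr range)) (loop (+ mid 1) hi))
                (else (caddr range)))))))

(code-point-script #x044F)   ; => cyrillic
(code-point-script #x0020)   ; => unknown
```

Working on integer code points keeps the sketch independent of how a
given Scheme represents non-ASCII characters.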

-- 
Alex



Re: [Chicken-users] Neophyte in scheme: string-split not quite what I want

2012-07-20 Thread Alex Shinn
On Sat, Jul 21, 2012 at 3:19 AM, Charles Hixson
 wrote:
> On 07/20/2012 04:05 AM, Дмитрий wrote:
>>
>> Hello.
>>
>> Does IrRegex support Unicode character classes? E.g. Will IrRegex consider
>> accented letters (á) or Cyrillic letters (я) as "alpha"? Will IrRegex
>> consider Chinese wide space ( ) as "space"? Will IrRegex consider Chinese
>> brackets (「」【】) as "punct"? If it doesn't, the regexp is going to be
>> EXTREMELY messy [in fact, I believe it may better to build such a regexp
>> automatically then].
>>
>> I’m on Windows, so I can’t check it (when I use a UTF-8 console via chcp
>> 65001, for some reason Chicken seems to fail on every operation with a
>> non-ASCII string, even on a simple (display "Привет")).
>>
>>
>>   --
>> Yours sincerely,
>> Dmitry Kushnariov
>>
>>
>
> As I said, I'm a neophyte.  My "character classes" were based around
> [a-zA-Z]  etc.  So you can readily see why the pattern would have quickly
> become unreasonably complex.  I didn't find any definition of other
> character classes (well, not one that meant anything) and given the
> discussion, I think that they wouldn't have worked if I'd gotten to the
> point of testing them.
>
> I was planning on using Chicken to learn scheme, since R7RS is supposed to
> be based more on R5RS than on R6RS, but maybe it's better to learn using
> Racket.  I *trust* the conversion won't be too difficult.  (I *do* need to
> use utf-8 in lots of places, and an incomplete implementation while I was
> learning would be ... unpleasant.  Particularly if the user documentation
> presumed that it *was* complete.)

The utf8 implementation is not incomplete.  It's just
not the default.

-- 
Alex



Re: [Chicken-users] Alex Shinn, Kon Lovett: list of eggs needing new releases

2012-10-06 Thread Alex Shinn
On Sun, Oct 7, 2012 at 8:09 AM, Mario Domenech Goulart
 wrote:
> Hi,
>
> On Sat, 6 Oct 2012 16:24:22 -0600 Alan Post  
> wrote:
>
>> Based on recently deprecated functions in core, the following eggs
>> need new releases:
>>
>>   condition-utils: patch to trunk attached.
>>   lookup-table:patch to trunk attached.
>>   srfi-41: 1.2.2 no longer works, but trunk has been updated.
>>   stack:   patch to trunk attached.
>>   utf8:attached patch in previous message, but patch
>>repeated here.
>>
>> It appears that the deprecated features used in these eggs are the
>> primary reasons for the majority of the test failures.  The owners
>> of these eggs are active here:
>>
>>   Alex Shinn
>> + utf8
>>   Kon Lovett
>> + condition-utils
>> + lookup-table
>> + srfi-41
>> + stack
>>
>> I don't have the required permission in svn to correct these issues,
>> though I'm happy to be granted sufficient privilege to patch, test
>> and cut new releases for these eggs.  Otherwise, I've done what I
>> can and goddess speed both of you.  ^_^
>
> Notice that to use (use make) in .setup files, you need (depends make)
> in the corresponding .meta files, since make is now an egg.

(needs make) was already in the .meta file.  I've
added (use make) and bumped the version.

-- 
Alex



Re: [Chicken-users] string-join not in utf8-srfi-13

2012-11-02 Thread Alex Shinn
Added.

On Fri, Nov 2, 2012 at 11:52 PM, .alyn.post.
 wrote:
> I'm trying to use string-join in my egg git-hooks-mediawiki[1].
>
> This egg imports utf8-srfi-13, and I am getting an error message
> about this routine:
>
>   Warning: reference to possibly unbound identifier `string-join' in:
>   Warning:parse-error
>   Warning:suggesting: `(import srfi-13)'
>
> I believe that importing utf8-srfi-13 should suffice to get this
> routine, and that the utf8 egg should but doesn't import this
> routine.
>
> Is that correct?  is string-join the only routine from srfi-13 not
> imported by utf8-srfi-13?
>
> I appreciate your guidance here,
>
> -Alan
>
> 1: https://github.com/alanpost/git-hooks-mediawiki
> --
> .i ma'a lo bradi cu penmi gi'e du



Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-13 Thread Alex Shinn
Hi,

On Mon, Jan 14, 2013 at 12:52 PM, Sungjin Chun  wrote:

> First, I might have found wrong place but...
>
> It seems that the main source of my problem is related to the part of
> uri-generic.scm, especially;
>
> (define char-set:uri-unreserved
>   (char-set-union char-set:letter+digit (string->char-set "-_.~")))
>
> If I change this part as;
>
> (define char-set:uri-unreserved
>   (char-set-union char-set:letter+digit (string->char-set "-_.~")
>                   char-set:hangul))
>
> then, uri/url with korean characters work. How can I set those part more
> generic one?
>

I believe the ASCII definition is correct even for Unicode URLs.
You need to represent the URL in utf8 and then use percent
escapes on the utf8 bytes, which is what would happen naturally
here.
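
A minimal sketch of that encoding step, assuming the text has already been
turned into a list of utf8 octets (how to get the octets depends on the
Scheme in use). The procedure names are made up for illustration and are
not the uri-generic API.

```scheme
;; True for the ASCII "unreserved" set of RFC 3986:
;; ALPHA / DIGIT / "-" / "." / "_" / "~"
(define (unreserved-byte? b)
  (or (<= 65 b 90)                          ; A-Z
      (<= 97 b 122)                         ; a-z
      (<= 48 b 57)                          ; 0-9
      (and (memv b '(45 46 95 126)) #t)))   ; - . _ ~

(define (hex-digit n)
  (string-ref "0123456789ABCDEF" n))

;; Takes a list of octets and returns the percent-encoded ASCII string:
;; unreserved octets pass through, everything else becomes %XX.
(define (percent-encode-bytes bytes)
  (apply string-append
         (map (lambda (b)
                (if (unreserved-byte? b)
                    (string (integer->char b))
                    (string #\%
                            (hex-digit (quotient b 16))
                            (hex-digit (remainder b 16)))))
              bytes)))

;; U+D55C (a Hangul syllable) is the octets ED 95 9C in utf8.
(percent-encode-bytes '(97 #xED #x95 #x9C))   ; => "a%ED%95%9C"
```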

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-13 Thread Alex Shinn
On Mon, Jan 14, 2013 at 1:36 PM, Sungjin Chun  wrote:

> As far as I know, revised RFC permits UTF-8 characters in the URL without
> encoding. Am I wrong here?
>

The latest URI RFC is 3986.  The relevant description in prose is:

  Local names, such as file system names, are stored with a local
  character encoding.  URI producing applications (e.g., origin
  servers) will typically use the local encoding as the basis for
  producing meaningful names.  The URI producer will transform the
  local encoding to one that is suitable for a public interface and
  then transform the public interface encoding into the restricted set
  of URI characters (reserved, unreserved, and percent-encodings).
  Those characters are, in turn, encoded as octets to be used as a
  reference within a data format (e.g., a document charset), and such
  data formats are often subsequently encoded for transmission over
  Internet protocols.

The relevant parts of the BNF are:

   pct-encoded   = "%" HEXDIG HEXDIG

   reserved      = gen-delims / sub-delims

   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"

   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="

   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"

   path          = path-abempty    ; begins with "/" or is empty
                 / path-absolute   ; begins with "/" but not "//"
                 / path-noscheme   ; begins with a non-colon segment
                 / path-rootless   ; begins with a segment
                 / path-empty      ; zero characters

   path-abempty  = *( "/" segment )
   path-absolute = "/" [ segment-nz *( "/" segment ) ]
   path-noscheme = segment-nz-nc *( "/" segment )
   path-rootless = segment-nz *( "/" segment )
   path-empty    = 0<pchar>

   segment       = *pchar
   segment-nz    = 1*pchar
   segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                 ; non-zero-length segment without any colon ":"

   pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Thus you can't use raw non-ASCII bytes in a URI - they must
be encoded, and interpretation is up to the origin (and is overwhelmingly
utf8 these days).
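
To make the character-level restriction concrete, here is a small Python
sketch that checks whether a string uses only the characters the BNF
above allows. It is an assumption-laden illustration: the name
`only_uri_characters` is invented, and it checks the alphabet only, not
the full URI grammar.

```python
import re

# unreserved and reserved characters from the BNF, plus pct-encoded triplets.
_URI_CHARS = re.compile(
    r"(?:[A-Za-z0-9\-._~]"      # unreserved
    r"|[:/?#\[\]@]"             # gen-delims
    r"|[!$&'()*+,;=]"           # sub-delims
    r"|%[0-9A-Fa-f]{2})*"       # pct-encoded
)

def only_uri_characters(s):
    """True if s contains nothing outside the restricted URI character set."""
    return _URI_CHARS.fullmatch(s) is not None

print(only_uri_characters("http://127.0.0.1:8983/solr/select?q=삼계탕"))
print(only_uri_characters("http://127.0.0.1:8983/solr/select?q=%EC%82%BC"))
```

The first call reports False and the second True, which is the
distinction a strict parser is making.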

> Even Solr (the search engine) permits them.
>

It would of course be possible for any tool or webserver to
accept URIs with non-ASCII bytes, but I don't know of any
browsers which would _send_ such a request, because in
general it would be rejected.

I tried searching non-ASCII on whitehouse.gov (which uses
Solr) and indeed it generated a percent-encoded query.  My
browser (Chrome) rendered the percent escapes as utf-8 for
me though.

There's also punycode which can be used to represent Unicode
domain names (which otherwise don't even allow percent escapes).
In some cases certain browsers will render this for you (generally
if the encoded script matches the top-level country name, e.g.
for a .kr domain Hangul would be shown), but it's in general
a dangerous extension because it makes phishing attempts easier.
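
For reference, the ASCII form of a single domain label can be produced
with Python's built-in punycode codec. This is only a sketch: real IDNA
processing also normalizes and validates the label, which is skipped
here, and both function names are invented for the example.

```python
def to_ascii_label(label):
    """Return the ASCII ("xn--...") form of one domain label."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

def from_ascii_label(label):
    """Inverse of to_ascii_label for labels carrying the "xn--" prefix."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label

encoded = to_ascii_label("한국")
print(encoded, from_ascii_label(encoded))
```

The encoded label contains only letters, digits and hyphens, so it fits
the host syntax without any percent escapes.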

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-14 Thread Alex Shinn
On Tue, Jan 15, 2013 at 7:35 AM, Sungjin Chun  wrote:

> Thank you very much. :-)
>
> My proposed hack (yes, not a solution) just works for me, but I found
> that it is simply wrong w.r.t. the RFC.
> I'll try your modification and let you know whether it works or not.
>
> Thank you again.
>
>
> On Mon, Jan 14, 2013 at 5:08 PM, Ivan Raikov wrote:
>
>> Hi Sungjin,
>>
>> Thanks for trying to use the uri-generic library. As Peter already
>> pointed out, uri-generic and uri-common are intended to implement RFC 3986
>> (URIs), and so far no effort has been made to support RFC 3987 (IRIs).
>>
>
Interesting, I wasn't even aware of RFC 3987.  Note that this extension
only applies to new schemes - in particular IRIs cannot be used for HTTP.


>> However, the IRI RFC does define a mapping from IRI to URI, where Unicode
>> characters in IRIs are converted to percent-encoded UTF-8 sequences. The
>> caveat here is that if you try to decode these percent-encoded sequences
>> they will likely result in invalid URI characters. I have prototyped a
>> procedure iri->uri which attempts to percent-encode all UTF-8 sequences in
>> the input string and create a URI. You can see it here:
>>
>
This shouldn't be needed.  Sungjin was using uri-common, which already
percent-encodes UTF-8 sequences, which is what is desired.

Sungjin - going back to your original question, what did you try and
what did it do differently from what you expected?  This should just be
working.

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-14 Thread Alex Shinn
On Tue, Jan 15, 2013 at 11:50 AM, Sungjin Chun  wrote:

> My intention is to create a search client for Solr (a search server
> built on Lucene), where I need to send a request URL like this:
>
>   http://127.0.0.1:8983/solr/select?q=삼계탕&start=0&rows=10
>
> I tried to create this client using the http-client egg and found that
> it does not like UTF-8 characters in the URL, and so my hacking journey
> began.
>

Ah, I see.  I had been building URIs directly with make-uri,
which accepts non-ASCII characters and encodes correctly
on output:

  (make-uri scheme: "http" host: "127.0.0.1"
 path: '("" "solr" "select") query: '((q . "삼계탕")))

If you want to parse a string which is already an invalid URI,
you need a hack.  Treating it as an invalid IRI (invalid because
it doesn't allow http) and converting to a URI would work.

Alternately, the URI parsing procedures (and their usage from
http-client) could take an optional non-strict? parameter to allow
invalid characters.  It might make sense to make this the
default for http-client, since this is what browsers typically do -
allow invalid URIs but percent-encode them on request.
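
A sketch of that browser-like leniency, in Python for illustration only
(the function name and the `_KEEP` constant are made up, and this is not
how http-client behaves today): escape whatever is not already a legal
URI character and leave delimiters and existing escapes alone.

```python
from urllib.parse import quote

# Left untouched: the RFC 3986 reserved characters plus "%", so that
# delimiters and escapes already present in the input survive.
_KEEP = ":/?#[]@!$&'()*+,;=%"

def escape_like_a_browser(url):
    """Percent-encode only the characters that are not legal in a URI."""
    return quote(url, safe=_KEEP)

print(escape_like_a_browser(
    "http://127.0.0.1:8983/solr/select?q=삼계탕&start=0&rows=10"))
# http://127.0.0.1:8983/solr/select?q=%EC%82%BC%EA%B3%84%ED%83%95&start=0&rows=10
```

Because "%" is kept, running it over an already escaped URI changes
nothing; the price is that a stray "%" that is not part of an escape is
passed through unrepaired.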

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-14 Thread Alex Shinn
On Tue, Jan 15, 2013 at 2:23 PM, Ivan Raikov wrote:

> Hi again,
>
> I have now extended the utf8 code in uri-generic, so that UTF-8
> sequences are percent-encoded as lists of the form '(% h1 h2 [% h3 h4
> ...]). The percent-decoding routine is not going to decode sequences of
> more than one byte, so that now percent encoding normalization will not
> interfere with encoded UTF-8 sequences. I have also renamed the iri->uri
> routine to utf8-string->uri. I think now its behavior is compliant with
> both RFC 3986 and 3987:
>
> (utf8-string->uri "http://example.com/삼계탕") =>
>
> #(URI scheme=http authority=#(URIAuth host="example.com" port=#f) path=(/
> "%EC%82%BC%EA%B3%84%ED%83%95") query=#f fragment=#f)
>

This result looks broken.  As I noted in my previous mail, the URI
representation already handles non-ASCII characters and escapes on output:

$ csi -R uri-common
#;1> (make-uri scheme: "http" host: "127.0.0.1" path: '(/ "삼계탕"))
#
#;2> (uri->string (make-uri scheme: "http" host: "127.0.0.1" path: '(/
"삼계탕")))
"http://127.0.0.1/82%BCB3%8483%95"

If you put percent escapes _inside_ the internal path representation,
you'll get double escaping.
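
The effect is easy to reproduce outside Chicken. In this Python sketch
(purely illustrative, using the standard `urllib.parse` functions) the
second pass escapes the "%" signs produced by the first:

```python
from urllib.parse import quote, unquote

segment = "삼계탕"
once = quote(segment, safe="")   # escape applied when the URI is written out
twice = quote(once, safe="")     # what happens if the stored value was already escaped

print(once)    # %EC%82%BC%EA%B3%84%ED%83%95
print(twice)   # %25EC%2582%25BC%25EA%25B3%2584%25ED%2583%2595
print(unquote(twice) == once)    # True: one decode only undoes one layer
```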

Parsing is a separate matter, and utf8-string->uri should return
the URI object without error, but with the unescaped values in
the path and query as resulting from the make-uri above.

Unrelated, the actual escaped output looks buggy - it looks like
some characters like the leading "%EC%" are getting dropped.

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-15 Thread Alex Shinn
On Tue, Jan 15, 2013 at 3:03 PM, Ivan Raikov wrote:

>
> Percent-encoded sequences of more than one octet will not get touched by
> pct-decode in the current implementation, so you will not get double
> escaping. Percent-encoded sequences of one octet will get decoded if they
> fall in the "unstructured" char-set, as per RFC 3986.
>

OK, now I'm thoroughly confused.  The percent-encoding is context sensitive?
How can this not be broken?

We need to make the design clear:

  * What can be constructed directly with make-uri.
  * What can be parsed, and how this is passed to make-uri.
  * How URIs are represented internally.
  * How URIs are encoded on output.

It sounds like uri-common and uri-generic are doing different things here.

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-15 Thread Alex Shinn
On Tue, Jan 15, 2013 at 6:23 PM, Peter Bex  wrote:

> On Tue, Jan 15, 2013 at 06:07:06PM +0900, Alex Shinn wrote:
> > On Tue, Jan 15, 2013 at 3:03 PM, Ivan Raikov wrote:
> >
> > >
> > > Percent-encoded sequences of more than one octet will not get touched
> > > by pct-decode in the current implementation, so you will not get double
> > > escaping. Percent-encoded sequences of one octet will get decoded if
> > > they fall in the "unstructured" char-set, as per RFC 3986.
> > >
> >
> > OK, now I'm thoroughly confused.  The percent-encoding is context
> > sensitive?
> > How can this not be broken?
> >
> > We need to make the design clear:
> >
> >   * What can be constructed directly with make-uri.
> >   * What can be parsed, and how this is passed to make-uri.
> >   * How URIs are represented internally.
> >   * How URIs are encoded on output.
> >
> > It sounds like uri-common and uri-generic are doing different things
> > here.
>
> uri-generic is agnostic about specific encodings and types.
> uri-common is designed to make life simpler in the case of "common" URIs
> like HTTP where we know what types of characters are to be decoded.
>
> RFC3986 "special characters" cannot be decoded unless we know they have
> no special meaning.  uri-common just decodes everything fully because
> there is generally no deeper nested encoding involved.  It's also smart
> enough to know that port 80 belongs to http, so it can be omitted,
> whereas uri-generic can't make such assumptions.
>
> uri-common also makes the assumption that query args are
> x-www-form-urlencoded.  This is the main reason to prefer it for web
> programming; uri-generic doesn't know about form-encoding because that
> is really only used in the context of HTML (it's strictly not even a
> HTTP thing), so this messy stuff should stay out of the generic URI
> library.
>
> Yes, the web is evil and must die.
>

Right, I'm familiar with the evil standards :)  I'm also hoping that we can
have some basic compatibility between Chicken's uri module and Chibi's
(and whatever R7RS WG2 comes up with).

It seems to me the sane thing to do is represent URIs unencoded
internally, which can be generated directly with make-uri or decoded
on parsing.  The decoding might be scheme-specific, although
really the only difference is the space-to-+ and query args encoding.

Then, on output we would encode as needed.
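
Here is a toy version of that design, written in Python only to keep it
short. `SimpleUri` is a hypothetical record invented for this sketch; it
is not the uri-common or uri-generic representation, and it ignores
ports, queries and fragments.

```python
from urllib.parse import quote, unquote, urlsplit

class SimpleUri:
    """Toy URI record: fields hold decoded text, escaping happens on output."""

    def __init__(self, scheme, host, path_segments):
        self.scheme = scheme
        self.host = host
        self.path_segments = list(path_segments)  # decoded, e.g. ["삼계탕"]

    @classmethod
    def parse(cls, text):
        parts = urlsplit(text)
        # Split on "/" first, then decode each segment separately.
        segments = [unquote(s) for s in parts.path.split("/")[1:]]
        return cls(parts.scheme, parts.hostname, segments)

    def __str__(self):
        path = "".join("/" + quote(s, safe="") for s in self.path_segments)
        return "{}://{}{}".format(self.scheme, self.host, path)

uri = SimpleUri("http", "127.0.0.1", ["삼계탕"])
print(uri)                                   # http://127.0.0.1/%EC%82%BC%EA%B3%84%ED%83%95
print(SimpleUri.parse(str(uri)).path_segments)
```

Constructing and parsing end up with the same stored value, and escaping
is applied exactly once, when the URI is written out.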

I was confused because the uri-generic change Ivan suggests
seems to be putting encoded characters directly in the representation,
whereas uri-common is encoding only on output.

[It also looks like the uri-common encoding is broken - why were bytes
getting lost?]

Finally, regarding parsing I still don't understand why %AB is decoded
into the corresponding octet but %AB%CD is not?

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-15 Thread Alex Shinn
On Tue, Jan 15, 2013 at 7:48 PM, Peter Bex  wrote:

>
> These special characters are called "reserved" in the BNF.  As you can
> see, the question mark, equals sign and ampersand are in there.
> For urlencoded query strings, these *cannot* be decoded, because
> then you can't distinguish between
>
> http://calc.example.com?bool-expr=x%26y%3D
> and
> http://calc.example.com?bool-expr=x&y=1
>
> The former should be decoded in uri-common to the alist
> ((bool-expr . "x&y=1")) and the latter to ((bool-expr . "x") (y . "1")).
> By fully decoding all reserved characters in uri-generic, you drop
> important information.
>

The internal representation is either decoded, or it is encoded.
Either can be made to work.

In this case, the decoded uri-common representation of the former is:

  ((bool-expr . "x&y=1"))

and the decoded representation of the latter is:

  ((bool-expr . "x") (y . "1"))

just as you say, so this is how they are stored in the URI object.

In uri-generic, both get parsed to:

  ((bool-expr . "x&y=1"))

As the RFC states:

   Because the percent ("%") character serves as the indicator for
   percent-encoded octets, it must be percent-encoded as "%25" for that
   octet to be used as data within a URI.

Therefore, if you intended the raw URI data to include a "%",
then the correct representation (for either common or generic)
would have been:

  
  http://calc.example.com?bool-expr=x%2526y%253D

So assuming & is _not_ special to the query (as is the case
with uri-generic), escaping & with %26 or not produces the
same result.
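
For the form-encoded case the ordering is what matters: split on the
delimiters first and decode each piece afterwards. A minimal Python
sketch of that ordering follows; `parse_query` is a made-up helper, not
the uri-common procedure.

```python
from urllib.parse import unquote

def parse_query(query):
    """Split on '&' and '=' first, then percent-decode each piece."""
    pairs = []
    for field in query.split("&"):
        name, _, value = field.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs

print(parse_query("bool-expr=x%26y%3D1"))           # [('bool-expr', 'x&y=1')]
print(parse_query("bool-expr=x&y=1"))               # [('bool-expr', 'x'), ('y', '1')]
print(parse_query(unquote("bool-expr=x%26y%3D1")))  # decoding first loses the difference
```

A library that treats "&" and "=" as ordinary data has nothing to split
on, so for it the order of the two steps makes no difference.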

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-15 Thread Alex Shinn
On Wed, Jan 16, 2013 at 12:59 AM, Peter Bex  wrote:

> On Wed, Jan 16, 2013 at 12:39:16AM +0900, Alex Shinn wrote:
> > The internal representation is either decoded, or it is encoded.
> > Either can be made to work.
> >
> > In this case, the decoded uri-common representation of the former is:
> >
> >   ((bool-expr . "x&y=1"))
> >
> > and the decoded representation of the latter is:
> >
> >   ((bool-expr . "x") (y . "1"))
> >
> > just as you say, so this is how they are stored in the URI object.
> >
> > In uri-generic, both get parsed to:
> >
> >   ((bool-expr . "x&y=1"))
>
> This cannot work because uri-common is re-using uri-generic's parser.
> Also, uri-generic doesn't do alist-decoding at all, because form-encoding
> is a HTML affair and has nothing to do with HTTP or URI standards.
>

Ah, OK, there may be implementation details on why you
store encoded or decoded.

Anyway, this isn't really important.  I'm mostly concerned
with making utf8 do the right thing, and was wondering what
the API was because it's not clear from the docs.

Put another way, do uri-path and uri-query return the
encoded or decoded values (maybe differently for uri-common
and uri-generic)?

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-22 Thread Alex Shinn
On Thu, Jan 17, 2013 at 4:51 AM, Peter Bex  wrote:

> On Tue, Jan 15, 2013 at 02:44:08PM +0900, Alex Shinn wrote:
> > This result looks broken.  As I noted in my previous mail, the URI
> > representation already handles non-ASCII characters and escapes on
> > output:
> >
> > $ csi -R uri-common
> > #;1> (make-uri scheme: "http" host: "127.0.0.1" path: '(/ "삼계탕"))
> > # > query=#f fragment=#f>
> > #;2> (uri->string (make-uri scheme: "http" host: "127.0.0.1" path: '(/
> > "삼계탕")))
> > "http://127.0.0.1/82%BCB3%8483%95"
> >
> > Unrelated, the actual escaped output looks buggy - it looks like
> > some characters like the leading "%EC%" are getting dropped.
>
> OK, I took some time to investigate and I pinpointed this problem.
> This appears to happen due to the use of core srfi-14 and srfi-13 in
> uri-generic; its char-set operations simply don't deal with anything
> beyond ASCII.


As an aside from the uri discussion, we really need to fix srfi-14.

The reference implementation is terrible.  Not only does it not
handle Unicode, but it doesn't not-handle it gracefully:

#;1> (char-set-contains? char-set:full #\x100)
Error: (string-ref) out of range [...]

At a minimum we should avoid these errors, but really we
should be using a Unicode-aware implementation - there's no
barrier to doing so like there is for Unicode strings.  We could
just move utf8-srfi-14 into the core, or I could patch up the
srfi-14 implementation to handle wide chars properly (but maybe
slowly) without bringing in the iset dependency.

-- 
Alex


Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-23 Thread Alex Shinn
On Wed, Jan 23, 2013 at 3:45 PM, Ivan Raikov wrote:

> Yes, I ran into this when I was adding UTF-8 support to mbox... If you
> were to add wide char support in srfi-14, is there a way to quantify the
> performance penalty?
>

To add the bounds check so it doesn't error?  Practically
nothing.

To branch to a separate path for a wide-char table if
the bounds check fails?  Same cost if the input is ASCII.

For efficient handling in the case of Unicode input...
how small/fast do you want it?
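
To give an idea of the second option, here is a Python sketch of a
two-tier set: a 256-entry table for the fast path and sorted ranges for
everything wider. The `CharSet` class is invented for this example and
says nothing about how srfi-14 or utf8-srfi-14 are actually laid out.

```python
from bisect import bisect_right

class CharSet:
    """Latin-1 membership table plus sorted (lo, hi) ranges for wider code points."""

    def __init__(self, code_points):
        self.table = bytearray(256)
        self.ranges = []
        for cp in sorted(set(code_points)):
            if cp < 256:
                self.table[cp] = 1
            elif self.ranges and self.ranges[-1][1] + 1 == cp:
                # Extend the current run of consecutive code points.
                self.ranges[-1] = (self.ranges[-1][0], cp)
            else:
                self.ranges.append((cp, cp))
        self.starts = [lo for lo, _ in self.ranges]

    def contains(self, ch):
        cp = ord(ch)
        if cp < 256:                      # fast path, same work as before
            return self.table[cp] == 1
        i = bisect_right(self.starts, cp) - 1
        return i >= 0 and cp <= self.ranges[i][1]

letters = CharSet(list(range(0x41, 0x5B)) + list(range(0xAC00, 0xD7A4)))
print(letters.contains("A"), letters.contains("삼"), letters.contains("\u0100"))
```

ASCII input pays for one comparison; only wide characters take the
binary search.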

-- 
Alex



Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.

2013-01-25 Thread Alex Shinn
On Wed, Jan 23, 2013 at 5:09 PM, Alex Shinn  wrote:

> On Wed, Jan 23, 2013 at 3:45 PM, Ivan Raikov wrote:
>
>> Yes, I ran into this when I was adding UTF-8 support to mbox... If you
>> were to add wide char support in srfi-14, is there a way to quantify the
>> performance penalty?
>>
>
> To add the bounds check so it doesn't error?  Practically
> nothing.
>
> To branch to a separate path for a wide-char table if
> the bounds check fails?  Same cost if the input is ASCII.
>
> For efficient handling in the case of Unicode input...
> how small/fast do you want it?
>

I've never met such stony silence in response to an offer to do work...

I ran the following simple char-set-contains? benchmark with
a few variations:

  (time
   (do ((i 0 (+ i 1)))
       ((= i 1))
     (do ((j 0 (+ j 1)))
         ((= j 256))
       (char-set-contains? char-set:letter (integer->char j)))))

This is what most people are concerned about for speed, as
the boolean and construction operations are less common.

The results:

;; reference implementation
;; 0.312s CPU time, 1/2059 GCs (major/minor)

;; "fixed" reference implementation (no error but no support for
;; non-latin-1)
;; 0.257s CPU time, 1/1706 GCs (major/minor)

;; utf8-srfi-14 with full Unicode char-set:letter
;; 0.243s CPU time, 0/1526 GCs (major/minor)

;; utf8-srfi-14 with ASCII-only char-set:letter
;; 0.242s CPU time, 0/1526 GCs (major/minor)

I was able to add the check and make the reference
implementation faster because I fixed the common case -
it was optimized for checking for 0 instead of 1.

Even with the enormous and complex definition of a
Unicode "letter", utf8-srfi-14 is faster than srfi-14.

As for what we want in Chicken, the answer depends
on what you're optimizing for.  utf8-srfi-14 will always
win for space, and generally for speed as well.

If the biggest concern is code-size, then you might want
to borrow the char-set definition from irregex and use
that as a "fallback" for non-latin-1 chars in the srfi-14
reference impl.  This would have the same perf as
srfi-14 for latin-1, yet still support full Unicode and not
increase the size of the Chicken distribution.

-- 
Alex


Re: [Chicken-users] UTF-8 support in Chicken core [Was: [Q] uri-common has problem with UTF-8 uri.]

2013-01-27 Thread Alex Shinn
On Sun, Jan 27, 2013 at 10:43 AM, Ivan Raikov wrote:

>
> Hi Alex,
>
> Yes, I would have thought that more people would be interested in
> having UTF-8 support in core Chicken (or at least wide-char compatible
> srfi-14). I have changed the title of this thread to reflect the subject
> more accurately :-)
>
>   Personally, I think that adding UTF-8  in core is much better than the
> hacks I had to do in mbox, and is a no brainer considering the benchmark
> results you have below.  But I am sure that opinions vary on this subject...
>
>Can you post your bounds-check patches to srfi-14 on the mailing list,
> and/or create a ticket for it? Hopefully there will be more responses this
> time.
>

Well, I'm not necessarily proposing UTF-8 support in the core.
I understand that has pros and cons and opinions may differ.

I was just pointing out that we've already got 3 char-set
implementations, 2 of them in the core distribution, and
there are no real cons to simplifying this and replacing
srfi-14 with one of the Unicode-capable implementations.

The simplest change I made was replacing:

(define-inline (si=0? s i) (zero? (%char->latin1 (string-ref s i))))
(define-inline (si=1? s i) (not (si=0? s i)))

with:

(define-inline (si=0? s i)
  (if (>= i 256) #t (zero? (%char->latin1 (string-ref s i)))))
(define-inline (si=1? s i)
  (and (< i 256) (eq? 1 (%char->latin1 (string-ref s i)))))

which is actually faster and while it doesn't support
wide char-sets, at least gives the correct answers when
passed wide chars.
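
The same idea in a few lines of Python, as an illustration only (the
helper names are invented, and a bytearray stands in for the string the
reference implementation uses as its table):

```python
def make_latin1_table(chars):
    """256-entry table: 1 where the character is a member, 0 elsewhere."""
    table = bytearray(256)
    for ch in chars:
        table[ord(ch)] = 1
    return table

def contains_unchecked(table, ch):
    # Mirrors the original behaviour: indexing past the table is an error.
    return table[ord(ch)] == 1

def contains_checked(table, ch):
    # Anything outside the table is simply not a member.
    i = ord(ch)
    return i < 256 and table[i] == 1

table = make_latin1_table("abc")
print(contains_checked(table, "a"), contains_checked(table, "\u0100"))
```

`contains_unchecked` raises IndexError for "\u0100", which is the
analogue of the out-of-range error shown earlier; the checked version
answers False.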

-- 
Alex


> Ivan
>


Re: [Chicken-users] UTF-8 support in Chicken core [Was: [Q] uri-common has problem with UTF-8 uri.]

2013-01-28 Thread Alex Shinn
On Tue, Jan 29, 2013 at 7:26 AM, .alyn.post.  wrote:

> I'll throw in my two bits here.
>
> I'm not personally decided whether utf-8 in core would be an
> improvement.  I don't have enough background or knowledge of
> the internals to contribute to that decision.
>

I know there are people who want UTF-8 in the core, but I'd
rather not bring that up, because it hurts the chances of fixing
SRFI-14 :)

Core chicken already supports characters > 255.  SRFI-14
does not, and fails badly (reporting an error instead of just
#f to say the char is not in the char-set).

There is no reason (philosophical, API usability, code size,
runtime memory usage, runtime speed or whatever) why we
shouldn't fix this.

My personal recommendation (and least work for me) is to
replace srfi-14 with utf8-srfi-14 (which actually has nothing
to do with utf8).

-- 
Alex


Re: [Chicken-users] hato multipart/alternative

2013-04-06 Thread Alex Shinn
Hi Andy,

On Fri, Apr 5, 2013 at 9:30 AM, Andy Bennett  wrote:

> Hi,
>
> Can anyone offer guidance on how to send a multipart/alternative mail
> with hato? I'm trying to send HTML mail with a text/plain alternative.
>
> For my proof of concept I tried:
>
> -
> (send-mail From:    "Pat Andrews "
>            To:      "Andy Pandy "
>            Subject: "Hato Test"
>            Charset: "ISO-8859-1"
>            Attachments: '((Body: "Hello this is the first attachment")
>                           (Body: "This is the second attachment")))
> -
>
> This results in a multipart mail where both parts show up in my mail
> reader.
>
> I then tried:
>
> -
> (send-mail From:    "Pat Andrews "
>            To:      "Andy Pandy "
>            Subject: "Hato Test"
>            Charset: "ISO-8859-1"
>            Content-Type: "multipart/alternative"
>            Attachments: '((Body: "Hello this is the first attachment")
>                           (Body: "This is the second attachment")))
> -
>

When specifying your own multipart Content-Type, you
need to include the boundary:

  Content-Type: "multipart/alternative; boundary=xyzzy"
  Boundary: "xyzzy"

[Ideally it should infer the boundary in this case, and it
would also be nice to automatically generate the boundary
when unspecified.]

This should prevent everything falling out to the top-level
as you described, though it may not generate quite what
you want.  You can log issues on hato.googlecode.com.
If it's a bug I'll try to fix it, but feature requests will largely
be pending the port to R7RS.


> hato seems to generate MIME messages with a single boundary string.
>

Attachments may include nested attachments, which will
result in nested multiparts with different boundaries.  You
should only need one top-level boundary in your case though.
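
For comparison, this is the message structure you are after, built with
Python's standard `email` package rather than hato. It is a sketch of
the target layout only; `build_alternative` is a made-up helper and the
"xyzzy" boundary is fixed just to match the example above.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_alternative(text_body, html_body, boundary="xyzzy"):
    """One top-level multipart/alternative container with two sibling parts."""
    msg = MIMEMultipart("alternative", boundary=boundary)
    msg["Subject"] = "Hato Test"
    # Least preferred alternative first, most preferred last.
    msg.attach(MIMEText(text_body, "plain", "iso-8859-1"))
    msg.attach(MIMEText(html_body, "html", "iso-8859-1"))
    return msg

print(build_alternative("Hello", "<p>Hello</p>").as_string())
```

The multipart Content-Type appears once, in the top-level headers, and
each part between the boundary lines carries its own text/plain or
text/html header.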

-- 
Alex


Re: [Chicken-users] hato multipart/alternative

2013-04-10 Thread Alex Shinn
On Thu, Apr 11, 2013 at 1:48 AM, Andy Bennett wrote:

> Hi,
>
> > When specifying your own multipart Content-Type, you
> > need to include the boundary:
> >
> >   Content-Type: "multipart/alternative; boundary=xyzzy"
> >   Boundary: "xyzzy"
> >
> > [Ideally it should infer the boundary in this case, and it
> > would also be nice to automatically generate the boundary
> > when unspecified.]
> >
> > This should prevent everything falling out to the top-level
> > as you described, though it may not generate quite what
> > you want.  You can log issues on hato.googlecode.com.
> > If it's a bug I'll try to fix it, but feature requests will largely
> > be pending the port to R7RS.
> >
> >
> > > hato seems to generate MIME messages with a single boundary string.
> > >
> >
> > Attachments may include nested attachments, which will
> > result in nested multiparts with different boundaries.  You
> > should only need one top-level boundary in your case though.
>
> Thanks for the tips. I've fiddled with it some more but I can't get it
> to produce exactly the right structure.


OK, then could you file a bug?  If it's not actually broken
make it a feature request.  [One thing it needs is an option
to disable the implicit body when using attachments.]

You can file a separate bug for the timezone format.

Patches welcome!

-- 
Alex


>
> -
> (send-mail From:    "Pat Andrews "
>            To:      "Andy Pandy "
>            Subject: "Hato Test"
>            Charset: "ISO-8859-1"
>            Content-Type: "multipart/alternative; boundary=xyzzy"
>            Boundary: "xyzzy"
>            Body: "Hello this is the first attachment"
>            Attachments: '((Content-Type: "text/plain"
>                            Body: "This is the second attachment")))
> -
>
> Produces:
>
> -
> Content-Type: multipart/alternative; boundary="xyzzy"
> -
> ...in the headers
>
> and
> -
> --xyzzy
> Content-Type: multipart/alternative; boundary="xyzzy"; charset="ISO-8859-1"
>
> --xyzzy
> Content-Type: text/plain; charset="ISO-8859-1"
>
>
> This is the second attachment
>
>
>
>
> --xyzzy
> Content-Type: text/plain
>
>
>
> --xyzzy--
>
> --xyzzy--
> -
> ...in the body.
>
> The "This is the first attachment" text is mysteriously missing and the
> structure of the message is wrong because there's an extra MIME header
> at the start of the body.
>
>
>
> -
> (send-mail From:    "Pat Andrews "
>            To:      "Andy Pandy "
>            Subject: "Hato Test"
>            Charset: "ISO-8859-1"
>            Content-Type: "multipart/alternative"
>            Boundary: "xyzzy"
>            Body: "Hello this is the first attachment"
>            Attachments: '((Content-Type: "text/plain"
>                            Body: "This is the second attachment")))
> -
>
> (i.e. omitting the "boundary=" bit from the Content-Type header)
>
> produces:
>
> -
> Content-Type: text/plain
> -
> ...in the headers
>
> and
> -
> --xyzzy
> Content-Type: multipart/alternative; charset="ISO-8859-1"
>
>
> Hello this is the first attachment
>
>
>
>
> --xyzzy
> Content-Type: text/plain; charset="ISO-8859-1"
>
>
> This is the second attachment
>
>
>
>
> --xyzzy
> -
> ...in the body.
>
>
> The structure of this one is *almost* correct but the Content-Type in
> the header is wrong: it should be "multipart/alternative..." and the
> Content-Type in the first part of the body is wrong: it should be
> "text/plain". Also, there's no specification of the boundary anywhere.
>
>
>
> Perhaps I could hack hato-smtp.scm to do what I wanted but I'm not
> particularly familiar with the appropriate standards so I thought I'd
> discuss it here first.
>
> The only way I could come up with for getting hato to not generate an
> extra MIME header at the start of the body was to use the Body: in
> send-mail and a single attachment but that leaves no way to specify both
> the fact the parts are alternative and the Content-Type of the first part.
>
>
>
> Regards,
> @ndy
>
> --
> andy...@ashurst.eu.org
> http://www.ashurst.eu.org/
> 0x7EBA75FF
>
>


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2013-09-03 Thread Alex Shinn
On Tue, Sep 3, 2013 at 11:19 PM, Philip Kent  wrote:

>  Hi all,
>
> I noticed an issue today with the html-parser egg, where it does not seem
> to decode entities within an attribute of an element, I have included an
> example below.
>
> #;14> (html->sxml "")
> (*TOP* (div (@ (data-foo "&quot;"))))
>
> Expected: (*TOP* (div (@ (data-foo "\""))))
>
> I was wondering if anyone could provide some thoughts as to why this might
> be happening? I have taken a look at the html-parser egg but have not seen
> much (but then this goes far beyond my knowledge of scheme!)
>

html-parser processes entities, but the default for html->sxml
is just to leave them encoded as-is.  I'm not sure if that's the best
default, but I will at least provide a convenient option to get
the decoded strings.

-- 
Alex


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2013-09-04 Thread Alex Shinn
On Wed, Sep 4, 2013 at 8:23 PM, Philip Kent  wrote:

>  Hi Alex,
>
> Thanks for your email.
>
> I'm somewhat confused by what you say. Through investigation, it seems
> html->sxml will decode entities, so long as they aren't within a HTML
> element attribute. Could you clarify on whether that default applies
> globally or just to attributes?
>

Yes, sorry, I misread my own code :)

The default is to _decode_ entities:

#;1> (html->sxml "&quot;")
(*TOP* "\"")

And as you say, it just doesn't currently process attributes:

#;2> (html->sxml "<div data-foo=\"&quot;\"></div>")
(*TOP* (div (@ (data-foo "&quot;"))))

I'll fix this.

What I was referring to before is that you can customize
what is done with entities with

 (make-html-parser 'entity: (lambda (name) ...))

and can customize non-default entity names:

 (make-html-parser 'entities: '(("quot" . "\"") ...))

but again, these are currently ignored in attributes.

-- 
Alex


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2013-09-08 Thread Alex Shinn
On Thu, Sep 5, 2013 at 12:39 AM, Philip Kent  wrote:

>  Hi Alex,
>
> Excellent! Thanks for looking into it and for the tip re custom parsers -
> I was trying to understand that code!
>

It should work now, let me know if you have any problems.

-- 
Alex


Re: [Chicken-users] Issue w/ string-trim functions in utf8-srfi-13

2013-09-19 Thread Alex Shinn
On Fri, Sep 20, 2013 at 5:55 AM, Matt Gushee  wrote:

> Hello--
>
> I've noticed the following unexpected behavior with the string
> trimming functions in utf8-srfi-13:
> [...]
> csi> (use utf8-srfi-13)
> ; loading /usr/lib/chicken/6/utf8-srfi-13.import.so ...
> ; << etc. >>
>
> csi> (map string-trim-both strings)
> ("abc" "\t   abc" "\r   abc" "\n   abc")
>

This looks like an oversight.  The fix is simple, but my dev
machine just died so it may take me a couple days to get to it.

-- 
Alex


Re: [Chicken-users] Issue w/ string-trim functions in utf8-srfi-13

2013-09-22 Thread Alex Shinn
On Fri, Sep 20, 2013 at 9:30 AM, Matt Gushee  wrote:

> Hi, Alex--
>
> On Thu, Sep 19, 2013 at 5:33 PM, Alex Shinn  wrote:
> >
> > This looks like an oversight.  The fix is simple, but my dev
> > machine just died so it may take me a couple days to get to it.
>
> Fair enough. The fact that nobody has noticed this until now suggests
> that a short delay will not ruin too many people's lives ;-) And I've
> taken care my immediate needs. Thanks!
>

This should work now.

-- 
Alex


Re: [Chicken-users] iset egg, wrong intersection result

2013-10-27 Thread Alex Shinn
On Sat, Oct 26, 2013 at 1:13 AM, r  wrote:

> Hi,
>
> i have found that on some numbers iset-intersection produce wrong results.
> example in attachment, all numbers in fixnum range
>

Thanks for the report!  If I don't have time before then, I'll fix this on
the weekend.

> iset b a:(44189 50194 78574 80802 80902 88559 89629 89832 89862 90701 91234
> 92306 93599 95409 95687 96350 97827 97844 97845 97846 97847 97848 850 97851
> 97852 97853 97854 97855 97856 97857 97858 97859 97860 97861 97862 97863
> 97864 97876 98645 98646 98647 98648 98649 98650 98651 98977 161)
>
> iset a b:(44189 50194 78574 80802 80902 88559 89629 89832 89862 90701
> 91234 92306 93599 95409 95687 96350 98977 99095 99161)
>
> srfi-1 b a:(44189 50194 78574 80802 80902 88559 89629 89832 89862 90701
> 91234 92306 93599 95409 95687 96350 97845 97863 98977 99095 99161)
>
> srfi-1 a b:(44189 50194 78574 80802 80902 88559 89629 89832 89862 90701
> 91234 92306 93599 95409 95687 96350 97845 97863 98977 99095 99161)
>
>
>
>


Re: [Chicken-users] an oddly slow regex ...

2013-10-27 Thread Alex Shinn
On Sun, Oct 27, 2013 at 3:38 AM, Peter Bex  wrote:

> On Sat, Oct 26, 2013 at 10:37:36AM -0700, Matt Welland wrote:
> > This regex is so slow that you don't need a timer to see the impact (at
> > least not on my machine with chicken 4.8.0):
> >
> >  (string-match "[a-z][a-z0-9\\-_.]{0,20}"
> "a012345678901234567890123456789")
> >
> > Changing the {0,20} to + makes it run normally fast so I just replaced
> the
> > regex with a string-length and modified the "{0,20}" to "+" . I don't
> > necessarily need a fix for this but it seems like a possible symptom of a
> > deeper problem so I thought I'd report it.
>
> Hi Matt,
>
> Thanks for your report.  I'm afraid this is a known problem with
> Irregex - to avoid producing a state machine with too many states,
> it will always use a backtracking implementation for all repetition
> counts.
>
> I think it's best to take a look at how to fix this upstream first.
> Maybe Alex has an idea of how to do that.
>

It's possible to expand repetition counts in the DFA.  Your
example basically becomes:

"[a-z][a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?[a-z0-9\\-_.]?"

which in this case probably has a reasonably sized DFA.
In other cases, these patterns can easily blow up and
force the backtracking fallback.  I'm not sure if it's better
to try more aggressively or to keep the easier rule of thumb
that n..m repetitions always force backtracking.
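To make the expansion above concrete, here is a minimal sketch that builds the expanded pattern string mechanically. It assumes an R7RS implementation; the helper name expand-optional-repeats is hypothetical and is not part of irregex.

```scheme
(import (scheme base))

;; Hypothetical helper (not part of irregex): rewrite ATOM{0,N} as
;; N copies of "ATOM?" joined into a single pattern string.
(define (expand-optional-repeats atom n)
  (let loop ((i 0) (acc ""))
    (if (= i n)
        acc
        (loop (+ i 1) (string-append acc atom "?")))))

;; The pattern from this thread, rebuilt with the helper.
(define expanded
  (string-append "[a-z]" (expand-optional-repeats "[a-z0-9\\-_.]" 20)))
```

The resulting string grows linearly with the upper bound, which is why large or nested counts can make the DFA impractical.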

> > Pre-compiling the regex didn't seem to make any difference.
>
> That's because the backtracker matches really slowly.
>

There is possibly some pathological backtracking happening
here, I'll take a look.

The long-term goal is to replace the backtracking with a
more scalable approach (e.g. the non-backtracking NFA
used in the SRFI 115 reference implementation) and only
use DFAs for simple patterns or when speed is explicitly
requested.  Doing this and preserving all PCRE features
will take time though (especially backrefs).

-- 
Alex


Re: [Chicken-users] iset egg, wrong intersection result

2013-11-08 Thread Alex Shinn
On Mon, Oct 28, 2013 at 12:40 PM, Alex Shinn  wrote:

> On Sat, Oct 26, 2013 at 1:13 AM, r  wrote:
>
>> Hi,
>>
>> i have found that on some numbers iset-intersection produce wrong results.
>> example in attachment, all numbers in fixnum range
>>
>
> Thanks for the report!  If I don't have time before I'll fix this on the
> weekend.
>

Sorry for the delay, this should be working now.

-- 
Alex


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2013-11-23 Thread Alex Shinn
On Sat, Nov 23, 2013 at 11:19 AM, Jim Ursetto  wrote:

> Alex,
>
> Looks like there's a regression of sorts in html-parser 0.5.1.
>
> 0.5.0
>
> #;> (html->sxml "<foo bar>")
> (*TOP* (foo (@ (bar))))
>
> 0.5.1
>
> #;> (html->sxml "<foo bar>")
> Error: (cadr) bad argument type: ()
>

Oops, fixed.

> Arguably, empty attributes should result in a value of "" as per
> http://dev.w3.org/html5/markup/syntax.html#syntax-attr-empty ; for
> example,
>
> #;> (html->sxml "<foo bar>")
> (*TOP* (foo (@ (bar ""))))
>
> although I'd also be satisfied with a return to the status quo ante, in
> which a null cdr signifies empty.
>

Given that I can see pros and cons to both approaches,
I'm inclined to leave as-is for now.

-- 
Alex


> Jim
>
> On Sep 8, 2013, at 7:30 AM, Alex Shinn  wrote:
>
> On Thu, Sep 5, 2013 at 12:39 AM, Philip Kent  wrote:
>
>>  Hi Alex,
>>
>> Excellent! Thanks for looking into it and for the tip re custom parsers -
>> I was trying to understand that code!
>>
>
> It should work now, let me know if you have any problems.
>
> --
> Alex
>


Re: [Chicken-users] irregex-replace return value

2014-03-02 Thread Alex Shinn
Hi Michele,

On Sat, Mar 1, 2014 at 7:32 AM, Michele La Monaca <
mikele.chic...@lamonaca.net> wrote:

> Hi,
>
> I've noticed that irregex-replace returns the original string
> if no replacement takes place. I think it's a very poor choice.
>
> Whether or not a replacement was actually made can be an important
> piece of information which is lost returning the original string.
> The "correct" return value should be #f.
>
> Ditto irregex-replace/all.
>

I've used irregex-replace{,/all} and equivalents in other
languages for a long time, and find the current semantics
most convenient.  I can see in some cases wanting to test
for a replacement, or in irregex-replace/all the number of
replacements, but it seems to be by far the rarer case
(varying with individual programming style).

Your options right now in these cases are to test for the
match then apply the subst manually, or write a utility to
do so.

If you're interested, there's also SRFI 115 currently under
discussion for standard Scheme regular expressions.

-- 
Alex


Re: [Chicken-users] irregex-replace return value

2014-03-03 Thread Alex Shinn
On Sun, Mar 2, 2014 at 8:51 PM, Michele La Monaca <
mikele.chic...@lamonaca.net> wrote:

>
> While writing my own version of irregex-replace can be (hopefully) an
> enjoyable
> 6-line coding experience (btw, irregex-apply-match is not documented):
>

Oops, thanks, I'll document it.

> (define (my-own-irregex-replace irx s . o)
>   (let ((m (irregex-search irx s)))
>     (and m (string-append
>             (substring s 0 (irregex-match-start-index m 0))
>             (apply string-append (reverse (irregex-apply-match m o)))
>             (substring s (irregex-match-end-index m 0) (string-length s))))))
>
> writing a customized version of irregex-replace/all means writing a real
> non-elementary program.


It's probably not as complicated as you're imagining.
irregex-fold does most of the work:

#;2> (define (my-irregex-replace-all irx str . o)
  (irregex-fold
   irx
   (lambda (i m acc)
     (let ((m-start (irregex-match-start-index m 0)))
       (cons
        (+ 1 (car acc))
        (append (irregex-apply-match m o)
                (if (>= i m-start)
                    (cdr acc)
                    (cons (substring str i m-start) (cdr acc)))))))
   '(0)
   str
   (lambda (i acc)
     (let ((end (string-length str)))
       (values
        (apply
         string-append
         (reverse (if (>= i end)
                      (cdr acc)
                      (cons (substring str i end) (cdr acc)))))
        (car acc))))))

#;3> (my-irregex-replace-all '(+ digit) "one 1 two 22 three 333" "?")

"one ? two ? three ?"

3

; 2 values

I'll consider adding this utility, though.

-- 
Alex


Re: [Chicken-users] Review my Caesar Cipher?

2014-03-10 Thread Alex Shinn
On Tue, Mar 11, 2014 at 6:16 AM, Daniel Carrera  wrote:

>
> On 10 March 2014 20:04, Daniel Carrera  wrote:
>
>> I am trying to write an R7RS-compliant version. R7RS would give me
>> "import", as well as char->integer and integer->char. The problem I'm
>> having is that my code does not work when I compile it, or when I use "csi
>> -s", but it works perfectly well when I paste it directly into the csi REPL.
>>
>
> After a tip from Erik, I have isolated the issue. The (import) only works
> correctly if you first run (use posix). My REPL was loading posix because I
> loaded readline. The following code compiles and runs correctly:
>
> (use posix)
>
> ;
> ; Unicode-safe. Requires an R7RS-compliant Scheme.
> ;
> (import (srfi 13)) ; String library.
>
> (define msg "The quick brown fox jumps over the lazy fox.")
> (define key 13)
>
> (define (caesar char)
>   (define A (char->integer #\A))
>   (define Z (char->integer #\Z))
>   (define a (char->integer #\a))
>   (define z (char->integer #\z))
>   (define c (char->integer char))
>   (cond ((and (>= c A) (<= c Z)) (integer->char (+ A (modulo (+ key (- c A)) 26))))
>         ((and (>= c a) (<= c z)) (integer->char (+ a (modulo (+ key (- c a)) 26))))
>         (else char))) ; Return other characters verbatim.
>

(integer->char
 (cond ((<= A c Z) (+ A (modulo (+ key (- c A)) 26)))
       ((<= a c z) (+ a (modulo (+ key (- c a)) 26)))
       (else c)))


>
> (print (string-map caesar msg))
>
>
>
> Cheers,
> Daniel.
>
>


Re: [Chicken-users] R7RS: (current-jiffy) and (jiffies-per-second)

2014-03-10 Thread Alex Shinn
On Sun, Mar 9, 2014 at 7:51 PM, Daniel Carrera  wrote:

> Hello,
>
> I hope nobody minds an R7RS question. This list seems to have people
> knowledgeable of R7RS. It seems weird that R7RS would specify the functions:
>

There's also scheme-repo...@scheme-reports.org but
I doubt people mind here.

> (current-jiffy)  -->  An exact integer representing the number of jiffies
> (arbitrary unit of time) since some arbitrary epoch.
>
> (jiffies-per-second) --> Integer representing the number of jiffies in one
> second.
>
>
> What could possibly be the value of these functions, given that R7RS
> already specifies (current-second) as the number of seconds since the Unix
> epoch? This seems like an oddly useless concept for a language that tries
> to be minimalist.
>

There are actually a number of motivations for this.
(current-second) is expensive, and in the presence
of NTP not guaranteed to be monotonic.  It will also
generally cons to return a bignum or flonum, whereas
current-jiffy could always return fixnums.

In general, for timing you want to use jiffies, and
for calendar operations you want seconds.

The utility is not disputed.  Whether this belongs
in the small language is debatable (as is _everything_),
but it's there and is easy to implement.
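As a rough illustration of using jiffies for timing, the sketch below measures how long a thunk takes. It assumes an R7RS implementation that provides (scheme time); the helper name call-with-timing is hypothetical.

```scheme
(import (scheme base) (scheme time))

;; Hypothetical helper: run THUNK and return two values, its result
;; and the elapsed time in seconds as an exact rational.
(define (call-with-timing thunk)
  (let* ((start (current-jiffy))
         (result (thunk))
         (end (current-jiffy)))
    (values result
            (/ (- end start) (jiffies-per-second)))))
```

Only the difference between two jiffy readings is meaningful, since the epoch is arbitrary; dividing by (jiffies-per-second) converts that difference to seconds.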

-- 
Alex


Re: [Chicken-users] Review my Caesar Cipher?

2014-03-11 Thread Alex Shinn
On Tue, Mar 11, 2014 at 7:15 PM, Daniel Carrera  wrote:

> With the last suggestion from Alex, and a tip to use cond-expand from Kon,
> I have settled on the following:
>
> -
> ;
> ; Works with Chicken Scheme and Gauche.
> ;
> (cond-expand (chicken (use srfi-13))
>  (gauche  (use srfi-13)))
>
> (define msg "The quick brown fox jumps over the lazy dog.")
>  (define key 13)
>
> (define (caesar char)
>   (define A (char->integer #\A))
>   (define Z (char->integer #\Z))
>   (define a (char->integer #\a))
>   (define z (char->integer #\z))
>   (define c (char->integer char))
>   (integer->char
>    (cond ((<= A c Z) (+ A (modulo (+ key (- c A)) 26)))
>          ((<= a c z) (+ a (modulo (+ key (- c a)) 26)))
>          (else c)))) ; Return other characters verbatim.
>
> (print (string-map caesar msg))
> -
>
> I tried to include more Schemes, but Chibi doesn't seem to have SRFI-13
>

Chibi has string-map in (chibi string).

But actually, if you're aiming for R7RS support then
string-map is in (scheme base).  Just replace the
cond-expand with:

(import (scheme base))

-- 
Alex


Re: [Chicken-users] Review my Caesar Cipher?

2014-03-12 Thread Alex Shinn
On Tue, Mar 11, 2014 at 9:20 PM, Daniel Carrera  wrote:

>
> On 11 March 2014 12:41, Alex Shinn  wrote:
>
>> Chibi has string-map in (chibi string).
>>
>> But actually, if you're aiming for R7RS support then
>> string-map is in (scheme base).  Just replace the
>> cond-expand with:
>>
>> (import (scheme base))
>>
>
> Hmm... sadly, (import (scheme base)) fails with Chicken and Gauche. I am
> also having a hard time figuring out how to print with Chibi. I tried the
> manual, and I tried (print), (printf) and (display).
>

Then change your cond-expand to:

(cond-expand
 ((or chicken gauche)  ; compatibility
  (use srfi-13))
 (else ; R7RS
  (import (scheme base) (scheme write))))

R7RS puts display in (scheme write), because it typically
falls back to write for non-char/strings, and because write
is actually a fairly large and complicated procedure not
needed by most libraries.

However, write-char, write-string and newline are all in
(scheme base) so you could just use write-string here.

You can also (import (scheme r5rs)) to get all of the R5RS
bindings except transcript-on/off.
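Putting the pieces from this thread together, a pure R7RS version might look like the sketch below. It assumes an implementation that supports (scheme base) directly (for example Chibi) and uses write-string so that (scheme write) is not needed.

```scheme
(import (scheme base))

(define key 13)

(define (caesar char)
  (define A (char->integer #\A))
  (define Z (char->integer #\Z))
  (define a (char->integer #\a))
  (define z (char->integer #\z))
  (define c (char->integer char))
  (integer->char
   (cond ((<= A c Z) (+ A (modulo (+ key (- c A)) 26)))
         ((<= a c z) (+ a (modulo (+ key (- c a)) 26)))
         (else c)))) ; Return other characters verbatim.

;; With key 13 this prints: Gur dhvpx oebja sbk whzcf bire gur ynml qbt.
(write-string (string-map caesar "The quick brown fox jumps over the lazy dog."))
(newline)
```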

-- 
Alex


Re: [Chicken-users] chicken 4.8.0.5 on cygwin - stty & ECHOPRT

2014-04-03 Thread Alex Shinn
On Mon, Mar 31, 2014 at 3:15 AM, Claude Marinier wrote:

>
> On Sun, 30 Mar 2014, Claude Marinier wrote:
>
>> I just built chicken 4.8.0.5 on MS Windows Vista with a somewhat recent
>> cygwin. The installation process went well but csi could not find parley.
>>
>> Turns out parley needs stty which did not compile because it cannot find
>> the symbol ECHOPRT. I removed the offending code from stty.scm, compiled,
>> and installed it. After that, parley installed without complaint.
>>
>
> I neglected to mention that removing references to ECHOPRT in stty is not
> a proper solution; it just allowed me to use Chicken on MS Windows. Note
> that csi command editing does not work properly; this is likely due to the
> changes I made.
>
> I apologize for previously posting an "omnibus" message. I will try to
> remember to have one topic per posting.


Yes please, I usually only skim subjects :)

I'll conditionally compile out ECHOPRT.

-- 
Alex


>
> --
> Claude Marinier
>
>
>


Re: [Chicken-users] chicken 4.8.0.5 on cygwin - stty & ECHOPRT

2014-04-03 Thread Alex Shinn
On Thu, Apr 3, 2014 at 10:02 PM, John Cowan  wrote:

> Alex Shinn scripsit:
>
> > Yes please, I usually only skim subjects :)
>
> Ah, then you missed my argument.
>
> > I'll conditionally compile out ECHOPRT.
>
> ECHOPRT should be removed unconditionally.  It is not Posix and is only
> useful on hard-copy terminals, which are non-existent these days.
>

It's still defined and documented, and I wouldn't rule
out the possibility of some terminal supporting it (there
could be people running Chicken on all kinds of crazy
ancient hardware for all I know).

When the macro itself is removed from either Linux
or BSD I'll remove it from the egg.

-- 
Alex


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2014-05-08 Thread Alex Shinn
On Fri, May 9, 2014 at 6:44 AM, Andy Bennett  wrote:

>
> Empty attributes now seem to decode to the string "()".
>

Fixed.

> During &quot; deserialisation when inside an attribute, we seem to get data
> from earlier in the stream introduced:
>

I couldn't reproduce this.  Could you check with the latest fix?


Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?

2014-05-08 Thread Alex Shinn
On Fri, May 9, 2014 at 8:26 AM, Andy Bennett  wrote:

>
> Which CHICKEN are you using? I can reproduce it with 0.5.2 on 4.9.0rc1:
>

Nevermind, I had only checked 0.5.3.  I can see it in 0.5.2.


Re: [Chicken-users] UTF-8 support in eggs

2014-07-07 Thread Alex Shinn
Hi,

On Tue, Jul 8, 2014 at 5:58 AM, Mario Domenech Goulart <
mario.goul...@gmail.com> wrote:

> Hi,
>
> I want to use some eggs and I need them to handle UTF-8.  By "handle
> UTF-8" I mean "treat strings as UTF-8", so that
>
>(string (string-ref "ç" 0)) => "ç"
>
> for example.
>
> CHICKEN's string-related procedures "accept" UTF-8 strings, but it
> doesn't mean they will correctly handle them.
>

It also doesn't necessarily mean they will mishandle them.
It might help the discussion if we had a list of eggs which
are known to break on UTF-8 inputs.

> I need UTF-8 support in some eggs that currently don't handle UTF-8.
> Assuming we won't have proper UTF-8 support in the core anytime soon,
> what's the best way to approach this?  Here are some options I thought
> (I must tell in advance none sounds good to me):
>
> 1. Have <egg> and <egg>-utf8 variants.  Or, more generally, <egg> and
>    <egg>-<variant> variants.  That would turn our coop into a disgusting
>    mess and would be a nightmare to egg authors.
>
> 2. Make eggs install <egg> and <egg>-<variant> modules.  So, you can
>    (use <egg>) or (use <egg>-<variant>) depending on your needs.
>
> 3. Manually forking and patching eggs on the user end.
>

4. Make affected eggs functors on the set of basic string operations.

The same approaches also apply to eggs needing the full
numeric tower, though with UTF-8 there's less chance of
breakage when mixing eggs which do and don't use the utf8 egg.
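To illustrate the idea behind option 4 without relying on CHICKEN's actual functor syntax, here is a closure-based approximation in portable R7RS Scheme: the code is written once against an abstract string-length operation and instantiated per implementation. The names make-pad-left and pad-left are hypothetical.

```scheme
(import (scheme base))

;; Hypothetical example: a padding routine parameterized over the
;; string-length operation, a closure-based stand-in for a functor.
(define (make-pad-left str-length)
  (lambda (s width)
    (let ((n (str-length s)))
      (if (>= n width)
          s
          (string-append (make-string (- width n) #\space) s)))))

;; One instantiation per set of string operations; a UTF-8 aware
;; variant would pass a character-counting length procedure instead.
(define pad-left (make-pad-left string-length))
```

A real functor would take the whole set of basic string operations as a module argument, but the principle is the same: the egg's code does not hard-wire which string-length or string-ref it uses.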

-- 
Alex


Re: [Chicken-users] UTF-8 support in eggs

2014-07-07 Thread Alex Shinn
On Tue, Jul 8, 2014 at 12:59 PM, John Cowan  wrote:

>
> > The same approaches also apply to eggs needing the full
> > numeric tower, though with UTF-8 there's less chance of
> > breakage when mixing eggs which do and don't use the utf8 egg.
>
> I would say that UTF-8 has *more* chance of causing undetected
> breakage, because UTF-8 strings have an interpretation as core
> strings, whereas bignums, ratnums, compnums etc. don't look
> like numbers to the core, and errors will be thrown.
>

Well, less chance of breakage, but more chance the
breakage goes undetected.  Possibly a much worse
situation, but it's possible to know a priori that there
won't be breakage, e.g. the non-utf8 egg is format,
and you're not doing any padding or truncating.  On
the other hand, it's pretty much impossible to mix
numbers and non-numbers eggs.

-- 
Alex


Re: [Chicken-users] UTF-8 support in eggs

2014-07-08 Thread Alex Shinn
On Tue, Jul 8, 2014 at 11:00 PM, Mario Domenech Goulart <
mario.goul...@gmail.com> wrote:

>
> On Tue, 8 Jul 2014 12:42:21 +0900 Alex Shinn  wrote:
> >
> > 4. Make affected eggs functors on the set of basic string operations.
>
> Wouldn't 4 be an implementation method of 2?
>

Yes.

-- 
Alex


Re: [Chicken-users] UTF-8 support in eggs

2014-07-09 Thread Alex Shinn
On Wed, Jul 9, 2014 at 7:15 AM, Oleg Kolosov  wrote:

>
> IMO just enable utf8 by default and let them break. It's not the 80's
> anymore, latin1 only software should die.


I agree that if people want "latin1 only" there should at best be
a compiler option for this which is disabled by default.  Chicken
is a community project used by people around the world.

However, I don't think that's the real problem.  The issue as I
understand is that although Chicken has both strings and
bytevectors in the core, historically and for continued simplicity
strings are abused as bytevectors in many cases.  This allows
you to use the plentiful string libraries (e.g. srfi-13 and regex)
on binary data, whereas there are few bytevector utils.  There
are also cases where you have mixed text and binary data,
and applying all appropriate conversions can be tedious.
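The distinction between text and bytes can be sketched as follows, assuming an R7RS implementation with Unicode strings (this is not how CHICKEN 4's byte-oriented core strings behave without the utf8 egg):

```scheme
(import (scheme base))

;; A one-character string containing U+00E7 (c with cedilla).
(define s (string (integer->char #xE7)))

;; Number of characters, on an implementation with Unicode strings.
(define char-count (string-length s))

;; Number of bytes in the UTF-8 encoding of the same string.
(define byte-count (bytevector-length (string->utf8 s)))
```

The encoded form is two bytes long, so code that treats a string as a byte buffer and code that treats it as text will disagree about lengths and indexes as soon as non-ASCII data appears.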

The clean way to handle this is to duplicate the useful string
APIs for bytevectors.  This could be done without code duplication
with the use of functors, though compiler assistance may be
needed for efficiency (e.g. for inlined procedures).  Even without
code duplication there would be an increase in the core library
size, though we could probably move most utilities to external
libraries (how often do you need regexps that operate on binary
data?).

If we could (through functors or in a pinch duplication) bring
the bytevector API up to speed with strings, then the next
step is to identify all such abusers of strings and move them
to bytevectors.

> We did a few tests some time ago and they showed that tackling this from
> Scheme side does not make worthy difference. Using pure C is much
> better. Perhaps utf8 egg could enjoy some yet to be written (or found in
> third party libraries) low level support from the core, so we can have
> the best of the both worlds.
>

There's already some small utf8 support in the core, and
adding the missing pieces would not take measurable space.

The bigger issue from the performance perspective is existing
idioms that use indexes, which can degrade to quadratic behavior
in the worst case no matter how much you optimize (without hacks
that slow down normal usage).  So people would have to learn to
take substrings where appropriate to avoid the start/end parameters
to all SRFI 13 functions, or we would need to deprecate SRFI 13
in favor of a cursor-oriented API (planned for R7RS).

So as you see the change is contagious.  We can update the core
efficiently and easily, but then we have to fix the string abusers,
and then we have to replace existing index-oriented APIs.

-- 
Alex


Re: [Chicken-users] UTF-8 support in eggs

2014-07-10 Thread Alex Shinn
On Thu, Jul 10, 2014 at 3:51 PM, John Cowan  wrote:

> Alex Shinn scripsit:
>
> > The clean way to handle this is to duplicate the useful string
> > APIs for bytevectors.  This could be done without code duplication
> > with the use of functors, though compiler assistance may be
> > needed for efficiency (e.g. for inlined procedures).  Even without
> > code duplication there would be an increase in the core library
> > size, though we could probably move most utilities to external
> > libraries (how often do you need regexps that operate on binary
> > data?).
>
> +1.  This is what Python 3.x does to help manage the same transition: the
> only string APIs that don't have bytevector counterparts are formatting,
> string-to-bytevector conversion, and a few others.  This API is also
> useful for dealing with binary protocols that have ASCII parts.
>

Hmmm... that's upsetting.  Python 3 is a notorious dead-end
language.

Note Chibi implements utf8 in the core how I think it should be
done, having no backwards compatibility beyond R7RS to deal
with.  It accounts for less than 6k of the library size, most of
which is for the split index/cursor API rather than the actual
utf8 processing routines.  I do run into inconveniences from
time to time, but am gradually expanding the bytevector utilities
as needed (mostly in (chibi bytevector) and (chibi io)).  When
the API is more stable it may be good to follow it.

For comparison, Chibi's ultra small and naive full numeric
tower implementation costs 53k in library size.

-- 
Alex


Re: [Chicken-users] UTF-8 support in eggs

2014-07-10 Thread Alex Shinn
On Fri, Jul 11, 2014 at 7:20 AM, Oleg Kolosov  wrote:

> On 07/09/14 09:00, Alex Shinn wrote:
> > The clean way to handle this is to duplicate the useful string
> > APIs for bytevectors.  This could be done without code duplication
> > with the use of functors, though compiler assistance may be
> > needed for efficiency (e.g. for inlined procedures).  Even without
> > code duplication there would be an increase in the core library
> > size, though we could probably move most utilities to external
> > libraries (how often do you need regexps that operate on binary
> > data?).
>
> Considering Chibi Scheme size numbers from your other mail, I hardly
> call this a huge price for the benefit received. Even for my specific
> embedded use cases.
>

Note Chibi factors out all but a few string utilities into
separate libraries, i.e. the Chibi core is smaller than the
Chicken core.  The size increase for Chicken would thus
be correspondingly larger, though still likely very small.

> > The bigger issue from the performance perspective is existing
> > idioms that use indexes, which can degrade to quadratic behavior
> > in the worst case no matter how much you optimize (without hacks
> > that slow down normal usage).  So people would have to learn to
> > take substrings where appropriate to avoid the start/end parameters
> > to all SRFI 13 functions, or we would need to deprecate SRFI 13
> > in favor of a cursor-oriented API (planned for R7RS).
>
> Do you have some examples on how to avoid performance degradation and
> not use string indexes?


Just don't use string indexes - they're not useful.  Passing
and returning cursors (byte offsets into strings) is all you need. [*]

In the more common cases, just using string ports, string-map,
or loop syntax hides the underlying iteration (a good loop macro
has potential to be faster than manual iteration).
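As a small example of iteration that never touches an index, the sketch below counts matching characters with string-for-each. It assumes an R7RS implementation; the helper name count-matching is hypothetical.

```scheme
(import (scheme base) (scheme char))

;; Hypothetical helper: count the characters of STR satisfying PRED.
;; string-for-each walks the string front to back, so the caller
;; never computes a character index.
(define (count-matching pred str)
  (let ((n 0))
    (string-for-each
     (lambda (c) (when (pred c) (set! n (+ n 1))))
     str)
    n))
```

For instance, (count-matching char-numeric? "a1b22") evaluates to 3. An equivalent loop over (string-ref str i) would cost a scan per access on a UTF-8 representation, which is where the quadratic behavior comes from.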

> How about more complex formatting like
> outputting numbers with padding? I guess these should be handled with
> something like fmt (or chibi.show).


Well, this is completely orthogonal to utf8, but probably the
most important performance hack for combinator formatters
is Chicken's define-compiler-syntax.

-- 
Alex

[*] With very few exceptions; the only example I'm aware of is
Boyer-Moore.  However, string search on utf8 bytes is faster than
on UCS-4 codepoints, so the trick is to just provide string search as
part of an API and let implementations optimize accordingly.


Re: [Chicken-users] UTF-8 support in eggs

2014-07-10 Thread Alex Shinn
On Fri, Jul 11, 2014 at 6:53 AM, Michele La Monaca <
mikele.chic...@lamonaca.net> wrote:

>
> Wouldn't this other path be simpler and more effective?
>
> 1) keep current string functions as they are (i.e. byte-oriented) and
> keep "abusers" abusing (and happy)
> 2) provide new utf8/cursor-oriented functions where needed (e.g.
> utf8-string-ref but not utf8-string-append)
>

It's the same thing - the module system can let you
rename whichever way you prefer.  I prefer strings to
be strings and bytevectors to be bytevectors.

-- 
Alex


Re: [Chicken-users] [Scheme-reports] R7RS-small draft ratified by Steering Committee

2014-07-11 Thread Alex Shinn
On Sat, Jul 12, 2014 at 10:18 AM, John Cowan  wrote:

> Sanel Zukan scripsit:
>
> > Does this mean that we are no longer allowed to write and support
> > something like:
> >
> > (define (1+x x) (+ 1 x))
> >
> > ?
>
> If you are an implementer, you certainly can provide such a procedure.
>
> If you are a user, and you care about standards conformance,
> you should choose a different identifier, as 1+x has never been a
> standards-conformant identifier under *any* version of the Scheme
> standard.  However, most Scheme implementations will accept 1+x as a
> valid identifier.
>

There are actually many implementations for which this
is an error.  Notably R6RS requires it to be an error.

Schemes typically use the name `add1' for this.

-- 
Alex


Re: [Chicken-users] [Chicken-hackers] [Scheme-reports] R7RS-small draft ratified by Steering Committee

2014-07-12 Thread Alex Shinn
On Sat, Jul 12, 2014 at 4:03 PM, John Cowan  wrote:

> Alex Shinn scripsit:
>
> > There are actually many implementations for which this is an error.
> > Notably R6RS requires it to be an error.
>
> Not too many outside R6RS, actually: only RScheme, SXM, SigScheme, UMB,
> Dfsch, Foment, Chibi.
>
> > Schemes typically use the name `add1' for this.
>
> This is only available (at least at the REPL) in Racket, Chicken, SISC,
> Sizzle, Vicare, IronScheme, RScheme, SXM.
>

That's still a lot more than provide 1+ out of the box (do any
other than Guile?).

> See <http://trac.sacrideo.us/wg/wiki/PlusOneEx> for details.
>

You tested the wrong thing - the original question was about
1+, then 1+x was brought up, not +1x.

-- 
Alex


Re: [Chicken-users] Using epsilon in test egg

2014-07-26 Thread Alex Shinn
On Sun, Jul 27, 2014 at 10:03 AM, Matt Gushee  wrote:

> Hmm, just realized something. The test egg documentation also says:
>
>   "Percentage difference allowed ..."
>
> So if the expected value is 0, then no variance is allowed?


Naturally we handle this case correctly and check the
absolute difference if one value is zero.  I've added a
clarification to the docs.

-- 
Alex


Re: [Chicken-users] Using epsilon in test egg

2014-07-26 Thread Alex Shinn
On Sun, Jul 27, 2014 at 11:02 AM, Alex Charlton 
wrote:

>
> Matt Gushee writes:
>
> > I guess my explanation wasn't entirely clear, but that's pretty much
> > what I was already doing. I showed the equality predicate I was using,
> > which tests individual values in the list with = . I think my mistake
> > was in assuming that (current-test-epsilon) would apply to = in the
> > test environment. I'm guessing now that that is not the case.
>
> Ah, yes, I hadn’t understood how you were testing. You’re right that =
> does not get redefined to use current-test-epsilon. Instead you would have
> to use your own equality predicate that incorporates it. test defines its
> approx-equal? as:
>
> (define (approx-equal? a b epsilon)
>   (cond
>    ((> (abs a) (abs b))
>     (approx-equal? b a epsilon))
>    ((zero? b)
>     (< (abs a) epsilon))
>    (else
>     (< (abs (/ (- a b) b)) epsilon))))
>
> Which you could then add to your predicate like so:
>
> (define (list= l1 l2)
>   (and (= (length l1) (length l2))
>        (let loop ((l1* l1) (l2* l2))
>          (cond
>            ((null? l1*) #t)
>            ((approx-equal? (car l1*) (car l2*) (current-test-epsilon))
>             (loop (cdr l1*) (cdr l2*)))
>            (else #f)))))
>

Easier:

(define list=
  (let ((approx=? (current-test-comparator)))
    (lambda (ls1 ls2)
      (and (= (length ls1) (length ls2))
           (every approx=? ls1 ls2)))))

then

(test-assert (list= '(0.655 0.843 0.200 1.0) (parse-color "167,215,51" #f)))

or

(define-syntax test-rgb
  (syntax-rules ()
    ((test-rgb expected expr)
     (test-rgb #f expected expr))
    ((test-rgb name expected expr)
     (parameterize ((current-test-comparator list=))
       (test name expected expr)))))

(test-rgb '(0.655 0.843 0.200 1.0) (parse-color "167,215,51" #f))

-- 
Alex


Re: [Chicken-users] Using epsilon in test egg

2014-07-27 Thread Alex Shinn
On Sun, Jul 27, 2014 at 10:03 AM, Matt Gushee  wrote:

> Hmm, just realized something. The test egg documentation also says:
>
>   "Percentage difference allowed ..."
>
> So if the expected value is 0, then no variance is allowed? If that's
> true, then epsilon isn't what I want anyway. I need to allow an
> absolute amount of variance that is independent of the values being
> tested.
>

By the way, in general you do _not_ want absolute differences.
The test egg is doing the right thing for general comparisons.
Arguably it may be better to use ULPs (Units in the Last Place),
but they're harder to get at in portable Scheme and not clearly
better.  For more than you ever wanted to know about the subject
see:

http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
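
To make the relative-versus-absolute point concrete, here is a small sketch. The helper names are hypothetical and are not the test egg's implementation; they only contrast the two kinds of tolerance:

```scheme
;; Hypothetical helpers, for illustration only.
;; Difference scaled by the larger magnitude of the two values.
(define (relative-diff a b)
  (let ((m (max (abs a) (abs b))))
    (if (zero? m)
        0
        (/ (abs (- a b)) m))))

(define (close-relative? a b eps) (< (relative-diff a b) eps))
(define (close-absolute? a b eps) (< (abs (- a b)) eps))

;; Large values that differ by one part in a million.
(close-relative? 1000000.0 1000001.0 1e-5)  ; => #t
(close-absolute? 1000000.0 1000001.0 1e-5)  ; => #f

;; Tiny values where one is double the other.
(close-relative? 1e-9 2e-9 1e-5)            ; => #f
(close-absolute? 1e-9 2e-9 1e-5)            ; => #t
```

A fixed absolute tolerance is too strict for large magnitudes and too lax for small ones, which is why a relative measure is the safer general default.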

The solution is definitely not to write your own comparison function,
but to trust that the test egg is doing the right thing.

-- 
Alex


> Anyway, I'd appreciate help in understanding how this is supposed to work
> ...
>
> On Sat, Jul 26, 2014 at 6:42 PM, Matt Gushee  wrote:
> > Hi, folks--
> >
> > I am working on an application that does a lot of floating-point
> > calculations, and I'm having trouble with the test suite. The program
> > is based on the imlib2 egg, and it does some fairly simple image
> > processing for web applications, so the numbers in question are RGBA
> > color values. Internally I am handling all the numbers as floats in
> > the range 0-1, because the blending and compositing formulas are much
> > more straightforward that way compared to using integers 0-255.
> >
> > So I have a number of functions that produce lists of numbers like these:
> >
> >   '(0.323 0.788834 0.12 0.4)
> >   '(0 0 0 1.0)
> >   '(0.67 0.4 0.562 0.0)
> >
> > And given the above, a moderate degree of imprecision in the results
> > is perfectly acceptable - I haven't yet confirmed this with actual
> > images, but I'm thinking as much as 0.5% variance should work fine
> > (basically, as long as the colors in the generated images look as
> > expected to a casual observer, the result should be acceptable). So of
> > course I want the test results to reflect this.
> >
> > I am using the following procedure to compare number lists:
> >
> >   (define (list= l1 l2)
> >     (and (= (length l1) (length l2))
> >          (let loop ((l1* l1) (l2* l2))
> >            (cond
> >              ((null? l1*) #t)
> >              ((= (car l1*) (car l2*)) (loop (cdr l1*) (cdr l2*)))
> >              (else #f)))))
> >
> > And for the tests that use this predicate, I set
> >
> >   (current-test-epsilon 0.005)
> >
> > [or 0.002 or 0.001 - it doesn't seem to make any difference]
> >
> > But I'm finding that certain tests still fail unexpectedly, e.g.
> >
> >  2.03.10: (parse-color "167,215,51" #f) => '(0.655 0.843 0.200 1.0)  [ FAIL]
> >   expected (0.655 0.843 0.2 1.0)
> >   but got (0.654901960784314 0.843137254901961 0.2 1.0)
> >
> > But:
> >
> >   (/ 0.654901960784314 0.655) => 0.999850321808113
> >   (/ 0.843137254901961 0.843) => 1.0001628172028
> >
> > So it would appear that the epsilon value does not apply to these tests.
> >
> > I can certainly define a custom equality predicate that will do what I
> > need, but this is bugging me. I guess I don't really understand how
> > epsilon is supposed to work. The test egg documentation says that it
> > applies to 'inexact comparisons', but I can't find a definition of
> > 'inexact comparison'. I have also read that '=' may be unreliable for
> > inexact numbers, but I don't know what else to use. Perhaps 'fp=' from
> > the Chicken library? Then I would have to ensure that all numbers are
> > expressed as floats, whereas currently my code has a number of cases
> > where 1 and 0 are expressed as integers.
> >
> > So ... do I understand the problem correctly? Any recommendations?
> >
> > Matt Gushee
>


Re: [Chicken-users] Using epsilon in test egg

2014-07-28 Thread Alex Shinn
On Mon, Jul 28, 2014 at 9:30 PM, John Cowan  wrote:

> Alex Shinn scripsit:
>
> > The solution is definitely not to write your own comparison function,
> > but to trust that the test egg is doing the right thing.
>
> It isn't, though, not quite.  What it needs to do is not a dichotomy of
> "if inexact, use epsilon, otherwise use `equal?`" but rather to have
> a version of `equal?` that uses epsilon when it comes to a float.
> That way comparisons against list or vector structure that contains
> floats (as in the OP's case) will work correctly.
>

I meant the right thing wrt comparing two inexacts, as
opposed to trying to come up with your own inexact=? logic.

It's easy to make it handle nested pairs and vectors correctly,
would require lolevel hackery to handle records, and in general
can't support ffi struct types.  So at some point you need to
provide your own structure comparison, and I chose to make
the rule simple:

  If you explicitly expect a single inexact value, assume
  the result should also be inexact and approximately equal.
  Otherwise use equal?.

If people think it's useful I'd consider walking pairs and vectors.
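
For illustration, walking pairs and vectors might look something like the sketch below. The names approx=? and equal-approx? are hypothetical and are not the test egg's API; records and FFI structs still fall through to equal?:

```scheme
;; Hypothetical relative comparison of two real numbers.
(define (approx=? a b eps)
  (let ((m (max (abs a) (abs b))))
    (or (zero? m)
        (< (/ (abs (- a b)) m) eps))))

;; Like equal?, but compares with a tolerance whenever either
;; number is inexact, recursing into pairs and vectors.
(define (equal-approx? a b eps)
  (cond ((and (number? a) (number? b)
              (or (inexact? a) (inexact? b)))
         (approx=? a b eps))
        ((and (pair? a) (pair? b))
         (and (equal-approx? (car a) (car b) eps)
              (equal-approx? (cdr a) (cdr b) eps)))
        ((and (vector? a) (vector? b))
         (and (= (vector-length a) (vector-length b))
              (let loop ((i 0))
                (or (= i (vector-length a))
                    (and (equal-approx? (vector-ref a i)
                                        (vector-ref b i)
                                        eps)
                         (loop (+ i 1)))))))
        (else (equal? a b))))

(equal-approx? '(0.655 0.843 0.2 1.0)
               '(0.654901960784314 0.843137254901961 0.2 1.0)
               0.005)  ; => #t
```

Exact numbers still compare with equal?, so a list mixing exact and inexact values only gets a tolerance where an inexact value is involved.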

Regardless, I'll add a utility to make defining tests with your
own comparator easier, and explicitly export test-approx-equal?
so you don't have to capture the initial test comparator.

-- 
Alex


Re: [Chicken-users] Using epsilon in test egg

2014-07-28 Thread Alex Shinn
On Mon, Jul 28, 2014 at 10:09 PM, John Cowan  wrote:

> Alex Shinn scripsit:
>
> > If people think it's useful I'd consider walking pairs and vectors.
>
> They are the most important cases, because the expected value is
> expressed as a literal value, and that can only be a list or vector.


Actually, the expected value can be anything.  Literals
may be the more common case though.

-- 
Alex


Re: [Chicken-users] Using epsilon in test egg

2014-07-30 Thread Alex Shinn
On Wed, Jul 30, 2014 at 9:33 PM, John Cowan  wrote:

> Peter Bex scripsit:
>
> > While you're looking at that, could you also take a look at this ticket?
> > https://bugs.call-cc.org/ticket/935
>

That was from 22 months ago... I don't remember
exactly when it was fixed but it works now.  Closed
the bug.

> In addition, the page on comparing floats that you pointed to now has a
> replacement: <
> http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
> >.
>

Heh, I was actually reading that page, linked from
the previous page, but it gets put inside a frame so
copying the location failed...

-- 
Alex


Re: [Chicken-users] irregex and callbacks

2014-10-01 Thread Alex Shinn
On Thu, Oct 2, 2014 at 8:22 AM, Andy Bennett  wrote:

> Hi,
>
> I am trying to use the browscap.org database to do HTTP User Agent
> Classification.
>
> This database consists of a (large) number of regexes and data about the
> browser should the user agent string match that regex.
>
> What I want to do is compile all the regexes together and be able to add
> annotations such that I can match a UA string against this regex and get
> back an idea of which pattern matched so that I can look up the
> appropriate data.
>
> i.e. I have a data structure keyed by "pattern" and I want to my input
> to be something that matches that pattern rather than the pattern itself.
>
> It seems that for this I need "Callbacks" but I don't really need full
> callback support: I don't necessarily need to call an actual procedure
> and I don't need to replace anything: I'm not doing a search/replace,
> just a match. "All" I really need is to be able to annotate the FSM node
> that matched with a little bit of data that I can get back.
>


You could use submatch info and check which submatch matched.
This would keep the matching as a single regexp, but you'd then
need a linear scan to see which submatch succeeded.

(define (irregex-merge-vector vec)
  (irregex `(or ,@(map (lambda (x) `(=> alt ,x)) (vector->list vec)))))

(define ua-vec ...)
(define all-ua-rx (irregex-merge-vector ua-vec))

(define (maybe-match-ua ua)
  (cond
    ((irregex-match all-ua-rx ua)
     => (lambda (m)
          (vector-ref ua-vec
                      (irregex-match-numeric-index 'match-ua m '(alt)))))
    (else
      #f)))

although I believe irregex-match-numeric-index is not exported.
It's worth having a utility for this idiom.

-- 
Alex


>
> Is this something that would be easy to add to irregex or can anyone
> suggest any other alternative implementations that I might consider?
>
>
> The PHP library that uses this browscap database (apparently) just does
> a linear search by trying to match each regex in turn but I'd rather
> keep that approach as a last resort.
>
>
>
> Thanks for your help and any tips you can offer.
>
>
>
> Regards,
> @ndy
>
> --
> andy...@ashurst.eu.org
> http://www.ashurst.eu.org/
> 0x7EBA75FF
>
>


Re: [Chicken-users] irregex and callbacks

2014-10-02 Thread Alex Shinn
On Thu, Oct 2, 2014 at 10:05 PM, Peter Bex  wrote:

> On Thu, Oct 02, 2014 at 01:47:15PM +0100, Andy Bennett wrote:
> > Hi,
> >
> > > You could use submatch info and check which submatch matched.
> > > This would keep the matching as a single regexp, but you'd then
> > > need a linear scan to see which submatch succeeded.
> >
> > Thanks Alex!
> >
> > I'm trying to avoid the linear scan as there are several tens of
> > thousand entries in the database. How expensive do you think it would be?
>
> My guess is that it will fall back to a backtracking parser, as this would
> certainly exceed the DFA compiler's size limit.  And that's going to be
> extremely slow!
>

You could force DFA compilation, assuming the patterns
don't use any features that require backtracking, and assuming
there's no exponential explosion of states.  Compilation would
still be slow but execution would be fast.  You'd require some
internal help (checking tags in the final state) to quickly check
which pattern matched though.

What are the patterns like?  A specialized solution might be
better here.

-- 
Alex


Re: [Chicken-users] [Chicken-hackers] CHICKEN in production

2014-10-05 Thread Alex Shinn
On Fri, Oct 3, 2014 at 6:19 AM, r  wrote:

>
> Whole logic in Chicken with ffi to platform specific HAL C libraries, our
> audio/video player library uses CPS control-flow to manage complex hardware
> accelerators state so it's also pretty schemish even if written in C.
>
> GUI based on the Immediate Mode concept and updating on every frame,
> drawing function takes s-expression layout and makes straightforward
> translation into graphics accelerator primitives.
>
> https://vimeo.com/107857415
>

That's awesome.  классно!

-- 
Alex


Re: [Chicken-users] SRFI-99 - What is a variant type?

2014-12-14 Thread Alex Shinn
This appears to be a chicken-specific extension:

http://www.chust.org/fossils/srfi-99/wiki?name=variant+types

I'm not sure I understand that description, but it appears to be something
like a union type?

-- 
Alex

On Mon, Dec 15, 2014 at 12:12 PM, Bahman Movaqar  wrote:
>
> Reading the docs on SRFI-99 [1], I need some help understanding what is
> a "variant type". Would someone please pass me a relevant link to read?
>
> --
> Bahman Movaqar
>
> http://BahmanM.com - https://twitter.com/bahman__m
> https://github.com/bahmanm - https://gist.github.com/bahmanm
> PGP Key ID: 0x6AB5BD68 (keyserver2.pgp.com)
>
>
>


Re: [Chicken-users] Installing "combinatorics" - "cock" missing

2014-12-15 Thread Alex Shinn
On Tue, Dec 16, 2014 at 2:51 AM, Bahman Movaqar  wrote:
>
>
> Hah!  Great guess on #chicken Mario! It was my ISP...it installed smoothly
> after bypassing that stupid filter.  Thanks for the help.
>

There was a porn filter applied to all internet traffic which removed the
cock egg? Wow.


Re: [Chicken-users] Installing "combinatorics" - "cock" missing

2014-12-16 Thread Alex Shinn
On Tue, Dec 16, 2014 at 8:05 AM, Dan Leslie  wrote:
>
>  I can imagine that this is something that might be present on more than a
> few corporate networks.
>
> Perhaps it's best to simply rename the cock egg?
>

I'm tempted to write a schlong [1] egg in protest.

[1] Scheme Long arithmetic


>
> -Dan
>
>
> On 14-12-15 03:01 PM, Alex Shinn wrote:
>
>  On Tue, Dec 16, 2014 at 2:51 AM, Bahman Movaqar 
> wrote:
>>
>>
>>   Hah!  Great guess on #chicken Mario! It was my ISP...it installed
>> smoothly after bypassing that stupid filter.  Thanks for the help.
>>
>
>  There was a porn filter applied to all internet traffic which removed
> the cock egg? Wow.
>
>
>


Re: [Chicken-users] Happy Christmas

2014-12-25 Thread Alex Shinn
Merry Christmas and Happy New Year from Tokyo!

May 2015 be filled with Scheming!

-- 
Alex

On Thu, Dec 25, 2014 at 1:39 PM, Bahman Movaqar  wrote:
>
> Merry Christmas and happy new year.
>
> With best of wishes for CHICKEN'ers and their families from ancient Iran.
>
> Bahman
>
>
>  Original message 
> From: Felix Winkelmann
> Date:24/12/2014 16:10 (GMT+03:30)
> To: chicken-users@nongnu.org
> Cc: chicken-hack...@nongnu.org
> Subject: [Chicken-users] Happy Christmas
>
> Hey!
>
>
> I wish all of you a very happy christmas and a blissful new year!
>
>
> felix
>


Re: [Chicken-users] Parsing HTML, best practice with Chicken

2014-12-29 Thread Alex Shinn
On Tue, Dec 30, 2014 at 3:47 AM, mfv  wrote:

> Hello,
>
> > I somehow always manage to get it working with sxpath when I need to do
> > some web scraping, but it's somewhat painful.
>
> Thanks, I will have a look at sxpath.
>
>
> > >  Are there any packages like Python's Beautifulsoup in the Chicken
> > > arsenal?
> >
> > That sort of thing is sorely lacking.  There's a promising "zipper"
> > library written by Moritz Heidkamp, but so far it's unreleased and
> > undocumented.  If you're feeling very adventurous you could have
> > a look at it: https://bitbucket.org/DerGuteMoritz/zipper
>
> Pity. I will have a look at the BeautifulSoup source. Maybe I can
> copy/mimic some
> sort of its functionality.
>

html-parser is intended to be the parsing side of BeautifulSoup.
The idea is to do one thing well, and leave it up to other libraries
to do matching and extraction.  As Peter says, matchable can be
cumbersome here because it doesn't do unordered matching.

If you find any bugs or surprising behavior in html-parser please
let me know.

-- 
Alex


Re: [Chicken-users] about egg/pty

2015-02-21 Thread Alex Shinn
Yes, it was a quick hack, and if I recall inspired by
another egg which did something similar (maybe ssl?).

Patches for large fileno support are welcome.  As Jim
says, you'd also need to switch the select(2) handling
to use poll(2), although it's probably better to remove
the manual non-blocking support and just use
open-input/output-file* from the posix library.
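
To show why the magic constant breaks for large descriptors, here is a sketch of the general packing trick. The names and the exact encoding are hypothetical; the egg's actual C code may pack the two values differently:

```scheme
;; Hypothetical illustration of packing two fds into one integer.
(define fd-limit 1024)  ; the magic constant under discussion

(define (pack-fds master slave)
  (+ (* master fd-limit) slave))

(define (unpack-fds n)
  (cons (quotient n fd-limit) (remainder n fd-limit)))

(unpack-fds (pack-fds 5 6))     ; => (5 . 6)
(unpack-fds (pack-fds 5 1030))  ; => (6 . 6), both values are wrong
```

Once the slave fd reaches the constant, it overflows into the master's half of the integer, so neither value survives the round trip.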

-- 
Alex

On Sat, Feb 21, 2015 at 12:24 PM, Jim Ursetto  wrote:

> 1024 is the default value of FD_SETSIZE i.e. the largest fd one can use
> with select(), so I think you have the right idea.  Use of a magic constant
> might be reconsidered though.  Could also convert the code to use poll()
> like Chicken core was some time ago.
> Jim
>
>
> > On Feb 20, 2015, at 02:27, Christian Kellermann 
> wrote:
> >
> > Hi!
> >
> > Chaos Eternal  writes:
> >
> >> uses a magic number 1024 , which used to pack master fd and slave fd
> >> into one integer to return.
> >>
> >> but why 1024 and what happens when the slave fd is greater than 1024?
> >
> > I am not the author but 1024 happens to be the default maximum number of
> > open files for most systems(tm).
> >
> > And yes if that assumption is wrong it will break and do strange things.
> >
> > As I see it it's just a quick hack to get both values out of C without
> > further FFI tricks. Maybe I am wrong and Alex can explain the rationale
> > behind it, if there is any.
> >
> > Kind regards,
> >
> > Christian
> >
> > --
> > May you be peaceful, may you live in safety, may you be free from
> > suffering, and may you live with ease.
> >
> >


Re: [Chicken-users] Strange memory leak with lazy-seq

2015-02-23 Thread Alex Shinn
On Mon, Feb 23, 2015 at 9:57 PM, Kooda  wrote:

> Hi!
>
> I’ve been playing with lazy-seq for the past few days and found a very
> strange behaviour:
>
> The heap of the following program keeps growing rapidly in csi, running
> the same program after compilation seems to slow down the growth quite a
> lot but the heap isn’t constant as I was expecting it to be.
>
> Here is a test case of the problem:
>
>
> ; Start this script with `csi -:D -:hi100k -:hg101` to observe heap
> ; resizing
>
> (use lazy-seq)
>
> (define (complex-stream seq)
>   (lazy-map identity seq))
>
> ; This seems to leak:
> (lazy-each void (complex-stream (lazy-numbers)))
>
>
> ; This doesn't:
> #;(lazy-each void (lazy-map identity
>   (lazy-numbers)))
>


You may be falling foul of the issue described by SRFI 45,
which is that in all known Scheme implementations:

  (define (loop) (delay (force (loop))))
  (force (loop))

leaks memory.  In R7RS this becomes

  (define (loop) (delay-force (loop)))

which is required by the standard not to leak.
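
A terminating variant makes the idiom easier to try out. This is a minimal sketch assuming an R7RS implementation that provides (scheme lazy); count-down is a hypothetical name:

```scheme
(import (scheme base) (scheme lazy))

;; Each step returns a promise for the next step; delay-force lets
;; force iterate over the chain instead of nesting.
(define (count-down n)
  (if (= n 0)
      (delay 'done)
      (delay-force (count-down (- n 1)))))

(force (count-down 100000))  ; => done
```

Writing the recursive step as (delay (force ...)) instead would build up a chain of pending forces proportional to n.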

I'm not sure why you don't observe a leak in the
second example.

-- 
Alex


Re: [Chicken-users] Strange memory leak with lazy-seq

2015-02-24 Thread Alex Shinn
On Tue, Feb 24, 2015 at 7:17 PM, Moritz Heidkamp  wrote:

> Hi Alex,
>
> On 24 February 2015 00:13 CET, Alex Shinn wrote:
>
> > You may be falling foul of the issue described by SRFI 45,
> > which is that in all known Scheme implementations:
> >
> >   (define (loop) (delay (force (loop))))
> >   (force (loop))
> >
> > leaks memory.  In R7RS this becomes
> >
> >   (define (loop) (delay-force (loop)))
> >
> > which is required by the standard not to leak.
> >
> > I'm not sure why you don't observe a leak in the
> > second example.
>
> Kooda and I discussed this issue on IRC yesterday and in fact, the first
> version doesn't leak when compiled either (he mixed up results when
> writing this email). So either the CHICKEN compiler is the first Scheme
> implementation to not leak memory in this case or something else is
> going on :-)
>

Well, if lazy-seq doesn't actually use delay + force then it's
not an exception :)

I double checked, and for the code I wrote with delay + force,
Chicken leaks both interpreted and compiled.

-- 
Alex


Re: [Chicken-users] u8vector to numbers bignum

2015-05-28 Thread Alex Shinn
On Thu, May 28, 2015 at 3:59 PM, Peter Bex  wrote:

>
> Yeah, the numbers random implementation is shitty, which is why I
> decided to omit it from my port to CHICKEN core (CHICKEN 5's random is
> still fixnum only).


This is surprisingly hard to do well - the Chibi SRFI-27
implementation had a high bug frequency, and I still
haven't done a proper analysis of the distribution for
different ranges.

-- 
Alex


Re: [Chicken-users] utf8 failing to "deploy" and request for suggestions on how to install multiple versions of an egg.

2015-10-05 Thread Alex Shinn
On Mon, Oct 5, 2015 at 11:42 PM, Matt Welland 
wrote:

> I need to install multiple versions of my refdb and logpro eggs. My first
> thought was to take advantage of deploy but I get the error below. Other
> than hacking them to make a local install I'm out of ideas, any suggestions?
>
> chicken-install -deploy -p $PWD/refdb refdb
>
> that is failing here:
>
> changing current directory to /tmp/temp8135.38302/utf8
>   '/p/f/env/pkgs/chicken/4.9.0.1/bin/csi' -bnq -e "(require-library
> setup-api)" -e "(import setup-api)" -e "(setup-error-handling)" -e
> "(extension-name-and-version '(\"utf8\" \"3.4.1\"))" -e
> "(destination-prefix \"/tmp/refdb\")" -e "(runtime-prefix \"/tmp/refdb\")"
> -e "(deployment-mode #t)" 'utf8.setup'
> make: making utf8-lolevel.so
>   '/p/f/env/pkgs/chicken/4.9.0.1/bin/csc' -feature compiling-extension
> -deployed -fixnum-arithmetic -inline -local -s -O3 -d0 -j utf8-lolevel
> utf8-lolevel.scm
>   '/p/f/env/pkgs/chicken/4.9.0.1/bin/csc' -feature compiling-extension
> -deployed -s -O2 -d0 utf8-lolevel.import.scm
> make: making utf8.so
>   '/p/f/env/pkgs/chicken/4.9.0.1/bin/csc' -feature compiling-extension
> -deployed -fixnum-arithmetic -inline -local -s -O2 -d1 -j utf8 utf8.scm
>
> Warning: renamed identifier not imported: (utf8-string? valid-string?)
>
> Warning: exported identifier of module `utf8' has not been defined:
> valid-string?
>
> Error: module unresolved: utf8
>

I can't reproduce this (same chicken version, osx, with or without -deploy).
Could there be a conflict with an old version of utf8 in your search path?
The identifier in question is defined, though was added somewhat recently.

-- 
Alex


Re: [Chicken-users] Regex fail?

2015-11-01 Thread Alex Shinn
On Fri, Oct 30, 2015 at 10:01 PM, John Cowan  wrote:

> Peter Bex scripsit:
>
> > Note the nonl, which the manual states is equivalent to ".", but of
> > course nonl means "no newline".
>
> Dot in regular expressions has *always* meant "match any character but a
> newline".  It doesn't come up that much in Unix commands, which typically
> process their input line by line anyway.  But if you look at the Posix
> definition or the Perl one, you see that dot is indeed equivalent to
> "nonl".
> Indeed, "nonl" exists in order to have an SRE equivalent for dot.
>
> > Maybe Alex can give us some info about why this is the case?  I think this
> > may have something to do with the multi-line / single-line distinction
> > (which, to be honest, I never really understood).
>
> Multi-line and single-line mean totally different things: you can use one
> of them or both or neither.  Multi-line mode means that ^ and $ will match
> the beginning and end of a line as well as the beginning and the end of
> the string.  In non-multi-line mode, they match only the beginning and
> the end of the string.  Single-line mode means that dot matches newline;
> non-single-line mode means that it does not.
>

Yes, exactly.  The terminology "single-line" (/s) and "multi-line" (/m)
come from Perl though, and I think are confusing.  But these flags
exist only for PCRE compatibility, so I don't think it's worth changing
them.  With SREs there is no confusion: you always say explicitly
`any' or `nonl', `bol/eol' or `bos/eos'.
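
A short sketch of the difference, assuming CHICKEN 4 where (use irregex) loads the library (on CHICKEN 5 the import would presumably be (import (chicken irregex)) instead):

```scheme
(use irregex)  ; assumption: CHICKEN 4 style loading

(define text "a\nb")

;; nonl never matches a newline; any always does.
(irregex-search '(: "a" nonl "b") text)  ; => #f
(irregex-search '(: "a" any "b") text)   ; => a match object

;; bos/eos anchor to the whole string, bol/eol to line boundaries.
(irregex-search '(: bos "b" eos) text)   ; => #f
(irregex-search '(: bol "b" eol) text)   ; => a match object
```

Because each SRE names the behaviour it wants, no global flag is needed to change what a given pattern means.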

Note this is the same in Ruby regexen (which also allows a /m flag).

-- 
Alex
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] Help with columnar formatting in the fmt egg

2015-11-30 Thread Alex Shinn
On Fri, Nov 20, 2015 at 5:53 AM, Christian Kellermann 
wrote:

>
> I would like to ask for some help finding the right fmt expression to
> print entries formatted like this:
>
> 2015-11-20 foo bar baz... Some·Label   Some·Other·Label
>  -123.23-100.00
>   Yet·Another·Label
>  -23.23
>
> The hard part is obviously the last columns. Both are fed in a list of
> entries consisting of (label amount).  All positive amounts should be
> printed in the first column the negative ones in the second.
>

Your example is split across 4 lines, is this what you intended?
It's not really clear what the rule is.

The columnar/tabular formatters are oriented around formatting
single rows, so if you want to use them you'd need to first group
the data accordingly.

-- 
Alex
___
Chicken-users mailing list
Chicken-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-users


Re: [Chicken-users] Using fmt and numbers eggs together

2016-01-05 Thread Alex Shinn
On Sun, Jan 3, 2016 at 1:48 AM, Peter Bex  wrote:

> On Sat, Jan 02, 2016 at 11:40:47AM -0500, Sudarshan S Chawathe wrote:
> > I seem to get incorrect output and errors in some cases when using the
> > fmt and numbers eggs together.  A brief transcript illustrating the
> > problem is included below.  In brief:
> >
> >   * (fix 30 2/3) doesn't behave as indicated in the docs.
> >
> >   * Some large numbers cause errors.
> >
> > When not using the numbers egg, fmt's behavior seems to be as expected
> > (given Chicken's implementation of numbers without the 'numbers' egg).
> >
> > Is this a known limitation of fmt with 'numbers'?
>
> I'm afraid so.  As far as I know the only way to fix this is to add a
> hard dependency on numbers to fmt.  This may not be desirable depending
> on what you're using fmt for.
>
> This is a result of the second class treatment that extended numbers get,
> and will be fixed in CHICKEN 5, due to integration of the full numeric
> tower into core.
>

Note this is in contrast to how 'numbers' used to work, where
loading it overloaded all numeric operations globally.  The fmt
documentation still reflects this previous behavior.

-- 
Alex


[Chicken-users] [CFP] Scheme and Functional Programming Workshop 2016

2016-02-17 Thread Alex Shinn
Call For Presentations

17th Annual Scheme and Functional Programming Workshop
Nara, Japan (Co-located with ICFP 2016)
18 September 2016

http://scheme2016.snow-fort.org/



The 2016 Scheme and Functional Programming Workshop is calling for
submissions.  This year we are accepting general presentation
proposals in addition to papers.

Submissions related to Scheme, Racket, Clojure, and functional
programming are welcome and encouraged. Topics of interest include
but are not limited to:

Program-development environments, debugging, testing
Implementation (interpreters, compilers, tools, benchmarks, etc.)
Syntax, macros, hygiene
Distributed computing, concurrency, parallelism
Probabilistic computing
Interoperability with other languages, FFIs
Continuations, modules, object systems, types
Theory, formal semantics, correctness
History, evolution and standardization of Scheme
Applications, experience and industrial uses of Scheme
Education
Scheme pearls (elegant, instructive uses of Scheme)

We also welcome submissions related to dynamic or multiparadigmatic
languages and programming techniques.



Full submissions are due 24 June 2016.
Authors will be notified by 22 July 2016.
Camera-ready versions are due 15 August 2016.
Workshop is 18 September 2016.
All deadlines are 23:59 (UTC-12, "Anywhere on Earth").

Paper submissions must be in ACM proceedings format, no smaller than
9-point type (10-point type preferred). Microsoft Word and LaTeX
templates for this format are available at:
http://www.acm.org/sigs/sigplan/authorInformation.htm

Paper submissions should be in PDF and printable on US Letter, and
generally in the range of 6 to 12 pages.

Presentation submissions should include an outline of the material.
Talks are 40 minutes, including questions and answers.

More information available at: http://scheme2016.snow-fort.org/



Organizers:

Alex Shinn (General Chair)
Kathryn Gray (Program Chair)

(Apologies for duplications from cross-posting.)

-- 
Alex


[Chicken-users] Second CFP for Scheme and Functional Programming Workshop 2016

2016-05-28 Thread Alex Shinn
SECOND NOTICE
Call For Presentations

17th Annual Scheme and Functional Programming Workshop
WEBSITE: http://scheme2016.snow-fort.org/
LOCATION: Nara, Japan (Co-located with ICFP 2016)
DATE: 18 September 2016



The 2016 Scheme and Functional Programming Workshop is calling for
submissions.  This year we are accepting general presentation
proposals in addition to papers.

Submissions related to Scheme, Racket, Clojure, and functional
programming are welcome and encouraged. Topics of interest include
but are not limited to:

Program-development environments, debugging, testing
Implementation (interpreters, compilers, tools, benchmarks, etc.)
Syntax, macros, hygiene
Distributed computing, concurrency, parallelism
Probabilistic computing
Interoperability with other languages, FFIs
Continuations, modules, object systems, types
Theory, formal semantics, correctness
History, evolution and standardization of Scheme
Applications, experience and industrial uses of Scheme
Education
Scheme pearls (elegant, instructive uses of Scheme)

We also welcome submissions related to dynamic or multiparadigmatic
languages and programming techniques.



Important Dates:

24 June 2016 - Submissions deadline
22 July 2016 - Author notification
15 August 2016 - Camera-ready deadline
18 September 2016 - Workshop
All deadlines are 23:59 (UTC-12, "Anywhere on Earth").

Paper submissions must be in ACM proceedings format, no smaller than
9-point type (10-point type preferred). Microsoft Word and LaTeX
templates for this format are available at:
http://www.sigplan.org/Resources/Author/

Paper submissions should be in PDF and printable on US Letter, and
generally in the range of 6 to 12 pages.

Presentation submissions should include an outline of the material.
Talks are 40 minutes, including questions and answers.

More information available at: http://scheme2016.snow-fort.org/



Organizers:

Alex Shinn (general chair)
Kathy Gray (program chair)

(Apologies for duplications from cross-posting.)



Re: [Chicken-users] two-dimensional syntax-rules

2017-03-11 Thread Alex Shinn
It's not a bug, it's the template which is an error.  See the discussion on
the chibi list:
https://groups.google.com/d/msg/chibi-scheme/7fTzofNrPrI/Yy_aibRdBQAJ

-- 
Alex

On Fri, Mar 10, 2017 at 7:16 PM, Peter Bex  wrote:

> On Fri, Mar 10, 2017 at 11:10:35AM +0100, Sascha Ziemann wrote:
> > 2017-03-10 10:55 GMT+01:00 Peter Bex :
> > >
> > > Gauche and Racket accept this macro application, Scheme48 rejects it
> (but
> > > that's expected, because our syntax-rules is originally from Scheme48).
> > >
> >
> > But Gauche fails like Chibi. They silently ignore the last ellipsis.
>
> Oh, good point.  I didn't think to check the results for correctness :)
>
> I've added this info to a ticket: http://bugs.call-cc.org/ticket/1351
>
> Cheers,
> Peter
>


Re: [Chicken-users] is the readline egg dead?

2019-03-28 Thread Alex Shinn
There's also (chibi term edit-line) which is a pure R7RS implementation and
should work with Chicken.
I haven't put it up on snow-fort yet though.

-- 
Alex

On Sat, Mar 23, 2019 at 10:17 PM Juergen Lorenz  wrote:

> Hi all,
> there is a very easy to use and simple alternative to readline and
> consorts: rlwrap
> It's an external package and thus has the advantage to be usable not
> only with chicken, but with every program without readline support.
> Cheers
> Juergen
> --
>
> Dr. Juergen Lorenz
> Gruener Weg 27
> 29471 Gartow
>


[Chicken-users] ANN: scheme-complete.el - smart tab completion

2007-10-16 Thread Alex Shinn
scheme-complete.el is a single function that can be used with any
Emacs scheme mode.  It provides real-time, lexical-scope aware
type inferencing tab-completion for any R5RS scheme, with
extensibility for implementation-specific features (currently only
Chicken and Gauche are customized).

For example, given the text

  (string-ref (n^

where the cursor is represented by ^, typing tab (or whatever you
bind the completion function to) would know that in the default
R5RS environment the only possible completion of a procedure
returning a string and beginning with "n" is number->string and
would complete that for you automatically.
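
The idea can be illustrated with a small Scheme sketch. This is not how
scheme-complete.el is implemented (that is Emacs Lisp); the table, the
type labels and the names prefix? and complete are made up for the
example. It filters a hand-written table of R5RS procedures beginning
with "n" by the return type that the argument position requires:

```scheme
;; Hypothetical table: (procedure-name . return-type).  The type labels
;; are illustrative only, not taken from scheme-complete.el.
(define r5rs-n-procedures
  '((negative? . boolean) (newline . unspecified) (not . boolean)
    (null? . boolean) (number->string . string) (number? . boolean)
    (numerator . number)))

;; #t if STR starts with PREFIX.
(define (prefix? prefix str)
  (and (<= (string-length prefix) (string-length str))
       (string=? prefix (substring str 0 (string-length prefix)))))

;; All names in TABLE that start with PREFIX and return WANTED-TYPE.
(define (complete prefix wanted-type table)
  (cond ((null? table) '())
        ((and (eq? (cdar table) wanted-type)
              (prefix? prefix (symbol->string (caar table))))
         (cons (caar table) (complete prefix wanted-type (cdr table))))
        (else (complete prefix wanted-type (cdr table)))))

;; The first argument of string-ref must be a string:
(display (complete "n" 'string r5rs-n-procedures)) (newline)
;; prints (number->string)
```

With a single candidate left, the completion can be inserted without
asking the user to choose.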

Given

  (let ((len (string-length str)))
    (string-ref str (- ^

completing would fill in "len" as the only possible completion
since a number is required as an argument to "-" and all the
standard R5RS bindings are procedures and syntax.

Relying on this completion for known type procedures is a handy
way to avoid type errors, even before compilation.  In more
general cases it's just nice to have basic pruning, such as not
completing syntax in a non-operator position.

Currently completion inside strings does filename completion,
though this may be made more flexible in the future.  For Chicken
and Gauche, completing after "(use " will complete on all
currently installed modules.

Also includes optional eldoc-mode support, to flash a docstring
for the current function.

All parsing is done real-time (I've had no speed issues yet), and
is very careful to handle incomplete input gracefully.

Available at http://synthcode.com/emacs/scheme-complete.el, with
setup instructions at the top of the file.

-- 
Alex




Re: [Chicken-users] Test egg question

2007-11-04 Thread Alex Shinn
Hi,

On Nov 4, 2007 12:02 AM, Peter Busser <[EMAIL PROTECTED]> wrote:
>
> I'm writing a number of test cases using the test egg. I would like to use
> the test program for automated testing. Is it possible to know that one of
> the tests failed, so I can exit the program with an error value?

I was thinking of adding something like this.  In the meantime
you can hack it with something like this:

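;; Look up KEY in GROUP, treating the cdr of GROUP as an alist;
;; return DEFAULT when the key is absent.
(define (test-group-ref group key default)
  (cond ((assq key (cdr group)) => cdr)
        (else default)))

;; Evaluated here, before (test-end) is called.
(define all-tests-passed?
  (zero? (+ (or (test-group-ref (current-test-group) 'FAIL #f) 0)
            (or (test-group-ref (current-test-group) 'ERROR #f) 0))))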

(test-end)

(exit (if all-tests-passed? 0 1))

-- 
Alex




Re: [Chicken-users] Test egg question

2007-11-04 Thread Alex Shinn
> On Nov 4, 2007 12:02 AM, Peter Busser <[EMAIL PROTECTED]> wrote:
> >
> > I'm writing a number of test cases using the test egg. I would like to use
> > the test program for automated testing. Is it possible to know that one of
> > the tests failed, so I can exit the program with an error value?

OK, I just checked in a test-exit procedure:

  (test-exit [])

which exits the process with 0 if all tests have passed,
and the optional exit code (default 1) if there have been
any errors at all.
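
A minimal sketch of how this might look in a test script. The group
name and the tests themselves are made up for illustration, and (use
test) is the CHICKEN 4 form (newer releases use (import test)):

```scheme
;; Load the test egg (CHICKEN 4 syntax).
(use test)

;; A made-up group with two trivial tests.
(test-group "arithmetic"
  (test 4 (+ 2 2))
  (test "product" 6 (* 2 3)))

;; Exit with status 0 if every test passed, 1 otherwise.
(test-exit)
```

Run from a shell or a build script, the exit status of the process can
then be checked by whatever drives the automated tests.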

-- 
Alex




[Chicken-users] ANN: scheme-complete.el 0.3

2007-11-12 Thread Alex Shinn
A new release 0.3 of scheme-complete.el is available at

  http://synthcode.com/emacs/scheme-complete.el

It includes many bugfixes, and now works out of the box
in GNU Emacs 21 and 22, and XEmacs 21.

New features include smarter inference by determining
the types of variables bound in LET, as well as completion
of filename arguments (e.g. the argument to LOAD), where
the completion list includes filenames in addition to the
current lexical environment.  If a filename is completed
(or a prefix that could only belong to filenames) then the
argument will automatically be wrapped in quotes if not
already.  For example, with

  (load te^

where ^ represents the cursor position, and assuming
that uniquely expands to the file test.scm, then expanding
would result in

  (load "test.scm^"

Note: If you used SCSH inside of M-x run-scheme +
paredit-mode + scheme-complete then you could have
an interactive scheme shell which in typical cases only
requires one more keystroke (the initial paren) than other
shells like bash, and in more complex cases requires
fewer keystrokes (modulo the names of commands
which can be aliased or tab-completed anyway).  You could
also tweak cmuscheme to always insert the initial parens
for you.  Better completion would require filtering files
on regexps (e.g. only completing .scm files for LOAD).

-- 
Alex




Re: [Chicken-users] ANN: scheme-complete.el 0.3

2007-11-13 Thread Alex Shinn
On Nov 13, 2007 3:39 PM, felix winkelmann <[EMAIL PROTECTED]> wrote:
> On Nov 13, 2007 7:05 AM, Alex Shinn <[EMAIL PROTECTED]> wrote:
> > A new release 0.3 of scheme-complete.el is available at
> >
> >   http://synthcode.com/emacs/scheme-complete.el
> >
>
> Hm... Server down?

It's back now.  Try

  http://synthcode.com/emacs/scheme-complete.el.gz

for a faster download.

-- 
Alex




Re: [Chicken-users] ANN: scheme-complete.el 0.3

2007-11-13 Thread Alex Shinn
On Nov 14, 2007 3:51 PM, felix winkelmann <[EMAIL PROTECTED]> wrote:
>
> Thanks! Is there a way to use it even if chicken is not installed
> in /usr/local?

You can use it, it just won't know about Chicken's non-standard
modules.

I introduced a bug at the last minute, I'll make a new release shortly
with a fix, and which also searches some common paths for Chicken
and Gauche.  I'd prefer to automate things for now, though I may add
some defcustoms later.

-- 
Alex




Re: [Chicken-users] ANN: scheme-complete.el 0.3

2007-11-13 Thread Alex Shinn
On Nov 14, 2007 4:05 PM, Ivan Shmakov <[EMAIL PROTECTED]> wrote:
>
> Could the server be configured to send
>
> Content-Type: application/emacs-lisp
> Content-Transfer-Encoding: gzip
>
> instead of
>
> Content-Type: application/x-gzip
>
> so that browsers will allow to see the referenced file contents
> without saving?

I may be mistaken, but isn't that for the case when the
server compresses the .el file on the fly?  If you do that
for a file named .el.gz, then wouldn't the browser save the
decompressed data still under the name .el.gz?

The server doesn't yet support automatic compression,
though I've been meaning to rewrite it for ages now.

-- 
Alex




Re: [Chicken-users] ANN: scheme-complete.el 0.3

2007-11-14 Thread Alex Shinn
On Nov 14, 2007 3:51 PM, felix winkelmann <[EMAIL PROTECTED]> wrote:
>
> Thanks! Is there a way to use it even if chicken is not installed
> in /usr/local?

Fixed and uploaded as 0.4.  It also supports the
CHICKEN_REPOSITORY env var, and for Gauche
now supports the GAUCHE_LOAD_PATH.

There are still a lot of limitations (and always will be,
since it's dealing with potentially incomplete code),
but it's still quite nice, and at the current version I
don't think it will break or do anything unexpected to
you.  And even without any module system support
it knows R5RS plus the top-level definitions of the
file you're working in plus the current lexical scope.

-- 
Alex




[Chicken-users] chicken lottery!

2007-12-11 Thread Alex Shinn
Ladies, gentlemen and other, we've reached 350 eggs,
which means some lucky contributor is going to be the
proud owner of their very own Chicken T-shirt!  This
time we're inviting you all to witness the choosing live.

At midnight UTC this Friday the 14th (7pm Thursday
night US Eastern Standard time, 4pm Pacific) anyone
interested should join #nethack on irc.freenode.net,
where a resident bot will be asked to make a random
die roll to determine the winner.  Based on the current
contributors from

  http://chicken.wiki.br/the-chicken-lottery

we've made the following table

   Roll  #eggs  Name
  -----  -----  ----
   1-23     23  Ivan Raikov
  24-32      9  Arto Bendiken
  33-40      8  Tony Sidaway
  41-43      3  Will Farr
  44-46      3  Ben Kurtz
  47-49      3  Vo minh Thu
  50-52      3  Shawn Wagner
  53-54      2  Taylor Campbell
  55-56      2  Naruto Canada
  57-58      2  Mario Domenech Goulart
  59-60      2  Ivan Shmakov
  61-62      2  Alaric Blagrave Snell-Pym
     63      1  Peter Bex
     64      1  Terrence Brannon
     65      1  Hans Bulfone
     66      1  Certainty
     67      1  Thomas Chust
     68      1  Sven Hartrumpf
     69      1  Tony and Martin Sidaway
     70      1  Alex Sandro Queiroz e Silva
     71      1  Jean-Philippe Theberge
     72      1  Zbigniew
     73         *** re-roll twice! ***

Duplicates (including 73) are ignored, so at most
2 people can get a shirt.  In the event that 69 is
rolled, the joint developers Tony and Martin Sidaway
will be put into a steel cage match to fight to the
death for the right to claim the T-shirt.  Or possibly
a chicken-wire cage.

-- 
Alex




Re: [Chicken-users] Re: UTF-8 support

2007-12-13 Thread Alex Shinn
On Dec 13, 2007 11:32 PM, Tobia Conforto <[EMAIL PROTECTED]> wrote:
>
> Can you (or anybody else) give an example of different behaviour with
> the option turned on and off?  I did a couple of tests and can't see any
> difference, but I admit I have yet to look at the source code.

The only two differences are

  1) . matches a full utf-8 character with the option on,
  whereas with the option off it would match one byte
  of a utf-8 char (thus in Zbigniew's example you get
  the two bytes \316 and \273 instead of the λ)

  2) character classes treat the characters as utf-8 encoded
  with the option on, and as a sequence of bytes with it off

1 is surprisingly rare - you usually use .* or .+, which turn out
to be identical with the option on or off.  2 is only common in
non-English linguistic applications.
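
To make difference 1 concrete, here is a toy model in plain Scheme of
what a single . consumes in each mode.  It is only a sketch, not the
regex library's code: the procedure names are made up, the string is
represented as a list of byte values, and the input is assumed to be
well-formed UTF-8.

```scheme
;; Number of bytes in the UTF-8 sequence starting with lead byte B
;; (assumes B really is a lead byte of well-formed UTF-8).
(define (utf8-char-length b)
  (cond ((< b #x80) 1)
        ((< b #xE0) 2)
        ((< b #xF0) 3)
        (else 4)))

;; First N elements of LST (fewer if LST is shorter).
(define (take-up-to lst n)
  (if (or (zero? n) (null? lst))
      '()
      (cons (car lst) (take-up-to (cdr lst) (- n 1)))))

;; The bytes a single `.` would consume from the front of BYTES.
(define (dot-match bytes utf8?)
  (cond ((null? bytes) '())
        (utf8? (take-up-to bytes (utf8-char-length (car bytes))))
        (else (list (car bytes)))))

;; The two bytes \316 \273 from the example above (octal literals).
(define lambda-bytes '(#o316 #o273))

(display (dot-match lambda-bytes #f)) (newline)  ; prints (206)
(display (dot-match lambda-bytes #t)) (newline)  ; prints (206 187)
```

Repeating the match until the input is exhausted consumes the same
bytes in either mode, which is why .* and .+ behave identically.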

-- 
Alex



