Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Shawn Rutledge scripsit: > But you would want the usual string operations to work with either > kind of string, right? Indeed. > It could follow from the general principle of separating metadata from > data: Put the encoding in the extended attributes of the file, or > resource fork if you've

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Shawn Rutledge
On Tue, Mar 18, 2008 at 4:50 PM, John Cowan <[EMAIL PROTECTED]> wrote: > I'm not arguing that point. I'm arguing that there should be two > different kinds of strings, one of which is UTF-8 and one of which > contains single-byte characters in an unspecified ASCII-compatible > encoding, *and t

Re: [Chicken-users] Can we update http://chicken.wiki.br/releases/ please?

2008-03-18 Thread Ivan Raikov
Hello, We were trying to make sure that chicken-setup works in MinGW before making the next release. Today and tomorrow I will be running salmonella with the latest modifications to chicken-setup, and if that works, I will make a new stable release. -Ivan Robin Lee Powell <[EMAIL PROTEC

[Chicken-users] shootout benchmark: mandelbrot

2008-03-18 Thread Tobia Conforto
Howdy I took a shot at the shootout (ha!) starting with an easy one, Mandelbrot. It's basically pure loops of flonum operations. By using the unsafe number operations and by experimenting with a few loop layouts, I managed to reduce the Chicken score from 35 times gcc (previous submissi

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Shawn Rutledge scripsit: > That is a huge advantage. I think unless there are some > insurmountable gotchas, or it causes major efficiency problems, there > are some good arguments for using UTF-8 for strings in Chicken. I'm not arguing that point. I'm arguing that there should be two differen

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Shawn Rutledge
On Tue, Mar 18, 2008 at 1:53 PM, John Cowan <[EMAIL PROTECTED]> wrote: > > Let's see... ASCII is valid UTF-8, so all ASCII external > > representations wouldn't need any encoding or decoding work. That is a huge advantage. I think unless there are some insurmountable gotchas, or it causes majo

Re: [chicken-users] silex GPL-2 licensed?

2008-03-18 Thread Leonardo Valeri Manera
On 18/03/2008, John Cowan <[EMAIL PROTECTED]> wrote: > The code does not contain any license at all, and the manual only says that > the author hopes it will be helpful for many Scheme programmers. So this > egg is "use at your own risk". > > Someone should contact the author. I'll do that now

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Tobia Conforto scripsit: > Let's see... ASCII is valid UTF-8, so all ASCII external > representations wouldn't need any encoding or decoding work. True. However, pure ASCII is less common than people believe, as indicated by the 59K Google hits for "8-bit ASCII". > Most recent formats and pr
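The point that ASCII is already valid UTF-8 can be checked directly. The following is only a small illustration in Python (not Chicken, and not code from the thread); the sample strings are made up for the example.

```python
# Illustrative sketch: pure ASCII bytes are already valid UTF-8.
ascii_bytes = b"(define (square x) (* x x))"

# Decoding as ASCII and as UTF-8 yields the same text.
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# Re-encoding as UTF-8 reproduces the original bytes unchanged,
# so no transcoding work is needed for pure-ASCII data.
round_trip = as_utf8.encode("utf-8")

# A lone byte above 0x7F (here 0xE9, Latin-1 e-acute) is not valid UTF-8,
# which is the "8-bit ASCII" case the reply warns about.
latin1_bytes = b"caf\xe9"
try:
    latin1_bytes.decode("utf-8")
    latin1_is_utf8 = True
except UnicodeDecodeError:
    latin1_is_utf8 = False
```

So the shortcut holds only as long as every byte stays below 0x80; data in an 8-bit encoding still needs an explicit conversion.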

Re: [chicken-users] silex GPL-2 licensed?

2008-03-18 Thread John Cowan
Leonardo Valeri Manera scripsit: > Why is silex GPL-2 now? The code does not contain any license at all, and the manual only says that the author hopes it will be helpful for many Scheme programmers. So this egg is "use at your own risk". Someone should contact the author. -- By Elbereth and

[chicken-users] silex GPL-2 licensed?

2008-03-18 Thread Leonardo Valeri Manera
Why is silex GPL-2 now? This screws it up for everyone making non-GPL-2 stuff that uses easyffi or one of the many eggs that depend on it... *slitwrist Leo ___ Chicken-users mailing list Chicken-users@nongnu.org http://lists.nongnu.org/mailman/listin

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Tobia Conforto
John Cowan wrote: If we all lived in a UTF-8/LF world exclusively, then that would be fine. As it is, many of us are not in that world at all, and few of us are in it exclusively. So in practice it is necessary to convert between internal and external encodings anyhow, which involves copy

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Tobia Conforto scripsit: > This discussion has convinced me that from a *practical* point of > view, it makes a lot of sense to use the same underlying object for > both kinds of operation, instead of copying over the contents every > time you want to switch between the two views (as I suppo

[Chicken-users] shootout benchmark: ring

2008-03-18 Thread Graham Fawcett
Hi folks, Last night I worked on a submission for the 'ring' benchmark in the Shootout: http://shootout.alioth.debian.org/sandbox/benchmark.php?test=threadring&lang=all After a couple of different approaches, the one that worked best by far was to use the 'mailbox' egg, which we cannot do in the Sh

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Tobia Conforto
John Cowan wrote: The difference between restricted and unrestricted strings may not be as large as the distinction between pairs and fixnums, but it's the same *kind* of difference. I beg to differ. A pair is no fixnum, and vice-versa. They're two disjoint domains. On the other hand, an

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alex Shinn
> "Graham" == Graham Fawcett <[EMAIL PROTECTED]> writes: Graham> On Tue, Mar 18, 2008 at 12:22 PM, John Cowan <[EMAIL PROTECTED]> wrote: >> It wouldn't solve the data-punning problem. As long >> as the same object can be seen one way by one module >> and another way by anothe

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Graham Fawcett scripsit: > Just curious, whence the 'restricted' terminology? I would have > thought 'utf8 and raw/byte strings' since that's the practical > implication. Restricted strings are restricted to holding characters between #\x0 and #\xFF. Unrestricted strings can hold any character b
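As a rough illustration of the "restricted" idea (characters limited to the range #\x0 through #\xFF), here is a Python sketch. The helper name `is_restricted` is hypothetical, and using Latin-1 to get one byte per character is an assumption made for the example, not something the message specifies.

```python
def is_restricted(s: str) -> bool:
    """Return True if every character has a code point in 0x00..0xFF."""
    return all(ord(ch) <= 0xFF for ch in s)

# "cafe" with e-acute (U+00E9) fits in the restricted range.
cafe = "caf\u00e9"
# Greek small lambda (U+03BB) does not.
lam = "\u03bb"

# A restricted string can be stored with exactly one byte per character,
# for instance by encoding it as Latin-1 (an assumption for this sketch).
cafe_bytes = cafe.encode("latin-1")
```

Under this reading, an unrestricted string is simply one for which `is_restricted` may return False.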

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Graham Fawcett
On Tue, Mar 18, 2008 at 12:22 PM, John Cowan <[EMAIL PROTECTED]> wrote: > It wouldn't solve the data-punning problem. As long as the same object > can be seen one way by one module and another way by another, problems > will continue to be endemic. To fix that, we need two run-time types, > w

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Alex Shinn scripsit: > Why do we need this? That's not rhetorical, I'd like to > hear of any use cases where you think a problem could arise. We need it because Scheme is a strongly (dynamically) typed language. If FOO passes BAR a pair, and BAR is expecting an exact integer, the programmer exp

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Kon Lovett
On Mar 18, 2008, at 8:46 AM, John Cowan wrote: Kon Lovett scripsit: In my usage "byte-string" means "octet-string". See the "levenshtein" egg. I mean a "blob". I'd call that a byte vector. "blob" is incorrect in the "levenshtein" egg context. However, not incorrect in a previous post.

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alex Shinn
> "John" == John Cowan <[EMAIL PROTECTED]> writes: John> felix winkelmann scripsit: >> A real module system would solve all these problems >> cleanly. John> It wouldn't solve the data-punning problem. As John> long as the same object can be seen one way by one John> m

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
felix winkelmann scripsit: > A real module system would solve all these problems cleanly. It wouldn't solve the data-punning problem. As long as the same object can be seen one way by one module and another way by another, problems will continue to be endemic. To fix that, we need two run-time

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Tobia Conforto scripsit: > So they're ditching {"byte", u"unicode"} strings in favor of {b"byte", > "unicode"} ones? What are they ditching exactly? It seems to me > they're just switching the default. Maybe so, but the word "just" is probably not appropriate, as switching defaults has a hu

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Kon Lovett scripsit: > In my usage "byte-string" means "octet-string". See the "levenshtein" > egg. I mean a "blob". I'd call that a byte vector. The issue is, when you get a component, what comes out, a character or an exact integer? -- Work hard, John C
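The question of what "comes out" when you fetch a component is the same distinction Python 3 draws between `str` and `bytes`. This is only an analogy in Python, not a statement about Chicken's API.

```python
text = "abc"   # a character string
raw = b"abc"   # a byte vector

first_char = text[0]   # indexing a str yields a one-character str
first_byte = raw[0]    # indexing bytes yields an exact integer

# The two views are related by an explicit conversion, not by punning.
same_value = ord(first_char) == first_byte
```

Here the character string plays the role of a byte-string in the thread's sense, and `bytes` plays the role of a byte vector or blob.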

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Kon Lovett
On Mar 18, 2008, at 7:01 AM, John Cowan wrote: Graham Fawcett scripsit: So, a byte string would simply be a string with a null auxiliary vector. That doesn't work. A byte-string is not a sequence of characters from the ASCII repertoire, it's a sequence of characters from the repertoire

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread felix winkelmann
On Tue, Mar 18, 2008 at 12:05 PM, Alex Shinn <[EMAIL PROTECTED]> wrote: > > That's just changing the procedures used to access strings. > Changing the fundamental string representation is a more > substantial change by an order of magnitude, involving > changes to the core compiler and the FFI

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Tobia Conforto
John Cowan wrote: Tobia Conforto scripsit: This is more or less how other languages, such as Python, solved the issue. Two kinds of strings, byte and unicode, and overloading a few string operations to have a slightly different meaning when called on either, computing byte length vs. cha

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Graham Fawcett
On Tue, Mar 18, 2008 at 10:01 AM, John Cowan <[EMAIL PROTECTED]> wrote: > Graham Fawcett scripsit: > > > So, a byte string would simply be a string with a null auxiliary vector. > > That doesn't work. A byte-string is not a sequence of characters from > the ASCII repertoire, it's a sequence of

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Tobia Conforto scripsit: > This is more or less how other languages, such as Python, solved the > issue. Two kinds of strings, byte and unicode, and overloading a few > string operations to have a slightly different meaning when called on > either, computing byte length vs. character length

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread John Cowan
Graham Fawcett scripsit: > So, a byte string would simply be a string with a null auxiliary vector. That doesn't work. A byte-string is not a sequence of characters from the ASCII repertoire, it's a sequence of characters from the repertoire {ASCII set, characters numbered 129 through 255 with

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Leonardo Valeri Manera
On 18/03/2008, Graham Fawcett <[EMAIL PROTECTED]> wrote: > For what it's worth, I also think that GMP should be in the core, and > that no one, nowhere should be allowed to publish an egg with a > toplevel procedure named (format) in it. Mysterious toplevel > interactions between indirect depen

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Graham Fawcett
On Tue, Mar 18, 2008 at 7:05 AM, Alex Shinn <[EMAIL PROTECTED]> wrote: > > "Tobia" == Tobia Conforto <[EMAIL PROTECTED]> writes: > > > Tobia> Graham Fawcett wrote: > >> Here's another thought. It seems to me that if we > >> were to represent strings as composite values, e.g. a >

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alaric Snell-Pym
The entire problem revolves around adding Unicode support as an option, without modifying the core. *If* we allow ourselves to modify the core, then there is no problem at all, and we can just copy the utf8 egg code over the existing string procedures, and add in some procedures for byte-level ac

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alex Shinn
> "Peter" == Peter Bex <[EMAIL PROTECTED]> writes: Peter> On Tue, Mar 18, 2008 at 11:41:08AM +0900, Alex Shinn wrote: >> > "Kon" == Kon Lovett <[EMAIL PROTECTED]> >> writes: >> Kon> Summary: I want a byte-string API. I want string Kon> integrations. I want global U

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alaric Snell-Pym
On 18 Mar 2008, at 2:29 am, Alex Shinn wrote: The problems we're having aren't even about string representation though, they're about the semantics of the string operations themselves. Are the string indices byte positions or character positions? Different libraries disagree. IMHO Java doe
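To make the byte-position versus character-position ambiguity concrete, here is a small Python sketch. The helper `char_index_to_byte_offset` is hypothetical and written only for this illustration.

```python
def char_index_to_byte_offset(s: str, index: int) -> int:
    """Byte offset, in the UTF-8 encoding of s, of the character at index."""
    return len(s[:index].encode("utf-8"))

# "naive" with i-diaeresis (U+00EF), a space, and Greek lambda (U+03BB).
s = "na\u00efve \u03bb"
encoded = s.encode("utf-8")

char_len = len(s)        # counts characters
byte_len = len(encoded)  # counts bytes; the two non-ASCII chars take 2 each

# The character 'v' sits at character index 3 but byte offset 4,
# because the preceding i-diaeresis occupies two bytes.
v_offset = char_index_to_byte_offset(s, 3)
```

Two libraries that disagree on which of these numbers an index means will silently slice the same string at different places.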

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Alex Shinn
> "Tobia" == Tobia Conforto <[EMAIL PROTECTED]> writes: Tobia> Graham Fawcett wrote: >> Here's another thought. It seems to me that if we >> were to represent strings as composite values, e.g. a >> two-slot record whose first slot is an encoding (the >> symbol 'utf8, or #f

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread F. Wittenberger
Am Dienstag, den 18.03.2008, 09:38 +0100 schrieb Peter Bex: > On Tue, Mar 18, 2008 at 11:41:08AM +0900, Alex Shinn wrote: > > > "Kon" == Kon Lovett <[EMAIL PROTECTED]> writes: > > > > Kon> Summary: I want a byte-string API. I want string > > Kon> integrations. I want global UTF8 strin

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Tobia Conforto
Graham Fawcett wrote: Here's another thought. It seems to me that if we were to represent strings as composite values, e.g. a two-slot record whose first slot is an encoding (the symbol 'utf8, or #f for 'byte' encoding), and whose second slot contains the string data, then the various string
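One way to picture the two-slot record is sketched below in Python. The quoted message is cut off, so the dispatch behaviour shown here is an assumption about where the idea leads; the names `TaggedString`, `tagged_length`, and `tagged_ref` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedString:
    # First slot: "utf8", or None for plain byte encoding (the #f case).
    encoding: Optional[str]
    # Second slot: the underlying string data.
    data: bytes


def tagged_length(s: TaggedString) -> int:
    """Length in bytes for byte strings, in characters for utf8 strings."""
    if s.encoding is None:
        return len(s.data)
    if s.encoding == "utf8":
        return len(s.data.decode("utf-8"))
    raise ValueError(f"unknown encoding: {s.encoding!r}")


def tagged_ref(s: TaggedString, index: int) -> str:
    """Character at index, interpreted according to the encoding slot."""
    if s.encoding is None:
        return chr(s.data[index])  # each byte is one character in 0..255
    if s.encoding == "utf8":
        return s.data.decode("utf-8")[index]
    raise ValueError(f"unknown encoding: {s.encoding!r}")


payload = "\u03bbx".encode("utf-8")  # lambda followed by "x": 3 bytes
as_bytes = TaggedString(None, payload)
as_utf8 = TaggedString("utf8", payload)
```

The same payload then reports length 3 through the byte view and length 2 through the utf8 view, without copying the data between the two.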

Re: [Chicken-users] ditching syntax-case modules for the utf8 egg

2008-03-18 Thread Peter Bex
On Tue, Mar 18, 2008 at 11:41:08AM +0900, Alex Shinn wrote: > > "Kon" == Kon Lovett <[EMAIL PROTECTED]> writes: > > Kon> Summary: I want a byte-string API. I want string > Kon> integrations. I want global UTF8 strings. > > The only way this can happen is to push the UTF8 handling > in

Re: [Chicken-users] Egg svn request: caketext

2008-03-18 Thread felix winkelmann
Hi! Attached is a patch against the chicken trunk that adds a rewrite hook for literals in program code. It's used like this:

% cat fix.scm
(set! ##compiler#literal-rewrite-hook
  (lambda (x w) `(,(w 'fix-strings) ',x)))

% cat ftest.scm
(use srfi-13)
(define (fix-strings x) (let walk ((x x