RE: [Haskell-cafe] Re: Writing binary files?
> You wouldn't want to have to accumulate the entire body as a single byte string

Ever heard of laziness? Haskell does it quite well... Accumulating the entire body doesn't really happen, because Haskell is lazy. You don't need a more complex interface in Haskell!

Keean.

___
Haskell-Cafe mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: [Haskell-cafe] Re: Writing binary files?
MR K P SCHUPKE wrote:

> > You wouldn't want to have to accumulate the entire body as a single byte string
>
> Ever heard of laziness? Haskell does it quite well... Accumulating the entire body doesn't really happen, because Haskell is lazy. You don't need a more complex interface in Haskell!

Are you sure that will work in the general case? Or are you assuming lazy I/O?

-- Glynn Clements [EMAIL PROTECTED]
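A minimal sketch of the point under discussion, assuming the modern `bytestring` package (which postdates this thread); the names `firstBytes`, `endlessBody` and `copyBody` are hypothetical. A lazy `ByteString` is consumed chunk by chunk, so a prefix can be taken even from an unbounded body. Whether a real file streams the same way depends on lazy I/O such as `BL.readFile`, which is the assumption Glynn asks about.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)
import Data.Word (Word8)

-- | Take the first n bytes of a body. Only the chunks needed to
-- produce those n bytes are ever forced.
firstBytes :: Int64 -> BL.ByteString -> [Word8]
firstBytes n = BL.unpack . BL.take n

-- | A conceptually infinite body; it is usable only because the
-- lazy ByteString is never built in full.
endlessBody :: BL.ByteString
endlessBody = BL.cycle (BL.pack [1, 2, 3])

-- | Copy a file through a lazy ByteString. This streams only because
-- BL.readFile performs lazy I/O; a strict read would load the whole file.
copyBody :: FilePath -> FilePath -> IO ()
copyBody src dst = BL.readFile src >>= BL.writeFile dst
```

In GHCi, `firstBytes 5 endlessBody` terminates even though `endlessBody` has no end, which illustrates Keean's claim for pure values; `copyBody` shows where the lazy-I/O assumption enters.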
Re: [Haskell-cafe] Interoperability with other languages and haskell in industry
On Thursday 16 September 2004 20:27, Andy Moran wrote:

> I'd like to say that this approach has worked for us time and time again, but, to date, we've never had to rewrite a slow component in C :-) For us, C interoperability has always been a case of linking to third party software, or for writing test harnesses to test generated C.

The point is that perhaps we will not have a prototype but a single implementation (not that I think it's a good idea in the general case, but we will write a relatively simple bookkeeping application). However, I realize that one can write a great part of the software in a single language. The point is providing an escape to Java, C++, C#, Python or other in-vogue languages in case we find that it's difficult to interface with legacy systems, or we can't find a coder to hire in the future. So the point is not to rewrite something in C for efficiency, but rather to be able to say "OK, this component is written in Haskell and will stay this way, but the rest of the system won't be Haskell anymore". However:

> Things are different if your application is multi-process and/or distributed, and you're not going to be using an established protocol (like HTTP, for instance). In that case, you might want to look at HDirect (giving access to CORBA, COM, DCOM), if you need to talk to CORBA/COM/DCOM objects. There are many simple solutions to RPC available too, if that's all you need.

I see that there is, for example, xmlrpc, which should fit my modest interoperability needs, and I would have liked to hear about some experience with that route. Your reply is encouraging, though, since you didn't need any other language at all. That's my hope, too.

Bye, and waiting for that other famous Haskell-using company that I didn't mention to join this discussion :)

Vincenzo
[Haskell-cafe] Proofs for program testing
I saw that many introductions to Haskell contain proofs of properties of implemented functions. This is probably because pure functions can be handled more easily than imperative programs with hidden state. I wondered whether one can use proofs of Haskell functions for testing. I found QuickCheck http://www.cs.chalmers.se/~rjmh/QuickCheck/ which, as far as I understand, relies on random inputs, and I found http://homepages.inf.ed.ac.uk/wadler/realworld/era.html which sounds like some GUI-driven program. Is there something that can be used for automatic testing, e.g. for darcs' tests to check patch integrity?
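One way to picture the gap between a proof and a random test is an exhaustive check over all small inputs. The sketch below uses only the base library and is not QuickCheck's API; the names `prop_reverseAppend`, `smallLists` and `checkAll` are hypothetical. It checks the law `reverse (xs ++ ys) == reverse ys ++ reverse xs`, the kind of property that introductions prove by induction, on every pair of lists up to a bounded length.

```haskell
-- | The law usually proved by induction in introductory texts.
prop_reverseAppend :: [Int] -> [Int] -> Bool
prop_reverseAppend xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

-- | All lists of length at most n drawn from the given alphabet.
smallLists :: Int -> [a] -> [[a]]
smallLists n alphabet = concatMap ofLength [0 .. n]
  where
    ofLength 0 = [[]]
    ofLength k = [x : rest | x <- alphabet, rest <- ofLength (k - 1)]

-- | Counterexamples among all pairs of small lists over {0, 1}.
-- An empty result means the property held for every tested pair.
checkAll :: Int -> [([Int], [Int])]
checkAll n =
  [ (xs, ys)
  | xs <- smallLists n [0, 1]
  , ys <- smallLists n [0, 1]
  , not (prop_reverseAppend xs ys)
  ]
```

QuickCheck would state the same property and feed it random lists instead. An exhaustive check covers every input up to the bound but says nothing beyond it, so neither approach replaces the proof.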
Re: [Haskell-cafe] Writing binary files?
Glynn Clements [EMAIL PROTECTED] writes:

> What I'm suggesting in the above is to sidestep the encoding issue by keeping filenames as byte strings wherever possible.

Ok, but let it be in addition to, not instead of, treating them as character strings.

> And program-generated email notifications frequently include text with no known encoding (i.e. binary data).

No, programs don't dump binary data among diagnostic messages. If they output binary data to stdout, it's their only output and it's redirected to a file or another process.

> Or are you going to demand that anyone who tries to hack into your system only sends it UTF-8 data so that the alert messages are displayed correctly in your mail program?

The email protocol is text-only. It may mangle newlines, it has a maximum line length, and some text may be escaped during transport (e.g. "From" at the beginning of a line). Arbitrary binary data should be put in base64-or-otherwise-encoded attachments. If the cron program embeds the output as the email body, the cron job should not dump arbitrary binary data to stdout. Encoding is not the only problem.

> > Processing data in their original byte encodings makes supporting multiple languages harder. Filenames which are inexpressible as character strings get in the way of clean APIs. When considering only filenames, using bytes would be sufficient, but overall it's more convenient to Unicodize them like other strings.
>
> It also harms reliability. Depending upon the encoding, two distinct byte strings may have the same Unicode representation.

Such encodings are not suitable for filenames.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg00376.html

| ISO-2022-JP will never be a satisfactory terminal encoding (like ISO-8859-*, EUC-*, UTF-8, Shift_JIS) because
|
| 1) It is a stateful encoding. What happens when a program starts some terminal output and then is interrupted using Ctrl-C or Ctrl-Z? The terminal will remain in the shifted state, while other programs start doing output. But these programs expect that when they start, the terminal is in the initial state. The net result will be garbage on the screen.
|
| 2) ISO-2022-JP is not filesystem safe. Therefore filenames will never be able to carry Japanese characters in this encoding.
|
| Robert Brady writes:
| > Does ISO-2022 see much/any use as the locale encoding, or is it just used for interchange?
|
| Just for interchange.
|
| Paul Eggert searched for uses of ISO-2022-JP as locale encodings (in order to convince me), and only came up with a handful of questionable URLs. He didn't convince me. And there are no plans to support ISO-2022-JP as a locale encoding in glibc - because of 1) and 2) above.

For me ISO-2022 is a brain-damaged concept and should die. Almost nothing supports it anyway.

> > Such tarballs are not portable across systems using different encodings.
>
> Well, programs which treat filenames as byte strings to be read from argv[] and passed directly to open() won't have any problems with this.

The OS itself may have problems with this; only some filesystems accept arbitrary bytes apart from '\0' and '/' (and with the special meaning for '.'). Exotic characters in filenames are not very portable.

> > A Haskell program in my world can do that too. Just set the encoding to Latin1.
>
> But programs should handle this by default, IMHO.

IMHO it's more important to make them compatible with the representation of strings used in other parts of the program.

> Filenames are, for the most part, just tokens to be passed around.

Filenames are often stored in text files, whose bytes are interpreted as characters. Applying QP to non-ASCII parts of filenames is suitable only if humans won't edit these files by hand.

> My specific point is that the Haskell98 API has a very big problem due to the assumption that the encoding is always known. Existing implementations work around the problem by assuming that the encoding is always ISO-8859-1.

The API is incomplete and needs to be enhanced. Programs written using the current API will be limited to using the locale encoding.

> That just adds unnecessary failure modes.

But otherwise programs would continuously have bugs in handling text which is not ISO-8859-1, especially with multibyte encodings, where pretending that ISO-8859-2 is ISO-8859-1 too often doesn't work.

I can't switch my environment to UTF-8 yet precisely because too many programs were written with the attitude you are promoting: they don't care about the encoding, they just pass bytes around.

Bugs range from small annoyances like tabular output which doesn't line up, through mangled characters on a graphical display, to full-screen interactive programs being unusable on a UTF-8 terminal.

> > This encoding would be incompatible with most other texts seen by the program. In particular reading a filename from a file would not work without manual recoding.
>
> We already have that problem; you can't read non-Latin1
Re: [Haskell-cafe] Writing binary files?
Marcin 'Qrczak' Kowalczyk wrote:

> > What I'm suggesting in the above is to sidestep the encoding issue by keeping filenames as byte strings wherever possible.
>
> Ok, but let it be in addition to, not instead of, treating them as character strings.

Provided that you know the encoding, nothing stops you converting them to strings, should you have a need to do so.

> > > Processing data in their original byte encodings makes supporting multiple languages harder. Filenames which are inexpressible as character strings get in the way of clean APIs. When considering only filenames, using bytes would be sufficient, but overall it's more convenient to Unicodize them like other strings.
> >
> > It also harms reliability. Depending upon the encoding, two distinct byte strings may have the same Unicode representation.
>
> Such encodings are not suitable for filenames.

Regardless of whether they are suitable, they are used.

> For me ISO-2022 is a brain-damaged concept and should die.

Well, it isn't likely to.

I haven't addressed any of the other stuff about ISO-2022, as it isn't really relevant. Whether ISO-2022 is good or bad doesn't matter; what matters is that it is likely to remain in use for the foreseeable future.

> > > Such tarballs are not portable across systems using different encodings.
> >
> > Well, programs which treat filenames as byte strings to be read from argv[] and passed directly to open() won't have any problems with this.
>
> The OS itself may have problems with this; only some filesystems accept arbitrary bytes apart from '\0' and '/' (and with the special meaning for '.'). Exotic characters in filenames are not very portable.

No, but most Unix programs manage to handle them without problems.

> > > A Haskell program in my world can do that too. Just set the encoding to Latin1.
> >
> > But programs should handle this by default, IMHO.
>
> IMHO it's more important to make them compatible with the representation of strings used in other parts of the program.

Why?

> > Filenames are, for the most part, just tokens to be passed around.
>
> Filenames are often stored in text files,

True.

> whose bytes are interpreted as characters.

Sometimes true, sometimes not. Where filenames occur in data files, e.g. configuration files, the program which reads the configuration file typically passes the bytes directly to the OS without interpretation.

> Applying QP to non-ASCII parts of filenames is suitable only if humans won't edit these files by hand.

Who said anything about QP?

> > My specific point is that the Haskell98 API has a very big problem due to the assumption that the encoding is always known. Existing implementations work around the problem by assuming that the encoding is always ISO-8859-1.
>
> The API is incomplete and needs to be enhanced. Programs written using the current API will be limited to using the locale encoding.
>
> > That just adds unnecessary failure modes.
>
> But otherwise programs would continuously have bugs in handling text which is not ISO-8859-1, especially with multibyte encodings, where pretending that ISO-8859-2 is ISO-8859-1 too often doesn't work.

Why?

> I can't switch my environment to UTF-8 yet precisely because too many programs were written with the attitude you are promoting: they don't care about the encoding, they just pass bytes around.

That's all that many programs should be doing.

> Bugs range from small annoyances like tabular output which doesn't line up, through mangled characters on a graphical display, to full-screen interactive programs being unusable on a UTF-8 terminal.

IOW:

1. display doesn't work correctly,
2. display doesn't work correctly, and
3. display doesn't work correctly.

You keep citing cases involving graphical display as a reason why all programs should be working with characters all of the time. I haven't suggested that programs should never deal with characters, yet you keep insinuating that is my argument, then proceed to attack it.
-- Glynn Clements [EMAIL PROTECTED]
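The "just set the encoding to Latin1" remark in this exchange relies on a property that is easy to check: ISO-8859-1 maps each byte 0..255 to the code point with the same number, so decoding can never fail and re-encoding gives back the original bytes. A minimal sketch using only base follows; `decodeLatin1` and `encodeLatin1` are hypothetical names, not part of the Haskell98 API or of any library.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- | Decode bytes as ISO-8859-1: byte n becomes code point n.
-- Total: every byte sequence decodes to some String.
decodeLatin1 :: [Word8] -> String
decodeLatin1 = map (chr . fromIntegral)

-- | Encode a String as ISO-8859-1. Fails with Nothing if any
-- character lies outside the range U+0000..U+00FF.
encodeLatin1 :: String -> Maybe [Word8]
encodeLatin1 = mapM toByte
  where
    toByte c
      | ord c <= 0xFF = Just (fromIntegral (ord c))
      | otherwise     = Nothing
```

The round trip preserves bytes, not meaning: a filename stored in ISO-8859-2 or UTF-8 survives unchanged, but the `String` in between holds the wrong characters, which is the display problem Marcin describes.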