Re: [Haskell-cafe] Bytestrings vs String?

wren ng thornton Mon, 02 Feb 2009 19:42:11 -0800

Marc Weber wrote:

A lot of people are suggesting using Bytestrings for performance,
strictness, or whatever reasons.


However how well do they talk to other libraries?


I'm not sure what you mean.

For passing them around: If someone's trying to combine your library (version using ByteStrings) and another Haskell library that uses ByteStrings, then everything works fine, assuming both libraries are compiled against the same version of the bytestring library. As I recall, ByteStrings are designed to ease passing to C code across the FFI too, in case someone wants to use your library with some FFI C code. If someone's trying to combine your library with another library that uses String, they'll need to add conversions. (All of this is symmetric for a version of your library using String with another library using ByteStrings.)

The big compatibility issue I can see is the question of what a given ByteString *means*. In particular, via the Data.ByteString.Char8 module it encodes only 8-bit characters (ASCII plus Latin-1, with anything higher truncated to 8 bits), not all of Unicode like [Char] does. There are libraries for lossless encoding of [Char] into ByteStrings, but in general there can be encoding mismatch problems if, say, your library uses UTF8-encoded ByteStrings but the other library treats them like Char8-encoded (or UTF16BE, UTF16LE, FooBar,...), potentially mangling or hallucinating multi-byte characters.
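To make the mismatch concrete, here is a small sketch that uses only the bytestring package. The helper name `encodeUtf8` is made up for this illustration and is built on `Data.ByteString.Builder.stringUtf8`, which assumes a reasonably recent bytestring; it is not taken from any library mentioned in this thread.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL

-- Illustrative helper (hypothetical name): encode a String as UTF-8.
encodeUtf8 :: String -> B.ByteString
encodeUtf8 = BL.toStrict . BB.toLazyByteString . BB.stringUtf8

main :: IO ()
main = do
  let s = "\233\955"  -- two characters: e-acute (U+00E9) and lambda (U+03BB)
  -- Char8 keeps only the low 8 bits of each Char, so lambda is truncated.
  print (B.unpack (BC.pack s))               -- [233,187]
  -- UTF-8 uses two bytes for each of these characters.
  print (B.unpack (encodeUtf8 s))            -- [195,169,206,187]
  -- Reading UTF-8 bytes as if they were Char8 yields four wrong characters.
  print (length (BC.unpack (encodeUtf8 s)))  -- 4
```

The same two-character String becomes two different byte sequences, and a consumer that guesses the wrong encoding sees four characters instead of two.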

In general, if you're concerned about performance (or believe your users will be) then ByteStrings are a good bet. Just make it clear in the documentation what sort of encoding you use (or whether your library is encoding agnostic).

For hslogger specifically, it looks like most of the Strings are arguments which will typically be written as literals. Thus, to minimize boilerplate, if you do switch to ByteStrings then you may want to provide a module that does all the String->ByteString conversions for the user. If you have a good program for testing real world use of hslogger, before committing to the change I'd suggest benchmarking (in time and in space) the differences between the current String implementation and a proposed ByteString implementation.
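A rough sketch of such a conversion module might look like the following. Everything here is hypothetical: `formatLine`, `logBS`, `logS` and `toUtf8` are invented names and do not reflect hslogger's actual API, and the choice of UTF-8 as the encoding is an assumption made for the example.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import System.IO (stderr)

-- Hypothetical ByteString-based core: builds "logger: message\n".
formatLine :: B.ByteString -> B.ByteString -> B.ByteString
formatLine name msg = B.concat [name, BC.pack ": ", msg, BC.pack "\n"]

-- Hypothetical core entry point that works on ByteStrings only.
logBS :: B.ByteString -> B.ByteString -> IO ()
logBS name msg = B.hPutStr stderr (formatLine name msg)

-- Conversion kept in one place (assumption: the library settles on UTF-8).
toUtf8 :: String -> B.ByteString
toUtf8 = BL.toStrict . BB.toLazyByteString . BB.stringUtf8

-- String-facing wrapper, so callers can keep writing plain literals.
logS :: String -> String -> IO ()
logS name msg = logBS (toUtf8 name) (toUtf8 msg)

main :: IO ()
main = logS "MyApp.Net" "connection opened"
```

Users who already hold ByteStrings would call `logBS` directly and pay no conversion cost, while everyone else uses `logS`.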

Should there be two versions?

hslogger-bytestring and hslogger-string?

I'd just stick with one (with a module for hiding the conversions, as desired). Duplicating the code introduces too much room for maintenance and compatibility issues.

Or would it be better to implement one String class which can cope
with everything (performance will drop, won't it?)

It'd be a very large class if you do it generally[1], and large classes like that are generally frowned on (for good or ill). If you only need a small subset of string operations then it may be more feasible to have a smaller class with only those operations.
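As a sketch of what a deliberately small class could look like, assuming a logger only needs to build a line and write it out: the class `LogString` and the function `logLine` are hypothetical names, and the ByteString instance assumes Char8 semantics for `fromStr`.

```haskell
{-# LANGUAGE FlexibleInstances #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import System.IO (Handle, hPutStrLn, stderr)

-- Hypothetical minimal class: only the operations a logger might need.
class LogString s where
  fromStr :: String -> s
  append  :: s -> s -> s
  putLine :: Handle -> s -> IO ()

instance LogString String where
  fromStr = id
  append  = (++)
  putLine = hPutStrLn

instance LogString B.ByteString where
  fromStr = BC.pack        -- assumption: Char8 (8-bit) encoding
  append  = B.append
  putLine = BC.hPutStrLn

-- Generic code written once against the small class.
logLine :: LogString s => Handle -> s -> s -> IO ()
logLine h name msg = putLine h (name `append` fromStr ": " `append` msg)

main :: IO ()
main = do
  logLine stderr "MyApp" ("started" :: String)
  logLine stderr (BC.pack "MyApp") (BC.pack "started")
```

Calls through a class like this go via a dictionary unless the compiler specializes them, which is one reason to benchmark before settling on this design.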

[1] See everything hidden from the Prelude in http://hackage.haskell.org/packages/archive/list-extras/0.2.2.1/doc/html/src/Prelude-Listless.html or see what all is offered by Data.ByteString vs the Prelude.

In the future I'd like to explore using Haskell for web development.
So speed does matter. And I don't want my server to convert from
Bytestrings to Strings and back multiple times.

That's the big thing. The more people that use ByteStrings the less need there is to convert when combining libraries. That said, ByteStrings aren't a panacea; lists and laziness are very useful.


--
Live well,
~wren
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
