On Mon, 28 Jun 2004, Austin Hastings wrote:

> --- Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > On Mon, 28 Jun 2004, Juerd wrote:
> >
> > > Dave Whipp skribis 2004-06-28 9:55 (-0700):
> > > > > substr($string, 2 bytes, 4 bytes) = $substitute;
> > > > substr($string, 2, 4 :bytes)
> > >
> > > substr($string, 2 but graphemes, 4 but bytes);
> > >
> > > I think "but" even makes sense, if substr defaults to something.
> >
> > I think mixing strings, bytes, graphemes, and code points together
> > is a phenomenally bad idea, likely to lead to many tears, much
> > gnashing of teeth, and quite a few rampages with sharp objects,
> > not to mention a lot of code guaranteed to fail at the edge cases.
>
> Hmm. Suppose that I have a system that is friendly to 80 byte records.
> I want to output "meaningful" strings, so I want to partition a buffer
> into 80-ish byte substrings, but preserve any graphemes (i.e., store
> the data in a legible format).
>
> How would I do that?

You don't. Or if you do, you do it with a lot of pain, sweat, and
annoying hard work. 80 bytes gets you somewhere between three (and this
may be a *high* estimate; there may be circumstances where 80 bytes is
insufficient for *one* grapheme) and 80 graphemes. This isn't something
that can be made generically easy.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
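
To make the "pain, sweat, and annoying hard work" concrete, here is a rough sketch of the 80-byte record question in Python. It is not anything proposed in this thread and not a Perl 6 API. The helper names `clusters` and `split_records` are made up for illustration, UTF-8 is an assumed encoding, and a "grapheme" is approximated as a base character plus the combining marks that follow it, which is much simpler than the full Unicode segmentation rules. Note that it has to give up when a single cluster does not fit in one record, which is exactly the edge case mentioned above.

```python
import unicodedata


def clusters(text):
    """Yield approximate grapheme clusters.

    Assumption: a cluster is a base character plus any combining marks
    that follow it. Real grapheme segmentation has more rules (emoji
    joiner sequences, Hangul jamo, and so on) that are ignored here.
    """
    current = ""
    for ch in text:
        if current and unicodedata.combining(ch) == 0:
            # A new base character starts a new cluster.
            yield current
            current = ch
        else:
            current += ch
    if current:
        yield current


def split_records(text, limit=80, encoding="utf-8"):
    """Split text into byte records of at most `limit` bytes each,
    never cutting through a cluster.

    Raises ValueError if one cluster alone is larger than `limit`,
    because no split can keep that cluster intact.
    """
    records = []
    buf = b""
    for cluster in clusters(text):
        encoded = cluster.encode(encoding)
        if len(encoded) > limit:
            raise ValueError(
                "cluster of %d bytes does not fit in a %d-byte record"
                % (len(encoded), limit)
            )
        if len(buf) + len(encoded) > limit:
            # Close the current record before it would overflow.
            records.append(buf)
            buf = b""
        buf += encoded
    if buf:
        records.append(buf)
    return records
```

Records come out at 80 bytes or fewer, not exactly 80, so a fixed-width record format would still need padding, and the padding byte has to be something the reader can tell apart from data. That is one more reason this does not become generically easy.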