On Mon, 28 Jun 2004, Austin Hastings wrote:

> --- Dan Sugalski <[EMAIL PROTECTED]> wrote:
> > On Mon, 28 Jun 2004, Juerd wrote:
> >
> > > Dave Whipp skribis 2004-06-28  9:55 (-0700):
> > > > > substr($string, 2 bytes, 4 bytes) = $substitute;
> > > > substr($string, 2, 4 :bytes)
> > >
> > > substr($string, 2 but graphemes, 4 but bytes);
> > >
> > > I think "but" even makes sense, if substr defaults to something.
> >
> > I think mixing strings, bytes, graphemes, and code points together
> > is a phenomenally bad idea, likely to lead to many tears, much
> > gnashing of teeth, and quite a few rampages with sharp objects,
> > not to mention a lot of code guaranteed to fail at the edge cases.
>
> Hmm. Suppose that I have a system that is friendly to 80 byte records.
> I want to output "meaningful" strings, so I want to partition a buffer
> into 80-ish byte substrings, but preserve any graphemes (i.e., store
> the data in a legible format).
>
> How would I do that?

You don't. Or if you do, you do it with a lot of pain, sweat, and annoying
hard work. 80 bytes gets you somewhere between three and 80 graphemes. (And
three may be a *high* estimate--there may be circumstances where 80 bytes
is insufficient for *one* grapheme.)

This isn't something that can be made generically easy.
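
To make the difficulty concrete, here is a rough sketch in Python (not Perl 6,
whose syntax for this is still being debated above). It assumes UTF-8 output
and approximates a grapheme as a base character plus any combining marks that
follow it, which is much cruder than real grapheme segmentation. The names
`clusters` and `partition` are made up for illustration.

```python
import unicodedata


def clusters(text):
    """Yield approximate graphemes: a base character plus trailing
    combining marks. Assumption: this ignores the full Unicode
    segmentation rules (Hangul jamo, ZWJ sequences, and so on)."""
    current = ""
    for ch in text:
        if current and unicodedata.combining(ch) == 0:
            # A new base character starts a new cluster.
            yield current
            current = ch
        else:
            current += ch
    if current:
        yield current


def partition(text, limit=80):
    """Split text into records of at most `limit` UTF-8 bytes without
    cutting a cluster in half. Raises ValueError when a single cluster
    is larger than `limit`, since no legal split exists."""
    records, current, size = [], "", 0
    for cluster in clusters(text):
        n = len(cluster.encode("utf-8"))
        if n > limit:
            raise ValueError("one cluster needs %d bytes, limit is %d" % (n, limit))
        if size + n > limit:
            # Close the current record before it would overflow.
            records.append(current)
            current, size = "", 0
        current += cluster
        size += n
    if current:
        records.append(current)
    return records
```

Even this toy version has to give up when one cluster does not fit in a
record, and the number of clusters per record depends entirely on the data,
so the records come out "80-ish" at best.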

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                      teddy bears get drunk
