Sorry. If you are using Outlook, turn off the option that says "Extra line
breaks in this message were removed" at the top of my previous message.

Scott


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Friday, 20 July 2007 10:11 AM
To: internals@lists.php.net
Subject: RE: [PHP-DEV] POSIX regex

I don't really know much about unicode, and to be honest, I don't really
know much about the internal workings of php.
But I assume that there are going to be different implementations of string
functions depending on whether the string is unicode or not.

I'm going to suggest an implementation. Keep in mind I haven't hacked
around with the php source, so my variable naming etc. will be wrong...
and it's all pseudocode, so it's not

// The object type used when php creates a string
class ZendString
{
        char *strPtr; // however strings are stored in php
        ZendStringFunctions *pFunctions;
};


abstract class ZendStringFunctions
{
        abstract function strtolower(ZendString *pStr);
        abstract function strtoupper(ZendString *pStr);
        abstract function substr(ZendString *pStr);

        // All functions that differ depending on unicode / non-unicode implementation
        // ...
};

// A set of string functions for unicode strings
class ZendStringFunctionsUnicode extends ZendStringFunctions
{
        function strtolower(ZendString *pStr)
        {
                // unicode implementation
        }

        function strtoupper(ZendString *pStr)
        {
                // unicode implementation
        }

        function substr(ZendString *pStr)
        {
                // unicode implementation
        }
};

// A set of string functions for non-unicode strings
class ZendStringFunctionsNonUnicode extends ZendStringFunctions
{
        function strtolower(ZendString *pStr)
        {
                // non-unicode implementation
        }

        function strtoupper(ZendString *pStr)
        {
                // non-unicode implementation
        }

        function substr(ZendString *pStr)
        {
                // non-unicode implementation
        }
};


// the strtolower implementation
ZEND_FUNC strtolower(ZendString *pStr)
{
        return pStr->pFunctions->strtolower(pStr);
}

// the strtoupper implementation
ZEND_FUNC strtoupper(ZendString *pStr)
{
        return pStr->pFunctions->strtoupper(pStr);
}

ZEND_FUNC unicode_val(ZendString *pStr)
{
        // do something with pStr->strPtr
        delete pStr->pFunctions;
        pStr->pFunctions = new ZendStringFunctionsUnicode();
}


Anyway - the point I'm trying to make is to use function pointers to switch
between implementations. 

You could even make the ZendStringFunctions singletons and just set
pStr->pFunctions to an instance of the singleton.
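
To make the function pointer idea a bit more concrete, here is a rough
sketch in plain C. Every name in it (zstr, zstr_ops, zstr_mark_unicode and
so on) is made up for illustration and is not taken from the php source.
The "unicode" version is only a toy stand-in that assumes UTF-8 and knows
about ASCII plus the accented Latin-1 letters, so treat it as a sketch of
the dispatch and not of real Unicode case mapping. The two tables are
static constants, which plays the role of the singletons: switching a
string over is one pointer assignment, with nothing to allocate or delete.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; none come from the real Zend engine. */
typedef struct zstr zstr;

/* One slot per operation whose behaviour depends on the string type. */
typedef struct zstr_ops {
    const char *kind;
    void (*to_upper)(zstr *s);
    void (*to_lower)(zstr *s);
} zstr_ops;

struct zstr {
    char *buf;            /* NUL-terminated, modified in place */
    const zstr_ops *ops;  /* shared table, never allocated per string */
};

/* Binary strings: shift ASCII letters only, leave every other byte alone. */
static void ascii_shift(char *p, char first, char last, int delta)
{
    for (; *p; ++p)
        if (*p >= first && *p <= last)
            *p = (char)(*p + delta);
}

static void bin_upper(zstr *s) { ascii_shift(s->buf, 'a', 'z', 'A' - 'a'); }
static void bin_lower(zstr *s) { ascii_shift(s->buf, 'A', 'Z', 'a' - 'A'); }

/* Toy stand-in for a Unicode-aware version (assumes UTF-8): besides ASCII it
 * handles the two-byte sequences with lead byte 0xC3, i.e. U+00C0..U+00DE
 * and U+00E0..U+00FE, skipping the multiplication and division signs. */
static void utf8_shift(char *buf, int to_upper)
{
    unsigned char *p = (unsigned char *)buf;
    while (*p) {
        if (*p == 0xC3 && p[1] != 0) {
            unsigned char lo = to_upper ? 0xA0 : 0x80;  /* start of source range */
            if (p[1] >= lo && p[1] <= lo + 0x1E && p[1] != lo + 0x17)
                p[1] = (unsigned char)(to_upper ? p[1] - 0x20 : p[1] + 0x20);
            p += 2;
        } else {
            if (to_upper && *p >= 'a' && *p <= 'z')
                *p = (unsigned char)(*p - 0x20);
            else if (!to_upper && *p >= 'A' && *p <= 'Z')
                *p = (unsigned char)(*p + 0x20);
            ++p;
        }
    }
}

static void uni_upper(zstr *s) { utf8_shift(s->buf, 1); }
static void uni_lower(zstr *s) { utf8_shift(s->buf, 0); }

/* The "singletons": one immutable table per string type. */
static const zstr_ops BINARY_OPS  = { "binary",  bin_upper, bin_lower };
static const zstr_ops UNICODE_OPS = { "unicode", uni_upper, uni_lower };

/* What the script-level functions would call: a single indirect call. */
void zstr_to_upper(zstr *s) { assert(s && s->ops); s->ops->to_upper(s); }
void zstr_to_lower(zstr *s) { assert(s && s->ops); s->ops->to_lower(s); }

/* New strings start out as binary strings. */
void zstr_init_binary(zstr *s, char *buf)
{
    s->buf = buf;
    s->ops = &BINARY_OPS;
}

/* Rough counterpart of unicode_val(): assumes the buffer already holds
 * valid UTF-8 and only repoints the table. */
void zstr_mark_unicode(zstr *s) { s->ops = &UNICODE_OPS; }
```

With the binary table, zstr_to_upper() on the UTF-8 bytes of "café"
changes only the ASCII letters. After zstr_mark_unicode() the very same
call also uppercases the accented letter, without the caller having to
know which kind of string it holds.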

I think this would provide a very fast implementation of what is trying to
be done.

I'm just making a suggestion, so feel free to ignore/criticise me if I'm
wrong.  I don't know anything about php's internals... Just an idea.

Scott


-----Original Message-----
From: Andrei Zmievski [mailto:[EMAIL PROTECTED] 
Sent: Friday, 20 July 2007 9:36 AM
To: [EMAIL PROTECTED]
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] POSIX regex

On Jul 19, 2007, at 4:14 PM, <[EMAIL PROTECTED]>  
<[EMAIL PROTECTED]> wrote:

> I don't like the idea of having a "u" prefix for Unicode strings. It may
> improve performance, and give you some level of fine grain control, but...
>
> - It breaks your "keep php simple" policy by introducing a lot of new
> functions (ugly).
> - I (plus a lot of others) have an existing php5 application which I wish to
> eventually use with Unicode, and like others, I don't want to spend time
> refactoring.
> - It will also introduce bugs when programmers accidentally forget to add
> the "u" prefix when working with unicode.
>
> If you always want to produce Unicode, I think it's best to always use a cast
> or a conversion function.
>
> Eg
>
> $str = (unicode)(strtoupper($str));
> Or
> $str = unicode_val(strtoupper($str));

Good idea and it will totally work, except that it won't. strtoupper()
operates in different ways according to the type of the string that it gets.

-Andrei

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
