Re: [PHP-DEV] php interpreter
Hi!

> I've seen this statement before about the impact of caching the actual compilation (or mere tokenization?) to bytecode being very small compared to the impact of avoiding disk access. I am curious if there are any measurements breaking this down. Read-only access to code in files already buffered by the OS (not files read for the first time) ought to be very fast.

We did some measurements a long time ago at Zend, but I don't have the numbers right now, and the engine has changed so much since then that they are probably irrelevant anyway. However, the main gist is right: the time saved on compilation is not that much. One of the reasons for that is that some of the data structures used by the engine are dynamic (class tables, class variables, static variables, etc.), which means a lot of data still needs to be handled to make a script stored in SHM runnable. That greatly decreases the savings from not compiling it.

The disk read, however, is still saved, and since, unlike compilation, it's a system call and talks to a potentially very slow (compared to memory) device, the savings are significant. Even with the OS cache, you still have context switches, copying of the data, etc. With some work I think it is possible to make a PHP script run with zero system calls spent on loading script files.

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
[PHP-DEV] NEWS again (was: PHP 5.4.4RC2 Released)
On Thu, 31 May 2012 21:01:50 -0400, David Soria Parra wrote:

> We would like to announce the second RC of the 5.4.4 version. This is mainly a bugfix release. The release includes a fix for a weakness in crypt()'s DES implementation (CVE-2012-2143). Please test it and notify us of any problems you may encounter. The full list of the fixes is as always in the NEWS file.

Sorry to bring this up again, but they aren't. 5.3 NEWS are not being merged.

Right now, NEWS is pretty useless. If I want to know whether some change is in one release, 5.4 NEWS won't tell me that. For instance, 0f180a63 was committed to 5.3 on April 7 (a stream_get_line() fix). It is most definitely in 5.4.4RC2:

    $ git merge-base 0f180a63e php-5.4.4RC2
    0f180a63ebb2d65bbe49b68d2430639b20443e9a

However, there's no mention in NEWS.

The current policy of changing only the lowest branch's NEWS obviously can only work if these changes are then merged to the most recent branches on release. If the RMs are unwilling to do such merging, we should change the policy to require updating the NEWS files in every stable branch to which the fix was merged.

--
Gustavo Lopes
Re: [PHP-DEV] BreakIterator
On Fri, 01 Jun 2012 09:40:19 +1000, David Muir wrote:

> Coming from a pleb, my only concern is the name if the class is in the global scope. A BreakIterator to me sounds like something related to breaking out of a looping structure, and not something used for iterating over various language structure boundaries. If it's in a ICU namespace, then it's not a problem, as it's clearly related to Unicode.

We currently don't use namespaces in any of the core extensions. All the other symbols in ext/intl are in the global namespace; to put BreakIterator in a new namespace would be inconsistent -- and to put the whole extension in one would be a huge BC break.

As to the name chosen for the class, it just mirrors the name used in ICU. In some cases, we prefixed the class name with Intl, in order to minimize the likelihood of symbol collisions or to distinguish it from other similar functionality in PHP (something namespaces would be more appropriate for), but otherwise we prefer to keep the symbol names used in ICU in order to make it easy for people who already know the native API.

Additionally, I think your concerns are exaggerated. The symbol BreakIterator can only be used in contexts where it's obvious it's a class name, as in BreakIterator::createWordInstance('en').

--
Gustavo Lopes
Re: [PHP-DEV] BreakIterator
How about IntlBreakIterator? I agree with David that the naming is very weird, it doesn't hint at something from Intl but another crazy spl iterator :-)

On Fri, Jun 1, 2012 at 9:57 AM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:

> On Fri, 01 Jun 2012 09:40:19 +1000, David Muir wrote:
>
>> Coming from a pleb, my only concern is the name if the class is in the global scope. A BreakIterator to me sounds like something related to breaking out of a looping structure, and not something used for iterating over various language structure boundaries. If it's in a ICU namespace, then it's not a problem, as it's clearly related to Unicode.
>
> We currently don't use namespaces in any of the core extensions. All the other symbols in ext/intl are in the global namespace; to put BreakIterator in a new namespace would be inconsistent -- and to put the whole extension in one would be a huge BC break.
>
> As to the name chosen for the class, it just mirrors the name used in ICU. In some cases, we prefixed the class name with Intl, in order to minimize the likelihood of symbol collisions or to distinguish it from other similar functionality in PHP (something namespaces would be more appropriate for), but otherwise we prefer to keep the symbol names used in ICU in order to make it easy for people who already know the native API.
>
> Additionally, I think your concerns are exaggerated. The symbol BreakIterator can only be used in contexts where it's obvious it's a class name, as in BreakIterator::createWordInstance('en').
>
> --
> Gustavo Lopes
[PHP-DEV] PHP 5.3.14RC2 Released
Hi!

We would like to announce the second RC of the 5.3.14 version. This is mainly a bugfix release. The release includes a fix for a weakness in crypt()'s DES implementation (CVE-2012-2143).

Please test it and notify us of any problems you may encounter. The full list of the fixes is as always in the NEWS file.

You can download the packages from:

    http://downloads.php.net/johannes/php-5.3.14RC2.tar.bz2
    http://downloads.php.net/johannes/php-5.3.14RC2.tar.gz

The Windows team provides Windows binaries for the release. As always, you can find them at:

    http://windows.php.net/qa/

Regards,
Johannes
Re: [PHP-DEV] BreakIterator
hi,

On Fri, Jun 1, 2012 at 10:02 AM, Benjamin Eberlei kont...@beberlei.de wrote:

> How about IntlBreakIterator? I agree with David that the naming is very weird, it doesn't hint at something from Intl but another crazy spl iterator :-)

I agree too. BreakIterator is a very common name and I suspect possible naming conflicts may happen.

Cheers,
--
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
Re: [PHP-DEV] BreakIterator
On Fri, 1 Jun 2012 12:58:37 +0200, Pierre Joye wrote:

> On Fri, Jun 1, 2012 at 10:02 AM, Benjamin Eberlei kont...@beberlei.de wrote:
>
>> How about IntlBreakIterator? I agree with David that the naming is very weird, it doesn't hint at something from Intl but another crazy spl iterator :-)

Aside from date-related classes -- which could be confused with stuff from ext/date or even ext/calendar --, no other classes have Intl in their name. Does SpoofChecker hint at something from intl? ResourceBundle? ICU is a rather large library, and while internationalization is a common theme, the APIs have diverse functionality and therefore diverse names. Plus, SPL does not have a monopoly on the *Iterator names.

> I agree too. BreakIterator is a very common name and I suspect possible naming conflicts may happen.

So would you have RuleBasedBreakIterator renamed IntlRuleBasedBreakIterator too?...

I find it very hard to believe that BreakIterator is a very common name, but I'm open to evidence that points otherwise. This argument could maybe be made for 'Transliterator', which was added in 5.4.

--
Gustavo Lopes
Re: [PHP-DEV] BreakIterator
On 01-06-2012 13:34, Gustavo Lopes wrote:

> On Fri, 1 Jun 2012 12:58:37 +0200, Pierre Joye wrote:
>
>> On Fri, Jun 1, 2012 at 10:02 AM, Benjamin Eberlei kont...@beberlei.de wrote:
>>
>>> How about IntlBreakIterator? I agree with David that the naming is very weird, it doesn't hint at something from Intl but another crazy spl iterator :-)
>
> Aside from date-related classes -- which could be confused with stuff from ext/date or even ext/calendar --, no other classes have Intl in their name. Does SpoofChecker hint at something from intl? ResourceBundle? ICU is a rather large library, and while internationalization is a common theme, the APIs have diverse functionality and therefore diverse names. Plus, SPL does not have a monopoly on the *Iterator names.
>
>> I agree too. BreakIterator is a very common name and I suspect possible naming conflicts may happen.
>
> So would you have RuleBasedBreakIterator renamed IntlRuleBasedBreakIterator too?...
>
> I find it very hard to believe that BreakIterator is a very common name, but I'm open to evidence that points otherwise. This argument could maybe be made for 'Transliterator', which was added in 5.4.

In my personal opinion, all Intl classes should be prefixed with Intl. It's not so much that BreakIterator is a very common name, but rather that it is a very ambiguous name that may point to many different things. The fact that multiple people have already posted here that at first they thought BreakIterator had something to do with the break statement gives you a rather solid hint that the function of this class is not immediately clear. Prefixing it with Intl immediately makes it clear that it belongs to the Intl superfamily, and limits the potential misunderstandings a lot.

I actually still don't understand why not all Intl classes are prefixed. Isn't that the usual procedure, e.g. for MySQLi and pretty much all other extensions?

- Tul
Re: [PHP-DEV] BreakIterator
hi,

On Fri, Jun 1, 2012 at 1:34 PM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:

> So would you have RuleBasedBreakIterator renamed IntlRuleBasedBreakIterator too?...

Ideally we would, yes, although they are less common and less likely to be seen as part of another API.

> I find it very hard to believe that BreakIterator is a very common name, but I'm open to evidence that points otherwise. This argument could maybe be made for 'Transliterator', which was added in 5.4.

Transliterator is not as confusing as BreakIterator, sorry.

I would not care much if these were longer, less confusing/common names. But with this one, the risk of conflicting with existing code may be too high for it not to be discussed.

Cheers,
--
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
Re: [PHP-DEV] BreakIterator
On Fri, Jun 1, 2012 at 9:57 AM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:

> We currently don't use namespaces in any of the core extensions.

Does anything prevent us from starting to do so?

> other symbols in ext/intl are in the global namespace; to put BreakIterator in a new namespace would be inconsistent -- and to put the whole extension would be a huge BC break.

It sure would be a bit inconsistent, but if you see it as "All new Intl classes will go into the Intl namespace" it makes perfect sense in my eyes.

Also, at least in theory, one could alias all intl classes to namespaced variants (though I'm not sure that's really necessary).

Nikita
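
The aliasing idea mentioned above can be sketched with PHP's class_alias(). This is only an illustration: DemoBreakIterator is a hypothetical stand-in class and the Intl namespace name is an assumption; neither reflects how ext/intl is actually organized.

```php
<?php
// Hypothetical stand-in for an existing global-namespace class;
// it is NOT the real ext/intl BreakIterator.
class DemoBreakIterator
{
    public static function createWordInstance($locale)
    {
        return new self();
    }
}

// Expose the same class under a namespaced name (assumed "Intl" namespace).
// Both names refer to one and the same class, so existing code keeps working.
class_alias('DemoBreakIterator', 'Intl\\DemoBreakIterator');

$old = DemoBreakIterator::createWordInstance('en');
$new = \Intl\DemoBreakIterator::createWordInstance('en');

// The aliased name produces instances of the original class.
var_dump($new instanceof DemoBreakIterator);
var_dump(get_class($old) === get_class($new));
```

Because an alias is just a second name for the same class, type checks and instanceof behave identically under either name, which is what would keep such a change backwards compatible.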
Re: [PHP-DEV] BreakIterator
On Fri, 1 Jun 2012 15:37:30 +0200, Pierre Joye wrote:

> On Fri, Jun 1, 2012 at 1:34 PM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:
>
>> So would you have RuleBasedBreakIterator renamed IntlRuleBasedBreakIterator too?...
>
> Ideally we would yes, while they are less common and less aimed to be seen as part of another API.
>
>> I find it very hard to believe that BreakIterator is a very common name, but I'm open to evidence that points otherwise. This argument could maybe be made for 'Transliterator', which was added in 5.4.
>
> Transliterator is not as confusing as BreakIterator, sorry.

You removed the quoting that provided context, but I was responding to your claim that it was a very common name and that you suspected naming conflicts might happen.

But in fact Transliterator is much more confusing than BreakIterator. The name Transliterator is an ICU artifact of the past; that module is now called Text Transformation, as it provides a generic text transformation API, not one specifically for transliteration.

--
Gustavo Lopes
Re: [PHP-DEV] BreakIterator
On Fri, 1 Jun 2012 15:56:59 +0200, Nikita Popov wrote:

> On Fri, Jun 1, 2012 at 9:57 AM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:
>
>> We currently don't use namespaces in any of the core extensions.
>
> Does anything prevent us from starting to do so?
>
>> other symbols in ext/intl are in the global namespace; to put BreakIterator in a new namespace would be inconsistent -- and to put the whole extension would be a huge BC break.
>
> It sure would be a bit inconsistent, but if you see it as "All new Intl classes will go into the Intl namespace" it makes perfect sense in my eyes.

You say that it makes perfect sense, but you don't explain why.

> Also, at least in theory, one could alias all intl classes to namespaced variants (though I'm not sure that's really necessary).

Yes, that would be the only sane way to do it, but I really don't see a benefit large enough to compensate for treating classes differently depending on some arbitrary line, like when they were added. The only real benefit of namespaces is to avoid name collisions, but most new projects use namespaces and we can easily avoid name collisions in the PHP core. Plus, remember ext/intl is maintained in PECL too, where it supports PHP 5.2.

Anyway, this is getting a bit off-topic.

--
Gustavo Lopes
Re: [PHP-DEV] BreakIterator
On Fri, 01 Jun 2012 15:35:13 +0200, Maciek Sokolewicz wrote:

> In my personal opinion, all Intl classes should be prefixed with Intl. It's not so much that BreakIterator is a very common name, but rather a very ambiguous name that may point to many different things. Just by the fact that multiple people have already posted here that at first they thought BreakIterator had something to do with the break statement gives you a rather solid hint that the function of this class is not immediately clear. Prefixing it with Intl immediately makes it clear that it belongs to the Intl superfamily, and limits the potential misunderstandings a lot.
>
> I actually still don't understand why not all Intl classes are prefixed? Isn't that the usual procedure? eg. for MySQLi, and pretty much all other extensions?

We've had the convention of prefixing function names with some extension prefix, but this convention has not been as marked for class names -- perhaps because there were not so many of them and so there were fewer collision/confusion problems.

In any case, I'll rename the classes before merging.

--
Gustavo Lopes
[PHP-DEV] domdocument loadhtml and encoding
Gentlemen,

Regarding this bug report: https://bugs.php.net/bug.php?id=49705

As more developers move away from using regular expressions to parse HTML and start using DOMDocument, I've noticed that quite a few stumble over encoding issues. They're not bugs, because it's documented (I think) that if a document is loaded using ::loadHTMLFile() or if it contains a content-type meta tag which specifies the character encoding, it will work as expected.

So far I've suggested a hack that involves adding the meta tag in front of the string that contains the HTML. As horrible as it seems, that does the job!

That said, I'm hoping to get enough internals support to add a parameter to ::loadHTML() that sets / overrides the default character set when processing the document; when given, any meta tags pertaining to character set encoding should be ignored (AFAIK that's also the browser's behavior).

Btw, there's another patch that also introduces a new parameter to ::parseHTML() which has gone into the 5.4 branch (https://bugs.php.net/bug.php?id=54037), so it looks like this would be the second (optional) parameter then.

Thoughts?

--
Tjerk
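
The workaround described above (prepending a content-type meta tag to the HTML string) might look roughly like this. It is a sketch that assumes ext/dom is available and that the input string is UTF-8; the helper name loadHtmlAsUtf8() is made up for illustration and is not part of DOMDocument.

```php
<?php
// Hypothetical helper: prepend a charset declaration so that
// DOMDocument::loadHTML() treats the string as UTF-8.
function loadHtmlAsUtf8($html)
{
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $doc = new DOMDocument();

    // Collect libxml warnings about sloppy markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($meta . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc;
}

$doc = loadHtmlAsUtf8('<p>Grüße aus Köln</p>');
$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

The injected meta element ends up in the parsed document's head, so code that serializes the document back to HTML may want to remove it again.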
[PHP-DEV] Was php bug (IMHO). Have fix. Re: Bad eval() leading to response code 500
eval() does indeed set the response code to 500 upon failure. Is that a bug?

I'll file a report because I don't believe leaving the response code at 500 is consistent with the statement from the php.net page about eval():

> If there is a parse error in the evaluated code, eval() returns FALSE and execution of the following code continues normally.

I don't think leaving the response code at 500 is consistent with "continues normally".

I believe the fix is one line, adding

    EG(current_execute_data)->opline->extended_value != ZEND_EVAL

to the if clauses before setting the error header in main.c at line 1132.

In case anyone finds my post from last night while doing a search of the archives, I'll explain more below and answer my question about debugging.

What I didn't realize at the time of my post last night is that browsers don't mind receiving a 500 as long as everything else looks good. For example, the following web page:

    <?php
    eval('0+');
    print "hello world\n";
    ?>

looks fine in a browser, so I assumed (oops!) that it was returning code 200. If you try doing a wget on that example, it complains about the response code 500. (My big ugly application uses AJAX, and the 500 caused my AJAX framework to reject the page.)

Other than the code 500, everything seems to proceed normally. All of the other code is executed normally. The content of the web page is normal and is displayed well in a browser unless you have something checking for unhappy response codes.

The answer to my question about watching the headers in the debugger turned out to be pretty easy:

    watch sapi_globals.sapi_headers
    watch sapi_globals.sapi_headers.http_response_code

It would not have been so simple with ZTS on. (In that case, the TSRM macros come into play.)
Re: [PHP-DEV] BreakIterator
Hi,

On Fri, Jun 1, 2012 at 5:02 PM, Gustavo Lopes glo...@nebm.ist.utl.pt wrote:

> In any case, I'll rename the classes before merging.

You may have missed part of my replies. One key part was: to discuss it before doing anything. This has been only one day of discussion, and I don't feel like we have a long-term decision about what to do in this area.

Before going with this one only, I would rather solve this problem once and for all (other intl classes/cases).

Cheers,
--
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
Re: [PHP-DEV] BreakIterator
Hi!

> I've wrapped ICU's BreakIterator and RuleBasedBreakIterator. I stopped short of adding a procedural interface. I think there's a larger expectation of a having an OOP interface when working with iterators. What do you think? If there's no procedural interface, I'll change the instances of zend_parse_methods to zpp for performance.

Nice! I remember we had TextIterator in PHP 6; IIRC that was the reason BreakIterator never found its way into intl.

> BreakIterator also exposes other native methods: getAvailableLocales(), getLocale() and factory methods to build several predefined types of BreakIterators: createWordInstance() for word boundaries, createCharacterInstance() for locale dependent notions of characters, createSentenceInstance() for sentences, createLineInstance() and createTitleInstance() -- for title casing breaks. These factories currently return

One thing I notice here is that with this API it is not possible to programmatically choose what the iteration unit is - you'd have to do a switch for that. Do you think it may be a good idea to have a generic function that allows choosing the unit programmatically?

What is the notion of characters - is it grapheme characters? Is there an option to iterate over code points too - not sure if it's useful, just curious, as we used to have it in PHP 6 IIRC.

About getAvailableLocales() - what does this actually do? Does it list all available locales in the system, ones that have BreakIterator rules, or something else? If it's not related to BI, I'm not sure we need to have it in BI. What is the intended usage of it? Maybe it should be part of the Locale class?

> Note that BreakIterator is an iterator only in the sense of the first 'Iterator' in 'IteratorIterator', i.e., it does not implement the Iterator interface. The reason is that there is no sensible implementation for Iterator::key(). Using it for

Doesn't it have a notion of current position? If so, key should be the current position.

Will this BreakIterator be usable in foreach? I'm not sure I understand it from this description - understanding this without any usage examples, RFCs or code snippets for intended usage is really hard, and I think we should really start with doing that. I would expect this class to work like this:

    foreach (BreakIterator::createWordInstance("blah blah blah") as $i => $word) {
        echo "Word number $i is $word\n";
    }

or at least like this:

    foreach (BreakIterator::createWordInstance("blah blah blah") as $i => $word) {
        echo "Next word at position $i is: $word\n";
    }

Is that the model? If not, I think we need to wrap the C API to make this possible, because this is what people expect in PHP from the iterator.

> Finally, I added a convenience method to BreakIterator: getPartsIterator(). This provides an IntlIterator, backed by the BreakIterator PHP object (i.e. moving the pointer or changing the text in BreakIterator affects the iterator and also moving the iterator affects the backing BreakIterator), which allows traversing the text between each boundary.

How is that text being traversed - by code points/characters/graphemes/bytes?

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
[PHP-DEV] 5.4.3 type hint handling
Hi,

I'm experiencing an issue adding type hints to the function prototypes. The following definition gives the "unknown typehint" error when invoking a function:

    ZEND_BEGIN_ARG_INFO_EX(arg_info_trader_adosc, 0, 0, 4)
        ZEND_ARG_TYPE_INFO(0, high, IS_ARRAY, 0)
        ZEND_ARG_TYPE_INFO(0, low, IS_ARRAY, 0)
        ZEND_ARG_TYPE_INFO(0, close, IS_ARRAY, 0)
        ZEND_ARG_TYPE_INFO(0, volume, IS_ARRAY, 0)
        ZEND_ARG_TYPE_INFO(0, fastPeriod, IS_LONG, 1)
        ZEND_ARG_TYPE_INFO(0, slowPeriod, IS_LONG, 1)
    ZEND_END_ARG_INFO();

The reason I tripped over this is that I want to generate the XML doc proto for the extension. Therefore I'm using the extended ZEND_ARG_INFO version. Without type hints there are no param types in the XML.

Quickly looking at the sources, I realize that 5.4.3 has an explicit type hint check, whereas the hint was previously ignored in 5.3:

    http://lxr.php.net/opengrok/xref/PHP_5_4/Zend/zend_execute.c#600

The reason for writing this is not to start a new discussion about scalar types, for God's sake not :), but just to point out the collision between the current core and the doc generator.

A simple way to fix this would be to restore the old 5.3 behaviour of just passing over scalar types. Or maybe there is a simpler solution for this, even though 5.4.3 is already released?

Cheers

Anatoliy
Re: [PHP-DEV] BreakIterator
On Fri, 01 Jun 2012 11:31:13 -0700, Stas Malyshev wrote:

>> BreakIterator also exposes other native methods: getAvailableLocales(), getLocale() and factory methods to build several predefined types of BreakIterators: createWordInstance() for word boundaries, createCharacterInstance() for locale dependent notions of characters, createSentenceInstance() for sentences, createLineInstance() and createTitleInstance() -- for title casing breaks. These factories currently return
>
> One thing I notice here is that with this API it is not possible to programmatically choose what is the iteration unit - you'd have to do a switch for that. Do you think it may be a good idea to have a generic function that allows to choose the unit programmatically?

You can create a RuleBasedBreakIterator with any rules you choose. The rules are basically a set of regular expressions; ICU has two matching modes -- by default it tries the longest match, but it can also chain together rules. There are rules to advance, to go back and to go to a safe position from an arbitrary position in the two directions. The ICU user guide to which I linked in the first e-mail has more details.

> What is the notion of characters - is it grapheme characters? Is there option to iterate over code points too - not sure if it's useful just curious, as we used to have it in PHP 6 IIRC.

Yes, they are grapheme clusters. ICU has a special rule for Thai, but from what I see in the tracker, it's obsolete with recent versions of Unicode (possibly the root rule is now generic enough).

To iterate over code points, you can build a very simple RuleBasedBreakIterator -- new RuleBasedBreakIterator('.;'). See this example here: https://gist.github.com/2843005

> About getAvailableLocales() - what this actually does? Does it list all avaliable locales in the system, ones that have BreakIterator rules, or something else? If it's not related to BI, I'm not sure we need to have it in BI. What is the intended usage of it? Maybe it should be part of Locale class?

Right now, the ICU implementation just calls Locale::getAvailableLocales(), but its description is "Gets all the available locales that has localized text boundary data.", so I suppose it could return a different set in the future.

>> Note that BreakIterator is an iterator only in the sense of the first 'Iterator' in 'IteratorIterator', i.e., it does not implement the Iterator interface. The reason is that there is no sensible implementation for Iterator::key(). Using it for
>
> Doesn't it have a notion of current position? If so, key should be the current position. Will this BreakIterator be usable in foreach? I'm not sure I understand it from this description - understanding this without any usage examples, RFCs or code snippets for intended usage is really hard and I think we should really start with doing that. I would expect this class to work like this:
>
>     foreach (BreakIterator::createWordInstance("blah blah blah") as $i => $word) {
>         echo "Word number $i is $word\n";
>     }
>
> or at least like this:
>
>     foreach (BreakIterator::createWordInstance("blah blah blah") as $i => $word) {
>         echo "Next word at position $i is: $word\n";
>     }
>
> Is it the model? If not, I think we need to wrap the C API to make this possible, because this is what people expect in PHP from the iterator.

My options here were: the BreakIterator mirrors the ICU homonym -- it iterates over breaks, i.e., boundaries in the text. Hence, the iterator returns the *positions* of the several boundaries. Therefore, the position cannot also be used for the key.

Acknowledging that getting the text between the boundaries was going to be a common scenario, I added a method, getPartsIterator(), that yields the text between each boundary. Hence, there is one less element in this iterator than in the BreakIterator.

Neither of the iterators implements getKey(), so when traversing, the keys will be 0, 1, 2...

It would probably be a good idea to change the parts iterator to give the left boundary as the key. That way one could do:

    $bi = BreakIterator::createWordInstance(NULL);
    $bi->setText($foo);
    foreach ($bi->getPartsIterator() as $k => $v) {
        echo "$v is at position $k\n";
    }

instead of

    $bi = BreakIterator::createWordInstance(NULL);
    $bi->setText($foo);
    $pos = $bi->first();
    foreach ($bi->getPartsIterator() as $v) {
        echo "$v is at position $pos\n";
        $pos = $bi->current();
    }

Another possibility would be to have the break iterator itself behave as the parts iterator for iteration purposes. I don't think that is a good idea. Even though BreakIterator does not implement Iterator, people would expect next() and current() to return the next and current iterator value, while they would be returning the iteration key.

By the way, you can look at the test cases in the tree on github for examples:

    https://github.com/cataphract/php-src/commit/d289c3977ed4ba8d9ba127e5af9f709b19b8e1ba

Thanks for the comments!

--
Gustavo Lopes
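
The relationship between boundaries and parts described above can be illustrated without ICU. The following sketch assumes a hand-written list of boundary offsets and a made-up helper, partsKeyedByLeftBoundary(); it only mimics the idea of keying each part by its left boundary and is not the ext/intl implementation.

```php
<?php
// Hypothetical helper: given byte offsets of boundaries in $text, return the
// text between consecutive boundaries, keyed by the left boundary.
function partsKeyedByLeftBoundary($text, array $boundaries)
{
    $parts = array();
    for ($i = 0; $i + 1 < count($boundaries); $i++) {
        $left = $boundaries[$i];
        $right = $boundaries[$i + 1];
        $parts[$left] = substr($text, $left, $right - $left);
    }
    return $parts;
}

$text = 'foo bar baz';
// Hand-picked boundaries, roughly as a word-style break iterator might report them.
$boundaries = array(0, 3, 4, 7, 8, 11);

$parts = partsKeyedByLeftBoundary($text, $boundaries);
foreach ($parts as $pos => $part) {
    echo "'$part' is at position $pos\n";
}
```

Six boundaries yield five parts here, which matches the observation that the parts iterator has one less element than the break iterator.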
Re: [PHP-DEV] 5.4.3 type hint handling
Hi,

2012/6/1 Anatoliy Belsky a...@php.net:

> Hi,
>
> I'm experiencing an issue adding type hints to the function prototypes. The following definition gives the unknown typehint error when invoking a function
>
>     ZEND_BEGIN_ARG_INFO_EX(arg_info_trader_adosc, 0, 0, 4)
>         ZEND_ARG_TYPE_INFO(0, high, IS_ARRAY, 0)
>         ZEND_ARG_TYPE_INFO(0, low, IS_ARRAY, 0)
>         ZEND_ARG_TYPE_INFO(0, close, IS_ARRAY, 0)
>         ZEND_ARG_TYPE_INFO(0, volume, IS_ARRAY, 0)
>         ZEND_ARG_TYPE_INFO(0, fastPeriod, IS_LONG, 1)
>         ZEND_ARG_TYPE_INFO(0, slowPeriod, IS_LONG, 1)
>     ZEND_END_ARG_INFO();

We do not use ZEND_ARG_TYPE_INFO() with scalar types that are not covered by the type hint support (i.e. string, integer, double, resource).

--
Regards,
Felipe Pena
Re: [PHP-DEV] NEWS again
On 06/01/2012 12:48 AM, Gustavo Lopes wrote:

> If the RMs are unwilling to do such merging, we should change the policy to require updating the NEWS files in every stable branch to which the fix was merged.

This makes sense to me.

Chris

--
christopher.jo...@oracle.com
http://twitter.com/#!/ghrd