Hi Rowan, On Mon, Jan 19, 2015 at 12:05 AM, Rowan Collins <rowan.coll...@gmail.com> wrote:
> On 18 January 2015 at 01:01, Yasuo Ohgaki <yohg...@ohgaki.net> wrote: > >> Hi Rowan, >> >> On Sat, Jan 17, 2015 at 8:43 PM, Rowan Collins <rowan.coll...@gmail.com> >> wrote: >> >>> My concern is, at what cost? Given how rarely used the internal pointer >>> is, >>> are we carrying around a chunk of extra memory with every array just on >>> the >>> off-chance that it will be used, or is there some magic that makes it >>> zero-width until it's needed? >>> >>> Reusing it for foreach makes perfect sense if it increases performance >>> for >>> the majority of cases, and the only way to cover every edge-case like the >>> one at the top of this thread would be to ban that optimisation outright. >>> If there's some way of separating things such that the rarely used >>> constructs take the performance hit, I'm all for it, though. >>> >> >> External position pointer needs a little more memory for sure. However, >> external position pointer is used all over the place in standard and other >> modules. >> >> I agree that some benchmark should be taken before we proceed. >> > > I meant more that the internal pointer takes up memory which is unused by > 99% of PHP applications, so we might as well get some benefit from it, but > it comes to the same thing. > It might be a good idea. I agree that more than 99% of PHP codes do not need internal position pointer. Dmitry might be interested in this area or he might take some benchmarks already. However, PHP does not support bock scope, so getting rid of internal hash position pointer may not be feasible even for simple loops. There may be workarounds in the engine, though. Besides scope, user may call current()/next()/reset() anywhere for an array in code. External position pointer is required to support this. Obvious workaround is to have array position resource which is attached to specific array. $pos = array_fetch_position($array); // Get array position resource of $array. Like external hash position in C. reset($pos); // Reset position; $var = current($array, $pos); $var = next($array, $pos) $var = next($array, $pos) This requires script modification, though. External position resource would be useful for complex array operations. It could be implemented in standard module. It may be used to traverse depth first, breadth first,etc. I might work on this. > >> Alternatively, we could achieve consistency the other way round, by making >>> foreach() reset the internal pointer even when it *doesn't* use it, and >>> documenting that fact like we do for certain array functions. Code >>> relying >>> on the current behaviour when mixing them is probably buggy anyway, >>> because >>> it is not well-defined, so the BC concern should be low. >>> >> >> IIRC, there were discussions when foreach is added that point out >> internal position >> pointer usage. Before foreach, everyone was using >> current()/next()/reset(). I think >> the reason foreach sets internal position pointer to the end was the code >> used in >> those days. It was not a technical/performance reasons, wasn't it? I >> don't remember >> well because it was more than 10 years ago. Anyone? >> >> IIRC, foreach was introduced by Andi. I think foreach used external >> position pointer and >> didn't touch internal position pointer at all with his first proposal. >> Please correct me if I'm >> wrong. >> >> As for nesting, I think PHP is doing the right thing for plain arrays and >>> non-rewindable integrators, because I would expect a for each loop to >>> start >>> at the beginning, thus have an implicit reset/rewind. The rewindable >>> behaviour is awkward, though - intuitively, the iteraror needs to be able >>> to track multiple positions simultaneously, which isn't guaranteed by the >>> interface design. Maybe an error should be issued if the integrator is >>> already subject of a foreach loop? >>> >> >> Iterator is headache, I agree. >> Iterator nesting may be detected by a flag easily, but it may be better >> to generate >> independent iterators for each "foreach". >> >> > I don't think that's an either/or situation: in most cases, it would be up > to the user to create that independent iterator, because there's no general > algorithm for cloning one that the engine could use. (Think of an iterator > proxying a database cursor, for instance - it could be rewindable directly, > but cloning it would imply issuing a new query to create a new cursor with > separate state.) So the engine needs to issue an error telling the user > that they're reusing an object in an dangerous way. An extended interface > could be created later for those cases that can handle the situation, > although I doubt it would be that common a use case. > I think if users need nested database cursor based iterator or like, they should create new iterator object by themselves. i.e. User should have __clone() that opens new cursor. It might be good to have un-rewindable iterator also. e.g. If rewind() method returns FALSE, then it's not rewindable. Anyway, position independent array iteration with foreach might be good feature for PHP 7. This behavior is reasonable. http://3v4l.org/lJTQG but this is not reasonable now. (It was used to be) http://3v4l.org/HbVnd Regards, -- Yasuo Ohgaki yohg...@ohgaki.net