Re: [PHP-DEV] Re: Moving to an AST-based parsing/compilation process

2012-09-06 Thread Morgan L. Owens

On 2012-09-06 10:39, Stas Malyshev wrote:

Hi!

... and no real
benefit for average PHP user.

Well, apart from perhaps leaving them with a simpler language that 
doesn't have the inconsistencies and corner cases that currently exist 
(and are documented ad nauseam) not because of any design decision but 
"because the parser is written that way".



--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Re: Moving to an AST-based parsing/compilation process

2012-09-05 Thread Morgan L. Owens

On 2012-09-05 07:57, Nikita Popov wrote:

Hey folks!

Some people asked me what the advantages of using an AST-based
parsing/compilation process are, so I put together a few quick notes
in an RFC:

https://wiki.php.net/rfc/ast_based_parsing_compilation_process

It would be nice to get a few comments from other core devs on this.

Nikita

I'm not a core dev, but I would like to add to the notes above that 
"third parties", such as myself, who want to do things with PHP source 
other than run it through a PHP interpreter would also appreciate such a 
separation of concerns.


To date, I've been basing work, which exposes syntactic structure, on 
phc's maketea grammar (Phalanger's is more up to date, but also more 
complicated what with its provenance and the Linq and generics and all), 
but it's reverse-engineered and certainly wrong (oh, that reminds 
me...); the existing grammar is unsuitable because no-one wants to see 
_that_.


Something authoritative that _by definition_ tracks the current version 
would be more reassuring as regards accuracy and compatibility (and be 
more likely to result in something that deserves to be let out into the 
world with confidence).






Re: [PHP-DEV] $obj->attr::method() is not allowed

2012-09-04 Thread Morgan L. Owens

On 2012-09-04 18:36, Stas Malyshev wrote:

Hi!


The problem is that the only formal definition of the language _is_ the
parser - there's no grammar outside
zend_language_scanner.l/zend_language_parser.y.


I'm not sure - why exactly is it a problem here? I can understand how
having such document might be useful and interesting (not useful and
interesting enough that for the last 10 years anybody bothered to do it
but still, I can imagine) - but how existence of such document would
solve your problem?


My problem? I'm not the one who wrote:

On 2012-09-04 03:18, Andrew Faulds wrote:
>
>
> Can't do this:
>
>self::$views[$path]();
>
> But I *can* do this:
>
>$x = self::$views[$path]; $x();
>

But I'm certainly among those who've raised this in the past. A quote 
from https://wiki.php.net/rfc/fcallfcall


"What does not work currently is $foo->bar()() - since it is surprisingly 
hard to tell parser it's not {$foo->bar}()() - which of course is not 
what I want to do."


The complexity of the current grammar is working against development.


The developers of phc (http://www.phpcompiler.org/) ended up having to
write their own AST-based grammar (which may possibly serve as a
starting point for a grammar rewrite) because PHP's own parsing of even
simple expressions is so long-winded and unintuitive. That site gives a
couple of examples: 'printf("Hi!");' (27 nodes, including
"r_variable") and '$a->b->c' (26 nodes, including
"base_variable_with_function_calls").


I don't see any relationship between the fact that implementation of
separate compiler required separate parser (which is only logical since
existing parser is built to suit the needs of existing compiler and not
whatever needs phc authors had) and the parser being "unintuitive". I
also do not see how the number of states the parser passes is a measure
of anything and why 27 should looks like a horrible number. There are a
lot of things that can happen in expression, in function call and in
function argument, the parser must understand it all.

>
Actually, looking at the parse tree the authors gave, I think there's a 
bug in the diagram and there may only be 22.


How many of those things really are distinct, and how many are an 
artefact of the implementation? To pick just one recent example: why 
is the $x in "foreach($array as $x)" such a different beast from the $x 
in "$x = $array[0]" that the latter could be replaced by "list($a, $b)" 
but the former couldn't?



For the arguments given by phc's authors, what their needs were (a 
PHP->x86 assembly compiler), other use cases for a language 
specification, and what they've done about it, I redirect you to the 
site cited, since they know them better than I (having worked on it for 
longer than I have). Mainly


https://www.cs.tcd.ie/publications/tech-reports/reports.07/TCD-CS-2007-47.pdf

rather than try to summarise them here and have questions directed at 
the summary rather than the source (as has already happened).



As for r_variable, it's just an expression that is not writable (i.e.,
no printf("Hi")=1) - what's so strange about it?

Ah, so "r_variable" means "non-writable expression". Makes perfect 
sense. Incidentally, "r_variable" expands to "variable", then to 
"base_variable_with_function_calls", and _then_ to "function_call".


Then again, $a->b->c is _also_ an "r_variable", so does that mean I 
can't write "$a->b->c = 1;"?


Morgan L. Owens






Re: Re: [PHP-DEV] $obj->attr::method() is not allowed

2012-09-03 Thread Morgan L. Owens

On 2012-09-04 04:17, Laruence wrote:

On Tue, Sep 4, 2012 at 12:05 AM, Andrew Faulds  wrote:

On 03/09/12 17:03, Laruence wrote:


It is great parser(simple and fast) which make php works well for years..


No, it's not, it's overly complex. You have to define all sorts of different
expression and variable variations :(

if you don't, you will get a complex executor.




rewrite the parser for what?  for more academism?


No, because current one is constraining implementation of some things, and
making some things impossible. (we are limited, for instance, in options for
order that generator expressions (e.g. Python's [i*2 for i in range(6)])
could be)


we will rewrite it soon or later, but it's not because current one is
bad,  it will because a better parser tool shows up...


Well, current one is bad. It only works well for some common cases. For
others, it doesn't.

I really don't get it, you are saying current is bad, but every
language has defects,  either parser, or executor.

The problem is that the only formal definition of the language _is_ the 
parser - there's no grammar outside 
zend_language_scanner.l/zend_language_parser.y.


As well as making for a language full of special cases and gotchas about 
what expressions can be written where (and this thread only mentions a 
couple - everyone here can think of others) it also gets in the way of 
writing tools that work with the language.


The developers of phc (http://www.phpcompiler.org/) ended up having to 
write their own AST-based grammar (which may possibly serve as a 
starting point for a grammar rewrite) because PHP's own parsing of even 
simple expressions is so long-winded and unintuitive. That site gives a 
couple of examples: 'printf("Hi!");' (27 nodes, including 
"r_variable") and '$a->b->c' (26 nodes, including 
"base_variable_with_function_calls").


Morgan L. Owens





Re: Re: [PHP-DEV] Support negative indexes for arrays and strings

2012-09-01 Thread Morgan L. Owens

On 2012-09-01 20:17, Kris Craig wrote:


This discussion kinda reminds me of some of the debates over AUTO_INCREMENT
behavior in the MySQL community.  Specifically, they end up having to
tackle the same fundamental, conceptual dilemma:  If I
assign/insert/whatever an arbitrary value to a container that can be
incremented, and then I direct said container to generate the next
increment, what value should that be?  What's the most sensible (or perhaps
the least unsensible) way to determine that?

Well, in that case SQL provides sequences; iterators that are 
independent of the column that you might be using them for. You can 
specify an explicit value or default to "the next value from the 
sequence". If you have a UNIQUE constraint on the column in question a 
collision would be an error (but since the attempt would have caused the 
sequence to advance, you could just try again); if not then you end up 
with two records with the same value in that column.


So, in PHP terms, pretty much what there is now - a distinct sequence 
generator for the array - with the difference being that in the case of 
key collision the later value overwrites the former.
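
A minimal sketch of that behaviour, assuming only non-negative integer keys (the values and variable name are illustrative):

```php
<?php
$array = array();
$array[5] = 'explicit key';     // an explicit value, like naming the column value yourself
$array[]  = 'next in sequence'; // takes key 6: one past the largest integer key used so far
$array[5] = 'overwritten';      // key collision: the later value replaces the former
$array[]  = 'continues';        // takes key 7; the overwrite did not advance the sequence

print_r($array);
```

Unlike the SQL case with a UNIQUE constraint, nothing here reports the collision; the earlier value is simply gone.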






Re: Re: [PHP-DEV] Support negative indexes for arrays and strings

2012-09-01 Thread Morgan L. Owens

On 2012-09-01 21:23, Sherif Ramadan wrote:


$array[0] = 'first element';
$array[9] = 'second element';
var_dump($array);
/*
array(2) {
   [9]=>
   string(14) "second element"
   [0]=>
   string(13) "first element"
}
*/


Just correcting this as it was a copy/paste fail... The above code
would produce:

array(2) {
   [0]=>
   string(13) "first element"
   [9]=>
   string(14) "second element"
}



Or:


And, just to be explicit, following that up with:


===>

Array
(
[1] => 42
[0] => 17
[2] => 99
)
1   42
0   17
2   99





Re: Re: [PHP-DEV] [VOTE] Generators

2012-08-30 Thread Morgan L. Owens

On 2012-08-30 20:39, Derick Rethans wrote:

On Wed, 29 Aug 2012, Gustavo Lopes wrote:


On Wed, 29 Aug 2012 22:10:52 +0200, Derick Rethans  wrote:



Nothing in the core throws an exception, why would this?!


This is not accurate. All the iterators throw exceptions on similar
situations. Generators are iterators, so I see no deviation from the
norm here.


Nothing shows that they are. In any example I saw, I don't even see any
OO syntax/functionality. For me, "yield" is a core syntax *keyword*, and
hence: no exceptions.

cheers,
Derick

I'm generally of the same opinion as you - issue a warning and skip the 
second loop - but for the fact that the "idiomatic" use


<?php
foreach($generator as $k => $item)
{
    ...
}
?>

is equivalent to the more explicit "manual" use

<?php
$generator->rewind();
while($generator->valid())
{
    $k = $generator->key();
    $item = $generator->current();
    ...
    $generator->next();
}
?>

Since explicit control of generators can be useful, they won't 
necessarily be wrapped in the foreach() idiom; would not exceptions be 
the expected behaviour then?
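
To make the equivalence concrete, here is a small sketch written against generators as they eventually shipped in PHP 5.5 and later (the RFC under vote may have differed in detail). The letters() generator is a made-up example, not something from the RFC.

```php
<?php
// Hypothetical generator used only for illustration.
function letters()
{
    yield 'a' => 'alpha';
    yield 'b' => 'beta';
}

// Idiomatic use: foreach drives the Generator object.
$idiomatic = array();
foreach (letters() as $k => $item) {
    $idiomatic[$k] = $item;
}

// Manual use: the same Iterator methods, called explicitly.
$manual = array();
$generator = letters();
$generator->rewind();
while ($generator->valid()) {
    $k = $generator->key();
    $item = $generator->current();
    $manual[$k] = $item;
    $generator->next();
}

var_dump($idiomatic === $manual); // compares what the two loops collected
```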


There might be a compromise - trigger a warning in idiomatic use, an 
exception in manual use - but that leaves a nasty taste in the mouth and 
would just cause more problems later.






Re: Re: [PHP-DEV] [RFC] Generators

2012-08-23 Thread Morgan L. Owens

On 2012-08-23 00:10, Lester Caine wrote:


Then the next example is an 'iterator' ... which you are right ... I do
not appreciate either, because they require an insane amount of overhead
for what would be easy if the first example had been done right! I did
try them, in the past, but the overhead outweighed any advantage and I
can't find them in any of the projects I work with apart as a blank
frame in ADOdb ... which nobody seems to use.

But as far as I've been able to make out, your method _still_ has the 
overhead of maintaining state during the iteration - only now it's each 
consumer's responsibility to remember its state between successive 
notifications by the producer. The producer can't help with that, 
because each consumer has different maintenance requirements; managing 
the iteration process has therefore been split across two distinct 
objects with the consumer doing the hard part (which is why the producer 
looks so simple).


I suppose if you're lucky the consumer doesn't need to remember any 
state between one notification and the next, and it can treat every 
datum like it's the first. But that is hardly the general case.





To describe iterators further, here are a few links (note that they're 
not necessarily the best - they're just the first non-Wikipedia, 
non-Amazon links I got when Googling for "Design Patterns"):


http://www.dofactory.com/Patterns/PatternIterator.aspx
http://www.oodesign.com/iterator-pattern.html
http://sourcemaking.com/design_patterns/iterator

It might also help to compare the Observer pattern, in the special case 
where the one-many relationship is actually one-one.






Re: Re: [PHP-DEV] [RFC] Generators

2012-08-23 Thread Morgan L. Owens

On 2012-08-23 02:03, Lester Caine wrote:


I accept your point about not caring about how the data was created, but
on the other side, if the data creation is handling a lot more data than
the consumer needs there is an amount of processing time that is wasted.
The quick way of doing something does not equate to the best way of
doing it.

Only if the producer does work unnecessary for determining the next 
datum required by the consumer. It doesn't have to create all the data 
at once (if it did you might as well stuff it all in a big array and use 
that).
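
As a rough illustration (the squares() producer is hypothetical, and this assumes generators as shipped in PHP 5.5+), a producer can describe an unbounded sequence while only ever computing the elements the consumer actually asks for:

```php
<?php
// Hypothetical producer: an unbounded sequence of squares.
function squares()
{
    for ($n = 1; ; $n++) {
        yield $n * $n; // computed only when the consumer asks for the next value
    }
}

// The consumer stops as soon as it has what it needs;
// no further squares are ever calculated.
$taken = array();
foreach (squares() as $square) {
    if ($square > 50) {
        break;
    }
    $taken[] = $square;
}

print_r($taken);
```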







Re: Re: [PHP-DEV] Official Userland Library (was: removing an item from an array)

2012-08-22 Thread Morgan L. Owens

On 2012-08-22 19:45, Lester Caine wrote:


Personally I'm looking for a 'Official Userland Library' that provides
EXAMPLES of how to do operations rather than yet another downloadable
library.

This is also the sort of thing I would like to see. The examples within 
the manual are written for exposition of the feature being described, 
not for Best Practice in their workaday use.


One example of the difference just off the top of my head is that weekly 
I see people asking about why they get "Undefined index" errors because 
they didn't check that $_POST['foo'] existed before they tried to use 
its value.


The page on external variables doesn't mention the fact that variables 
that might not exist need to be checked first, and none of the examples 
make such checks except for #4. (Nor should it have to, because the page 
is about external variables specifically, and this fact is true of 
variables in general - and _it_ is mentioned on the variable basics page).
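
The kind of workaday example I have in mind might look like the sketch below. post_value() is a hypothetical helper name, and a plain array stands in for $_POST so the snippet runs outside a web request.

```php
<?php
// Hypothetical helper: read a key that may be absent from request data.
function post_value(array $post, $key, $default = null)
{
    // isset() is false for a missing key (and for null),
    // so no "Undefined index" notice is raised.
    return isset($post[$key]) ? $post[$key] : $default;
}

$post = array('foo' => 'bar'); // stands in for $_POST
echo post_value($post, 'foo'), "\n";
echo post_value($post, 'missing', '(not sent)'), "\n";
```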


>

There are a few such 'samples' in the manual, but the user added ones
tend to be messy, and finding them is rather hit and miss. I HAVE found
good samples, but getting back to them can be a problem.

>
User submission could be a source of examples, but any collection would 
require expert curating to ensure quality and maintain currency. So it 
would mean work for someone.








Re: Re: [PHP-DEV] [RFC] Generators

2012-08-21 Thread Morgan L. Owens

Oh, yes...


On 2012-08-22 04:35, Lester Caine wrote:

 and doesn't seem to
provide any outstanding advantages?



As I wrote in an earlier post:

On 2012-08-09 15:30, Morgan L. Owens wrote:
> I for one am lazy, and would much prefer writing:
>
> <?php
> function append_iterator($first, $second)
> {
>     foreach($first as $i)
>     {
>         yield $i;
>     }
>     foreach($second as $i)
>     {
>         yield $i;
>     }
> }
> ?>
>
> to
>
>  87 lines elided
> ?>

I'm not making any assumption about where the generator returned by 
append_iterator() is going to be used. In particular, it might not be in 
a public object method or other callable. But its use would probably 
look something like:
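
A minimal sketch of such a use; the input data and variable names are assumptions for illustration, and it relies on generators as shipped in PHP 5.5+:

```php
<?php
function append_iterator($first, $second)
{
    foreach ($first as $i) {
        yield $i;
    }
    foreach ($second as $i) {
        yield $i;
    }
}

// Any two traversable things will do: here an array and an ArrayIterator.
$weekdays = array('Mon', 'Tue');
$weekend  = new ArrayIterator(array('Sat', 'Sun'));

$seen = array();
foreach (append_iterator($weekdays, $weekend) as $day) {
    $seen[] = $day;
}

echo implode(', ', $seen), "\n";
```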









Re: Re: [PHP-DEV] [RFC] Generators

2012-08-21 Thread Morgan L. Owens

On 2012-08-22 04:35, Lester Caine wrote:

John LeSueur wrote:

Again, the case you've cited is probably not a case where generators
give much
advantage. But anytime the logic to produce some data becomes more
complicated
(Imagine that you needed to skip some of the lines in the .csv file
based on
some kind of filter), then being able to extricate the producer logic
from the
consumer logic is helpful.


But in my 'method of working' one simply changes the '$contact->' object
and loads the one that matches the data you are handling. Sage, paypal,
moneybox, nlpg and so on. I've a dozen or so data models currently.

>
That is one of the differences. With the Iterator pattern that is being 
automated here, all the boilerplate of maintaining state between 
iteration steps is contained within the iterator object, and is the same 
for all uses of that object, so only needs to be written once (and, with 
this extension, that writing is automated). Using the Observer pattern 
like you do requires boilerplate to be written afresh for every distinct 
_use_ of the iterator. That boilerplate isn't directly related to the 
consumer's actual job but to the iteration process (limiting the range 
of producers that consumer can be fed from).



but I still can't see why the work flow has to be
reorganised rather than simply following the natural flow that the job
requires.

The adjective "natural" would only apply if you always see iteration as 
the producer pushing its productions out to the consumer ("here - go 
work on this"), and never picture the situation as the consumer pulling 
them from the producer ("give me something to work on"). Otherwise it's 
just a different perspective.


The Observer pattern certainly does have its uses (otherwise it wouldn't 
be a pattern): if the "things being produced" are interpreted as "events 
being handled" ("hey, a new line of CSV has just come in!"), there may 
be multiple consumers interested in being notified when those events occur.






Re: Re: [PHP-DEV] [RFC] Generators

2012-08-20 Thread Morgan L. Owens

On 2012-08-21 01:10, Lester Caine wrote:


For the third one ... I'm still waiting for some clarification on how
yield is SUPPOSED to work anyway? If you are using a 'generator' to
return a sequence of data elements, then just what does happen between
each call to the generator ... DOES the generator get called for each
value until something flags there are no more? I still don't see the
advantage of restructuring how things are called to get around a problem
which can just as easily be solved by proper design in the first place.

Have you even read the RFC yet? Calling a generator function returns an 
instance of a Generator object. That object has a method interface - 
described in the RFC. That interface is accessible to the foreach() 
construct so the loop can iterate over the successive values returned by 
the object.



If by "proper design" you mean your own personal preference for using 
callbacks, that's also discussed in the RFC. You were told this last month:


On Sat, Jul 28, 2012 at 05:34 AM, Nikita Popov wrote:
> On Wed, Jul 25, 2012 at 10:36 PM, Lester Caine wrote:
>> But WHY would you not simply put the function that is handling the
>> returned data in place of the yield?
>> Why do you want to keep exiting and re-entering the 'do loop' when
>> the data can simply be forwarded direct to a function to handle it?
>
> This question has come up a few times now, i.e. why one can't just use
> callbacks. So I added a section about this, explaining it with a few
> examples:
> https://wiki.php.net/rfc/generators#why_not_just_use_callback_functions
>
> Hope it helps,
> Nikita



But let's compare the two side-by-side.

So there is a "producer" that has or generates the successive elements 
of a sequence, and a "consumer" that does something with each element in 
turn. One can make a choice about which of the two - producer or 
consumer - gets to drive the process.


In one (basically the Observer pattern), the producer drives, and the 
consumer gets to observe by supplying a callback that the producer calls 
for each new element. Since the producer has no idea about the internal 
workings of the consumer, the consumer has to ensure that it can 
properly resume the next time it is called.


In the other (basically the Iterator pattern), the consumer drives, 
calling the producer each time a new element is required. Since the 
consumer has no idea about the internal workings of the producer, the 
producer has to ensure that it can properly resume the next time it is 
called.


There are a couple of differences between using an observer and using an 
iterator. The main difference arises because at least one of the producer 
or consumer needs to maintain its own state between iterations.


For the producer-driven observer approach, it is the consumer's job to 
maintain state - the producer's state is safe on the call stack.


For the consumer-driven iterator approach, it is the producer's job to 
maintain state - the consumer's state is safe on the call stack.


In the observer approach, writing the state-maintenance code falls to 
whoever is writing the consumer; the producer is unable to help with 
that at all because it's different for each consumer.


In the iterator approach, writing the state-maintenance code falls to 
whoever is writing the producer; that code is the same regardless of the 
consumer.
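
A side-by-side sketch may help. The task (differences between consecutive elements) and all function names are invented for illustration, and the generator half assumes PHP 5.5+.

```php
<?php
// Observer style: the producer drives and calls the consumer once per element.
function produce_with_callback($callback)
{
    foreach (array(3, 5, 9) as $value) {
        $callback($value);
    }
}

// The consumer has to carry its own state ($previous) from one call to the next.
$previous = null;
$observerDiffs = array();
produce_with_callback(function ($value) use (&$previous, &$observerDiffs) {
    if ($previous !== null) {
        $observerDiffs[] = $value - $previous;
    }
    $previous = $value;
});

// Iterator style: the consumer drives; the generator keeps the producer's state.
function produce_with_generator()
{
    foreach (array(3, 5, 9) as $value) {
        yield $value;
    }
}

// Here $last is an ordinary local of one loop; nothing has to be
// saved and restored between elements.
$iteratorDiffs = array();
$last = null;
foreach (produce_with_generator() as $value) {
    if ($last !== null) {
        $iteratorDiffs[] = $value - $last;
    }
    $last = $value;
}

print_r($observerDiffs);
print_r($iteratorDiffs);
```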


As this proposal and existing implementations show, it is possible, 
given the kernel behaviour of the iteration, to mechanically generate 
all the state-handling and other boilerplate necessary to produce a 
Generator object.


That's the main difference. The other difference is that in the case of 
the iterator, in part because the code generation can be mechanised, the 
iteration mechanism can be done in such a way that it fits in with the 
iteration mechanisms that PHP already has built into the language: the 
foreach() statement doesn't care whether it's iterating over an array, 
an iterator or a generator. Any of them could be supplied at runtime.
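
For instance, in the sketch below (total() and one_two_three() are hypothetical names; generators as in PHP 5.5+), the consumer is written once and never learns which kind of sequence it was handed:

```php
<?php
// Hypothetical consumer: it only assumes its argument can be used in foreach.
function total($sequence)
{
    $sum = 0;
    foreach ($sequence as $value) {
        $sum += $value;
    }
    return $sum;
}

function one_two_three()
{
    yield 1;
    yield 2;
    yield 3;
}

echo total(array(1, 2, 3)), "\n";                    // an array
echo total(new ArrayIterator(array(1, 2, 3))), "\n"; // an iterator
echo total(one_two_three()), "\n";                   // a generator
```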


In contrast, to achieve the analogous effect with an observer mechanism 
would involve (among other things) searching for occurrences of control 
flow statements that reference the sequence, excising those statements 
and using them as the kernel for a callback method to be supplied to the 
sequence producer ... but only if the sequence in question wants all 
that done in the first place.






Re: Re: [PHP-DEV] removing an item from an array

2012-08-18 Thread Morgan L. Owens

On 2012-08-19 04:08, Levi Morrison wrote:

On Sat, Aug 18, 2012 at 12:42 AM, Alexey Zakhlestin  wrote:


On 16.08.2012, at 0:18, Rasmus Schultz  wrote:


How come there is no straightforward obvious way to simply remove a given
value from an array?


Well, this sounds like a reason for creating SplSet class



There's already SplObjectStorage which CAN act like a Set for objects
only.  It's a terrible solution in my opinion and am working on
creating a proper one.  I don't know if that effort will be accepted,
but I wanted to point out that a set already exists in the SPL.


SplObjectStorage would indeed cover one given example:

On 2012-08-19 10:18, Andrew Faulds wrote:
> Hmm. I can think of a particular instance I've needed to remove an
> item from an array by value, and where keys wouldn't be an option. In
> PHP you can only have string or int keys, you can't have object keys.
> In Python, I've used Twisted (an excellent asynchronous OOP
> networking framework) to write client-server applications. One thing
> I did was maintain a list of clients (that is, a Python list). When a
> client disconnected, I would do list.remove(client). Of course there
> might be better ways to implement this, but it was simple and worked.
> Anything wrong with allowing the same in PHP?
>
SplObjectStorage even has a detach() method for removing an element by 
value, as well as implementing the various array access interfaces.
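
A small sketch of the client-list case, using the array access form; the stdClass objects are stand-ins for real connection objects and the names are made up:

```php
<?php
// Hypothetical client objects standing in for connections.
$alice = new stdClass();
$bob   = new stdClass();

$clients = new SplObjectStorage();
$clients[$alice] = 'alice'; // ArrayAccess: the object itself is the key
$clients[$bob]   = 'bob';

// Remove one client by value (that is, by object identity);
// detach($bob) is the method form of the same operation.
unset($clients[$bob]);

var_dump(count($clients));
var_dump(isset($clients[$alice]), isset($clients[$bob]));
```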







Re: [PHP-DEV] re: removing an item from an array

2012-08-18 Thread Morgan L. Owens

On 2012-08-19 10:25, Andrew Faulds wrote:

On 18/08/12 14:52, Morgan L. Owens wrote:

How simple is it? Does it:

1) Remove one occurrence of the element (presumably the first) or all?
2) Reindex the array (as someone else argued was necessary to make it
"properly indexed" afterwards) or not?
3) Modify the array in-place or return a modified array?
4) Use type-strict or normal comparisons?


So to answer you, 1) one, 2), no, 3) in-place, 4) type-strict (I don't
see how a weakly-typed comparison for identifying an array value would
help anyone).

So ... using a different definition of "value" than that used by 
array_search() and array_keys(), then?






[PHP-DEV] re: removing an item from an array

2012-08-18 Thread Morgan L. Owens

Rasmus Schultz wrote:

> I disagree - this is (or should be) a simple, atomic operation...
> yet, you've got a function-call, an intermediary variable, a boolean
> test, and an unset statement repeating the name of the array you're
> deleting from.
>
> This should be a simple statement or function/method-call, and in
> most other languages it would be...

How simple is it? Does it:

1) Remove one occurrence of the element (presumably the first) or all?
2) Reindex the array (as someone else argued was necessary to make it 
"properly indexed" afterwards) or not?

3) Modify the array in-place or return a modified array?
4) Use type-strict or normal comparisons?

That's sixteen different interpretations of "remove a given value from 
an array" right there. Leave support for any of them out and someone is 
going to complain that they have to write a function to do what should 
be a "simple statement". Support all of them and there will probably 
_still_ be someone who wants their own particular set of circumstances 
catered for without having to write their own function to do it.
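
To show how quickly the options pile up, here is a sketch of a single function trying to cover them. remove_value() and its flags are hypothetical, not a proposal; three of the four choices become parameters and the fourth (in-place or not) is left to the caller.

```php
<?php
// Hypothetical helper; the name and option flags are illustrative only.
// In-place vs. returning a copy is decided by the caller: assign the
// result back to the same variable, or to a new one.
function remove_value(array $array, $value, $all = true, $reindex = false, $strict = true)
{
    // array_keys() with a search value returns every key holding that value;
    // its third argument selects strict (===) comparison.
    $keys = array_keys($array, $value, $strict);
    if (!$all) {
        $keys = array_slice($keys, 0, 1); // first occurrence only
    }
    foreach ($keys as $key) {
        unset($array[$key]);
    }
    return $reindex ? array_values($array) : $array;
}

$data = array(1, '1', 2, 1);
print_r(remove_value($data, 1));                     // strict, all occurrences, keys kept
print_r(remove_value($data, 1, false, true, false)); // loose, first occurrence, reindexed
```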





Re: [PHP-DEV] removing an item from an array

2012-08-15 Thread Morgan L. Owens

On 2012-08-16 17:55, Sherif Ramadan wrote:
> That doesn't make any sense. What if the values are present more than
> once? array_flip will cause the keys to be overwritten.
>
Not to mention converting all of the array's elements to strings and/or 
integers.




Now your array is something completely different from what you wanted.
The solution stated earlier is the most sane one (just using
array_keys() with a search value).

The problem isn't very complicated and doesn't require a complex solution.

This thread is overstating a rudimentary problem (and that's the lack
of understanding PHP arrays).


I was actually alluding to something more disturbing:

...which makes array_diff and its ilk only useful for scalar elements 
(hence the need for the more complicated approach using array_keys(), 
which provides a strictness flag).
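
One way to see the limitation, as a sketch with made-up sample values: array_diff() treats two elements as equal when their string representations match, whereas array_keys() can be told to compare strictly.

```php
<?php
$values = array(1, '1', 1.0, true);

// array_diff() compares elements by their string form: all four become "1".
$afterDiff = array_diff($values, array('1'));

// array_keys() with strict = true matches only the element identical to '1'.
$afterStrict = $values;
foreach (array_keys($values, '1', true) as $key) {
    unset($afterStrict[$key]);
}

var_dump($afterDiff);
var_dump($afterStrict);
```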







Re: Re: [PHP-DEV] removing an item from an array

2012-08-15 Thread Morgan L. Owens

On 2012-08-16 08:27, Nikita Popov wrote:

On Wed, Aug 15, 2012 at 10:22 PM, Stas Malyshev  wrote:

Hi!


How come there is no straightforward obvious way to simply remove a given
value from an array?
Just look at the number of horrible ways people solve this obvious problem:


I see:
if(($key = array_search($del_val, $messages)) !== false) {
 unset($messages[$key]);
}

Nothing horrible here.


Btw, deleting all values (not just the first) is also very easy currently:

foreach (array_keys($array, $delValue) as $key) {
 unset($array[$key]);
}


$array = array_diff($array, [$delValue]);
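
Putting the three approaches from this thread next to each other, as a sketch with sample data of my own choosing:

```php
<?php
$messages = array('a', 'b', 'a', 'c');
$delValue = 'a';

// First occurrence only: array_search() returns one key (or false).
$first = $messages;
if (($key = array_search($delValue, $first)) !== false) {
    unset($first[$key]);
}

// Every occurrence, via array_keys() with a search value.
$all = $messages;
foreach (array_keys($all, $delValue) as $key) {
    unset($all[$key]);
}

// Every occurrence, via array_diff(); the surviving keys are preserved here too.
$diff = array_diff($messages, array($delValue));

print_r($first);
print_r($all);
print_r($diff);
```

None of the three reindexes the array; array_values() would be the extra step for that.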






Re: [PHP-DEV] Generators in PHP

2012-08-09 Thread Morgan L. Owens

hakre wrote:

>> Also, currently yield looks very similar to return and I think this
>> is a nice thing as it is similar semantically. yield($foo) would
>> give it different semantics, imho.
>
> I love this point a lot. Return is very common and yield is some
> kind of return.
>
I agree also: yield behaves far more like a return statement than a 
function call. (In a sense, a yield is a return that doesn't discard the 
current function's execution state.)


Which (as it happens) reminds me of the difference between
return $foo;
and
return ($foo);




Re: [PHP-DEV] Generators in PHP

2012-08-09 Thread Morgan L. Owens

Mike Ford wrote:
>
> The signposting needn't even be as in-your-face as a generator
> keyword (either instead of or in addition to function): I could get
> behind a variation such as:
>
> function f($x, $y) yields { ... yield $z; ... }
>
> Or even (stretching a bit to re-use an existing keyword!):
>
> function f($x, $y) return { ... yield $z; ... }
>
> Although I like the concept of generators, I would be -1 for any
> implementation that doesn't differentiate them in some way from
> regular functions.
>
In other words you want to have return-value type-hinting _in one 
specific instance_: when calling f() returns a Generator object.







Re: Re: [PHP-DEV] Generators in PHP

2012-08-08 Thread Morgan L. Owens

On 2012-08-09 08:42, Nikita Popov wrote:


Without parentheses their behavior in array definitions and nested
yields is ambiguous:

array(yield $key => $value)
// can be either
array((yield $key) => $value)
// or
array((yield $key => $value))

yield yield $key => $value;
// can be either
yield (yield $key) => $value;
// or
yield (yield $key => $value);

Apart from that particular case there is the general lack of clarity
about operator precedence, e.g.

yield $foo . $bar;
// could be
(yield $foo) . $bar;
// or
yield ($foo . $bar);

Is this complicating yield a bit too much? All these ambiguities would 
go away if 'yield' had the same grammatical status as 'return' - in 
other words, if it were treated as a control-flow keyword rather than as 
an operator.
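
For reference, a minimal sketch of how the grouping question plays out in a PHP build that has generators (5.5 or later), where yield remained an expression with very low precedence. The function name concat() is hypothetical and only there for illustration.

```php
<?php
// Sketch only: concat() is a hypothetical name used for illustration.
function concat($foo, $bar)
{
    // Parsed as yield ($foo . $bar), not (yield $foo) . $bar
    yield $foo . $bar;

    // The value passed to send() becomes the result of the yield
    // expression; the parentheses make the grouping explicit.
    $received = (yield 'ready');
    yield 'got ' . $received;
}

$gen = concat('a', 'b');
$first = $gen->current();      // value of the first yield
$gen->next();                  // advance to the second yield ('ready')
$reply = $gen->send('hello');  // resumes and returns the next yielded value
```

Writing the parentheses whenever the value of a yield is used sidesteps every ambiguity listed above, whichever way the grammar resolves them.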







Re: Re: [PHP-DEV] Re: Generators in PHP

2012-08-08 Thread Morgan L. Owens

On 2012-08-09 14:25, Larry Garfield wrote:

On 07/27/2012 07:23 AM, Lester Caine wrote:


Nikita - I am looking for a well reasoned argument as to why generator
has to be added at all! 'Just because it can be' is not a valid
argument, but perhaps you could add to the RFC the performance
implication or advantage of what is being proposed. That would at
least be some comparison with the current methods of doing the same
thing?



Anthony had a very good writeup on generators and how they compare to
iterators last week:

http://blog.ircmaxell.com/2012/07/what-generators-can-do-for-you.html

I think that does a good job of laying out the case for generators as
"low-effort iterators".


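I for one am lazy, and would much prefer writing:

[generator version not preserved in the archive]

to

<?php
// The class header, property declarations and constructor signature were
// lost in archiving; the name ConcatIterator and the signature below are
// reconstructions, not the original text.
class ConcatIterator implements Iterator
{
    private $first;
    private $second;
    private $_state = 0;
    private $_current;
    private $_key;

    public function __construct(Iterator $first, Iterator $second)
    {
        $this->first = $first;
        $this->second = $second;
    }

    public function rewind()
    {
        $this->first->rewind();
        $this->second->rewind();
        $this->_state = 0;
        // Load the first element so current()/key() are valid after rewind().
        $this->next();
    }

    public function current()
    {
        return $this->_current;
    }

    public function key()
    {
        return $this->_key;
    }

    public function next()
    {
        switch($this->_state)
        {
            case -1:
                return;
            case 0:
                $this->first->rewind();
                $this->_state = 1;
                // fall through
            case 1:
                if($this->first->valid())
                {
                    $this->_current = $this->first->current();
                    $this->_key = $this->first->key();
                    $this->first->next();
                    return;
                }
                else
                {
                    $this->second->rewind();
                    $this->_state = 2;
                }
                // fall through
            case 2:
                if($this->second->valid())
                {
                    $this->_current = $this->second->current();
                    $this->_key = $this->second->key();
                    $this->second->next();
                    return;
                }
                else
                {
                    $this->_state = -1;
                    return;
                }
        }
    }

    public function valid()
    {
        return $this->_state != -1;
    }
}
?>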
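
The generator half of this comparison did not survive in the archive. As a rough sketch of what a generator doing the same job as the hand-written iterator class could look like (the function name and signature here are assumptions, not the original text):

```php
<?php
// Hypothetical reconstruction: the function name and signature are assumed.
function concat_iterators(Iterator $first, Iterator $second)
{
    // Everything from the first iterator, then everything from the second.
    foreach ($first as $key => $value) {
        yield $key => $value;
    }
    foreach ($second as $key => $value) {
        yield $key => $value;
    }
}

$pairs = [];
$both = concat_iterators(
    new ArrayIterator(['a' => 1]),
    new ArrayIterator(['b' => 2, 'c' => 3])
);
foreach ($both as $k => $v) {
    $pairs[] = [$k, $v];
}
```

The state machine that the class has to spell out by hand is here just the position of the suspended function, which is the "low-effort iterator" argument in miniature.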



One question, though: It looks based on the voting like finally {}
blocks are going in.  So... what should happen in the following situation:

function stuff() {
  try {
    foreach (range(1, 100) as $i) {
      yield $i;
    }
  }
  finally {
    print "All done";
  }
}

Does "All done" get printed once, or 101 times?  Similarly:

function things() {
  $i = 1;
  try {
    while (true) {
      yield $i++;
    }
  }
  finally {
    print "All done";
  }
}

That will run indefinitely.  So will "All done" ever print, or does that
finally become unreachable?

My own gut expectation (the answers that would surprise me the least) 
would be (a) once (because the foreach is _inside_ the try block, and 
the catch/finally handler is hence _outside_ the loop); and (b) the 
finally{} is unreachable (assuming that loop doesn't manage to throw an 
exception!) because control flow never gets to the end of the try block.
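
Expectation (a) is easy to check on any build that has both generators and finally (PHP 5.5 or later). The sketch below counts how often the finally block runs; the $counter object is a hypothetical helper introduced only for the experiment.

```php
<?php
// Sketch: $counter is a hypothetical helper object used only to count
// how many times the finally block runs.
function stuff(stdClass $counter)
{
    try {
        foreach (range(1, 100) as $i) {
            yield $i;
        }
    } finally {
        $counter->finallyRuns++;
    }
}

$counter = new stdClass;
$counter->finallyRuns = 0;

$seen = 0;
foreach (stuff($counter) as $i) {
    $seen++;
}
```

Case (b) is less clear-cut: my understanding is that released versions run a pending finally block when an unfinished generator is destroyed, so the block need not be unreachable in practice. That is worth verifying rather than taking on trust.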






Re: [PHP-DEV] PHP 5.x Documentend End Of Life Dates

2012-08-02 Thread Morgan L. Owens

On 2012-08-02 20:42, Peter Cowburn wrote:

On 2 August 2012 09:36, Morgan L. Owens  wrote:

Just as each release announcement is dated, with details kept on a distinct
page (linked in that list), all that's needed there is a date when support
ended, with (any available) information on what was obsoleted in the process
also on a distinct page - avoid cluttering up the release list.



The details on things being "obsoleted" should be in the migration guides.

Then that would be where the links go, in a similar manner to the 
Changelog links.






Re: Re: [PHP-DEV] PHP 5.x Documentend End Of Life Dates

2012-08-02 Thread Morgan L. Owens

On 2012-08-02 19:56, Peter Cowburn wrote:

On 2 August 2012 07:35, Adam Harvey  wrote:

Thoughts? (Do we even want to auto-fill this from $OLDRELEASES, or
would we rather have a manual array?) Specific notes on
vulnerabilities to add to branches? Better versions of the copy in the
initial blurb?


Why is this information not just displayed on "Unsupported Historical
Releases" (/releases)? Do you have plans for the EOL page that
wouldn't be suitable for the other one?  My worry is spreading this
information across multiple places, when we already have a very
detailed page about unsupported versions.  The X.Y version table could
be listed at the top of /releases to give a quick summary, before the
detail for each individual version listed chronologically.  We could
add more notes like the existing "Support for PHP 4 has been
discontinued since 2007-12-31." at the appropriate places in that
list.

Just as each release announcement is dated, with details kept on a distinct 
page (linked in that list), all that's needed there is a date when 
support ended, with (any available) information on what was obsoleted in 
the process also on a distinct page - avoid cluttering up the release list.






Re: [PHP-DEV] Re: [RFC] foreach_variable supporting T_LIST

2012-07-21 Thread Morgan L. Owens

On 2012-07-21 12:19, Stas Malyshev wrote:

Hi!


So when I have a function that has a two- or multi-part result then -
instead of having one part as the return value and the others by
reference - in Python I'd return a tuple.


PHP functions can return arrays and some do - e.g. pathinfo(). But not
all of them. Maybe getting more functions return result sets instead of
modifying parameters could be one target of that API refactoring people
are talking about. However, in some case - e.g. function returning
result and maybe error code - tuple may not be that convenient, in most
cases you just want the result, and unpacking tuple (or array in PHP)
would be additional complication.

Oh, I'm talking about my own user functions, not changing any native PHP 
functionality; things like returning a lat/long pair, or 
measurement/precision, or - well, pretty much any situation where a 
function returns an array with a known sequence of elements.


If it can happen once (hence the existence of list in the first place) 
it can happen several times (in the iteration of a loop).


Falling back on reference parameters makes it really hard to get at 
return values functionally:


function foo_wrapper($arg)
{
    $t2 = null;
    $t1 = foo($arg, $t2); // $t2 passed by reference
    return [$t1, $t2];
}
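
To make the contrast concrete, here is a small sketch. The body of foo() is a hypothetical stand-in (quotient returned, remainder written through the reference); only the wrapper's shape comes from the text above.

```php
<?php
// Hypothetical stand-in for a by-reference API: returns the quotient and
// writes the remainder into $remainder.
function foo($arg, &$remainder)
{
    $remainder = $arg % 7;
    return (int) (($arg - $remainder) / 7);
}

function foo_wrapper($arg)
{
    $t2 = null;
    $t1 = foo($arg, $t2); // $t2 passed by reference
    return [$t1, $t2];
}

// With a tuple-style return both parts can be unpacked in one step...
list($quotient, $remainder) = foo_wrapper(23);

// ...and the function composes with array_map, which foo() itself cannot.
$pairs = array_map('foo_wrapper', [7, 8, 23]);
```

The wrapper is pure boilerplate, which is the point: every by-reference function needs one before it can be used functionally.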






Re: Re: [PHP-DEV] Re: [RFC] foreach_variable supporting T_LIST

2012-07-20 Thread Morgan L. Owens

On 2012-07-21 10:28, Stas Malyshev wrote:

Hi!


If I understand this correctly, this is like what Python lets you do
with tuples. It's handy for getting vector components, hostnames and
port numbers, etc. (I apologise for the Python comparison, it is just
the language where I usually encounter this, and it makes heavy use of
foreach-style loops and tuples)


There's no need to apologize for Python comparison, Python is not a
dirty word :) However, in PHP functions rarely return sets of tuples
that can be manageably unpacked by this foreach syntax - usually it's
either something like DB result set, which has unpredictable number of
values, or one set of values, which doesn't need foreach. That's why I
wanted to see a use case where this is beneficial.

The number of values in a DB result set may be unpredictable, but the 
number of _elements_ in each value shouldn't be - and the list() will be 
on those elements:


foreach($resultset as list($id, $name, $address, $phone))
{
  
}

(Assuming I don't use array_map instead - $resultset might not be an array.)
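
A runnable version of that loop, assuming a PHP build where list() is accepted in foreach (this went in with 5.5, as far as I know). The sample rows are made up, and an ArrayIterator stands in for a result set that is traversable but not an array.

```php
<?php
// Hypothetical result set: any traversable of fixed-width rows would do.
$resultset = new ArrayIterator([
    [1, 'Ann', '1 High St', '555-0100'],
    [2, 'Bob', '2 Low Rd',  '555-0101'],
]);

$labels = [];
foreach ($resultset as list($id, $name, $address, $phone)) {
    // Each row is unpacked positionally into the four variables.
    $labels[] = "$id: $name, $address ($phone)";
}
```

The number of rows is open-ended, but each row has a known width, which is all list() needs.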

Incidentally this also touches on something you say later:

On 2012-07-21 11:03, Stas Malyshev wrote:
>
> IMHO assignment is always more readable than function with magic
> unobvious effects. I'd rather always have assignments than scratch my
> head each time reading the code - is this the one that modifies the
> argument or the one that doesn't?
>
I agree fully with this; having functions actually behaving like, well, 
_functions_ makes it much easier to see what's going on (I don't have to 
look up the definition of foo($bar) to see whether it's modifying $bar 
or not).


So when I have a function that has a two- or multi-part result then - 
instead of having one part as the return value and the others by 
reference - in Python I'd return a tuple.


I like what has been done to improve support for functional programming 
(despite the elephant in the room); I guess that's why I'm cold to the 
other proposal to graft OO syntax onto non-OO semantics.






Re: Re: [PHP-DEV] [proposal + pull request] Replace logo GUIDs with data URIs

2012-07-15 Thread Morgan L. Owens

On 2012-07-15 09:48, Stas Malyshev wrote:

Hi!


And I actually know of websites using the functions to display the logo.
Is there some way we could provide a BC function for it somehow?
Maybe rather than removing the functions, make them return the data URIs?


Having the functions to get the images sounds like a good idea, some
sites may want to use them to display the logos. However, I don't think
we should use the same function, as then deciding what the function
actually does is complicated. I'd rather prefer doing something like:

if(function_exists('php_logo_guid')) {
    $url = "/index.php?=".php_logo_guid();
} else if(function_exists('php_logo_url')) {
    $url = php_logo_url();
}
if(!empty($url))
    echo "<img src=\"$url\">";

it's clear what each function gives me then.



There are three of these functions: one for the PHP logo, one for the 
Zend logo, and one easter egg.


If php_logo_guid()/php_logo_url() could take an argument ('php', 'zend', 
'easteregg') all three could be handled by the one function.


By itself this wouldn't change anything because the other functions 
would need to stay around for BC - at least for a while. But it opens up 
the possibility of extending the mechanism to the authors of third-party 
extensions: a ready source of site badges, and a visual identifier on 
phpinfo() pages.





Re: Re: [PHP-DEV] 6.0 And Moving Forward

2012-07-14 Thread Morgan L. Owens

On 2012-07-14 04:12, jpauli wrote:

On Fri, Jul 13, 2012 at 5:33 PM, Anthony Ferrara  wrote:


4. Rewrite the entire parser completely. I keep hearing about how bad PHP's
parser is, and how it's growing out of control. Perhaps this is a good time
to rewrite it (perhaps changing semantics slightly) to be better adapted
towards future changes...


We have an RFC and a patch playing with lemon parser. Actually, just a
conversion, reentrant.
Refer to Pierrick or Felipe about it :-)

Will the conversion include writing a formal definition of the language 
from the perspective of someone planning to write a PHP parser? Right 
now the only way to get such a definition is to reverse-engineer the 
Bison source.






Re: Re: [PHP-DEV] [DRAFT] RFC - array_column() function

2012-06-27 Thread Morgan L. Owens

On 2012-06-26 07:22, Ben Ramsey wrote:


However, in Prototype.js and Underscore.js, pluck seems to behave more like
array_map() in PHP:

http://api.prototypejs.org/language/Enumerable/prototype/pluck/
http://documentcloud.github.com/underscore/#pluck

Nevertheless, it would technically have the same effect as the "column"
functionality, since calling that method/property in Javascript simply
returns the value of the property or result of the method call.

Remember that, in JavaScript, "foo.length" is equivalent to 
"foo['length']". The use case for map that pluck was created to cover is 
basically the same one here - essentially:


function array_{column|pluck}($array, $key)
{
    return array_map(function($element) use($key)
    {
        return $element[$key];
    }, $array);
}

Or, as it's implemented in Prototype.js (which inlines the map invocation),

function pluck(property) {
    var results = [];
    this.each(function(value) {
        results.push(value[property]);
    });
    return results;
}


I will say I'm not sold on "pluck" since it describes the physical 
action (a sharp tug) rather than the intended result (you pluck feathers 
from a chicken but you pluck fruit from a tree). Other alternatives to 
array_column that have crossed my mind include:


An extra argument to array_values() analogous to the extra argument to 
array_keys(): I'm worried the analogy isn't close enough to excuse the 
differences.

array_project(): too mathematically esoteric
array_select(): potentially also overloaded - what next, array_join()? - 
but I admit that this is the name I typically use when I write the sort 
of function that I gave above.


  ***

One thing about the existing implementation: it doesn't retain the keys 
of the original array.


This throws away information that might still be needed. It's possible 
that some elements in the original array didn't supply values to the 
result (they lacked the key in question); without the original 
array's keys to provide a mapping, you won't know which ones they were.


If you did this twice on different keys, then as soon as one result 
array came up short, the two sets would no longer be reconcilable.
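
A sketch of the key-preserving behaviour argued for here. pluck_preserving_keys() is a hypothetical userland helper, not the RFC's function; array_column() as shipped in PHP 5.5 and later is shown alongside for contrast.

```php
<?php
// Sketch: pluck_preserving_keys() is a hypothetical name, not the RFC's function.
function pluck_preserving_keys(array $rows, $key)
{
    $result = [];
    foreach ($rows as $index => $row) {
        // Rows lacking the key contribute nothing, but the surviving
        // entries keep their original index.
        if (is_array($row) && array_key_exists($key, $row)) {
            $result[$index] = $row[$key];
        }
    }
    return $result;
}

$rows = [
    10 => ['name' => 'Ann', 'phone' => '555-0100'],
    11 => ['name' => 'Bob'],
    12 => ['name' => 'Cy',  'phone' => '555-0102'],
];

$names  = pluck_preserving_keys($rows, 'name');
$phones = pluck_preserving_keys($rows, 'phone');

// For comparison: the built-in reindexes, so the gap at 11 is invisible.
$flat = array_column($rows, 'phone');
```

With the original keys kept, $names and $phones can still be matched up entry by entry even though one of them came up short.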





Re: Re: [PHP-DEV] [DRAFT] RFC - array_column() function

2012-06-25 Thread Morgan L. Owens

On 2012-06-25 04:19, Ralph Schindler wrote:

The term 'column' makes a lot of sense for PDO working with database
columns, but there is no concept of a 'column' in the array structure or
PHP as a whole, outside of database related functions.


In the case of this addition, I do indeed like array_pluck as well, as
column feels like an overloaded term.

While casting around for other array_* names, I remembered that 
array_keys() can take a second (and third) argument. With it, the 
function returns all keys indexing a given value.


What we're discussing here is basically a dual to that: return all 
values indexed by a given key.[*]


If the duality is continued, then what we get is

array_values(array $input[, mixed $key])

(A third $strict argument would probably do more harm than good: the key 
can only be an integer or a non-numeric string, and using the same type 
coercion rules would smooth out differences in the $key argument.)


[*] Yes, I know the symmetry isn't perfect, and why; the question is 
whether the difference is more likely to result in "Well, duh!" or "WTF?!".
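
The two sides of the duality, side by side. The two-argument array_values() does not exist, so array_column() (PHP 5.5 and later) stands in for it in this sketch; the sample data is made up.

```php
<?php
$input = ['a' => 1, 'b' => 2, 'c' => 1];
// Existing behaviour: all keys that index the value 1.
$keysForValue = array_keys($input, 1);

$rows = [
    ['id' => 3, 'name' => 'Ann'],
    ['id' => 5, 'name' => 'Bob'],
];
// The proposed dual: all values indexed by the key 'name'. array_column()
// stands in here for the hypothetical two-argument array_values().
$valuesForKey = array_column($rows, 'name');
```

The imperfection in the symmetry shows up in the inputs: array_keys() searches one flat array, while the dual has to look inside each element.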





Re: Re: [PHP-DEV] concatenation operator

2012-06-14 Thread Morgan L. Owens

On 2012-06-15 04:00, Ángel González wrote:

On 13/06/12 05:26, Morgan L. Owens wrote:

After reading the performance improvements RFC about interned strings,
and its passing mention of a "special data structure (e.g.
zend_string) instead of char*", I've been thinking a little bit about
this and what such a structure could be.

But rather than interned strings, I thought that _implicit_
concatenation would be a bigger win in the long term. Like interning,
it relies on strings being immutable.

This zend_string is a composite type. Leaves are _almost_ identical to
existing string zvals - char* val, int len - but also an additional
"child_count" field. For leaves, child_count is zero (not incidentally
indicating that it _is_ a leaf). For internal nodes, "val" is a list
of zend_strings (child_count of them). "len" still refers to the total
string length (the sum of the len fields of its children).

So a string that has been built up through concatenation is
represented by a tree (actually a dag) of zend_strings. The edges in
this dag are all properly reference-counted; discarding a string
decrements the reference counts of its children.

How do you list them? As a single-linked list?
That would avoid reuse of the component strings in different
superstrings except from matching ends...

I was thinking just in terms of an array (the composite would be 
pointing either to an array of characters or an array of strings). 
Mainly just because that's how I pictured it (and haven't thought of a 
reason not to, since the number of children is known when the 
concatenated string is created, and fixed due to immutability).


Component strings aren't copied as such, only referenced. In that sense 
the choice of array vs. list comes down to where that reference is kept 
- in the parent string or the elder sibling. Sharing common suffixes 
would save a number of references, but when concatenating two existing 
strings, the list of component references in the _prefix_ would need to 
be copied for the sake of whatever else is using it at the time 
(otherwise they would end up with the concatenated string as well).


Speaking of concatenation, unless potentially scary stuff is done, 
concatenating three strings is done by concatenating two of them, then 
concatenating the result with the third, giving a binary tree; so why am 
I suggesting an array of arbitrary length? Think of an implementation of 
PHP's join()/implode() that exploits this structure.






Re: Re: [PHP-DEV] concatenation operator

2012-06-12 Thread Morgan L. Owens

On 2012-06-08 08:18, Johannes Schlüter wrote:

On Thu, 2012-06-07 at 12:53 -0700, Adi Mutu wrote:

Ok Johannes, thanks for the answer. I'll try to look deeper.
I basically just wanted to know what happens when you concatenate two
strings? what emalloc/efree happens.


This depends. As always. As said what has to be done is one allocation
for the result value ... and then the zval magic, which depends on
refcount, references, ...





So, when having two constant strings there's a single malloc, in this
case allocating 7 bytes (strlen("foo")+strlen("bar")+1), if you have a
different type it has to be converted first ...



After reading the performance improvements RFC about interned strings, 
and its passing mention of a "special data structure (e.g. zend_string) 
instead of char*", I've been thinking a little bit about this and what 
such a structure could be.


But rather than interned strings, I thought that _implicit_ 
concatenation would be a bigger win in the long term. Like interning, it 
relies on strings being immutable.


This zend_string is a composite type. Leaves are _almost_ identical to 
existing string zvals - char* val, int len - but also an additional 
"child_count" field. For leaves, child_count is zero (not incidentally 
indicating that it _is_ a leaf). For internal nodes, "val" is a list of 
zend_strings (child_count of them). "len" still refers to the total 
string length (the sum of the len fields of its children).


So a string that has been built up through concatenation is represented 
by a tree (actually a dag) of zend_strings. The edges in this dag are 
all properly reference-counted; discarding a string decrements the 
reference counts of its children.


Only when the character data is needed for something does it need to be 
allocated for and copied into one place (the internal node can then 
become a leaf).
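
To make the shape of the idea concrete, here is a userland model of it in PHP. It is a sketch under assumptions: LazyString and its method names are hypothetical, and a real implementation would live in C with reference counting handled by the engine.

```php
<?php
// Sketch only: LazyString is a hypothetical userland model of the
// structure described above, not PHP's actual zend_string.
final class LazyString
{
    private $val;          // string for a leaf, array of LazyString for a node
    private $len;          // total length in bytes
    private $childCount;   // zero marks a leaf

    private function __construct($val, $len, $childCount)
    {
        $this->val = $val;
        $this->len = $len;
        $this->childCount = $childCount;
    }

    public static function leaf($chars)
    {
        return new self($chars, strlen($chars), 0);
    }

    // Concatenation copies no character data; it only records references.
    public static function concat(array $parts)
    {
        if (count($parts) === 0) {
            return self::leaf('');
        }
        $len = 0;
        foreach ($parts as $part) {
            $len += $part->len;
        }
        return new self($parts, $len, count($parts));
    }

    public function length()
    {
        return $this->len;
    }

    public function isLeaf()
    {
        return $this->childCount === 0;
    }

    // Flatten on demand; afterwards the node has become a leaf.
    public function chars()
    {
        if (!$this->isLeaf()) {
            $buffer = '';
            foreach ($this->val as $child) {
                $buffer .= $child->chars();
            }
            $this->val = $buffer;
            $this->childCount = 0;
        }
        return $this->val;
    }
}

$foo = LazyString::leaf('foo');
$bar = LazyString::leaf('bar');
$both = LazyString::concat([$foo, $bar, $foo]); // $foo is shared, not copied
$lengthBefore = $both->length();                // known without flattening
$wasLeaf = $both->isLeaf();
$flat = $both->chars();                         // first real copy happens here
```

Taking an array of parts rather than exactly two is deliberate: an n-ary node is what a join()/implode() built on this structure would naturally produce.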






Re: [PHP-DEV] Scheme identification in stream wrappers: (Partial) disambiguation?

2005-04-10 Thread Morgan L. Owens
Wez Furlong wrote:
On Apr 9, 2005 6:07 AM, Morgan L. Owens <[EMAIL PROTECTED]> wrote:
> Really?
> Can you guarantee that your user-space code to sniff out the path is
> going to work 100% of the time on all platforms?
>
Who said user-space? I meant in the implementation of fopen().

> fopen() is implemented in user-space, meaning, not kernel-space.
> Only the kernel knows what logic it will really really use to resolve
> a valid path.
> Emulating that code for each supported platform on which PHP runs is
> plain stupid.
 
I still don't see why emulation code would be necessary. What happens
when fopen() is passed an invalid file path now? It's obviously capable
of coping with failure. What is so unimplementable with it coping with
its initial failure by going "Drat. Okay, might it be a URL? Looks like
it has a scheme... let's try it". If you insist, you could wrap that in
#ifdefs so that the code is only compiled on platforms where 'foo:'
might validly appear at the start of a file path.

if(the string might be a URL)
{
#ifdef FILEPATHS_CAN_START_WITH_SCHEMELIKE_SEQUENCE
    if(attempt to open as local file succeeds)
        return handle
    else
#endif
    if(scheme is registered && attempt to open as stream succeeds)
        return handle
    else
        return fail
}
else
{
    if(attempt to open as local file succeeds)
        return handle
    else
        return fail
}

> I suggest that you go and re-read RFC 1738, section 3.1, Common
> Internet Scheme Syntax.  We support that, because the original
> wrappers implementation was solely for "URL schemes that involve the
> direct use of an IP-based protocol to a specified host on the
> Internet".
...
> In other words, there is no defined mapping to a streaming data
> source, which is what the wrappers layer in PHP is built for.

Sorry, I didn't see any mention of that in the documentation. Just saw
"URL" and assumed that it referred to RFC 2396, which IIRC would have 
been the current standard at the time. Nothing about direct use of an 
IP-based protocol to a specified host (like a local variable or a 
gzip'ed file?). If I could search back through the development 
discussion list over the relevant time period I'd've found this?

In short, what you're talking about supporting is (what is now) Section 
3.2 of RFC 3986. Thanks for clearing that up.

> Do you think that by trying to insult PHP you'll motivate the
> developers to try and change it to the way you think it should work?

Just characterising the sort of response I can imagine people not
married to PHP making when they first run into this. Hey, none of this 
is the result of any decision of mine. So I wasn't aware of which 
interpretation of "URL" you had chosen to use. Now that you've gotten 
around to saying which (to me, if not to those other users I mentioned), 
I've got my answer.

MLO


Re: [PHP-DEV] Scheme identification in stream wrappers: (Partial) disambiguation?

2005-04-09 Thread Morgan L. Owens
Wez Furlong wrote:
>
> Really?
> Can you guarantee that your user-space code to sniff out the path is
> going to work 100% of the time on all platforms?
>
Who said user-space? I meant in the implementation of fopen().
>
> It's not just 1 letter.  There are 3 letter special device names too.
>
Natch; yeah. You still can't create a directory named "aux". Or "com1".
>
> The file:// protocol is a load of rubbish, because it neglects to
> specify how remote file access should work.
>
It explicitly states as much.
"The file URL scheme is unusual in that it does not specify an Internet
protocol or access method for such files; as such, its utility in
network protocols between hosts is limited."
>
> Is there a clear win for PHP, that outweighs the strong risk of
> breaking PHP until all the edge cases have been resolved?
>
Probably not. It would have been easier to fix it earlier on before
kludging file paths and URLs together in the same namespace without
thinking through the consequences, but that opportunity is long lost. I
guess it will just have to be another case where we have to say "Screw
the standards. They're too difficult to implement." Ah well, it's only a
couple of characters. tel://+1-816-555-1212 it is, then. And explain to
users _why_ some URLs need to be massaged (in user space) before PHP can
recognise them. It's not a bug, it's a feature.
MLO


Re: [PHP-DEV] Scheme identification in stream wrappers: (Partial) disambiguation?

2005-04-06 Thread Morgan L. Owens
If there's something that looks like a scheme (i.e., a well-formed sequence
of characters followed by ':'),
    see if it's registered;
    if it is,
        the appropriate wrapper should be used.
    Otherwise,
        on platforms where ':' has significance,
            try it again as a file path.
Otherwise, it fails due to an absent stream wrapper.

This limits problems to users who are trying to access a child directory of
the current path which happens to have the same name as a registered scheme.
The problems will consist of the stream failing because the URL it's receiving
is bogus. People in such a situation can use the file: scheme explicitly to
disambiguate (assuming they can't have a directory whose name starts with a
double slash!).

Alternately,
    on problem platforms,
        if the string is ambiguous,
            see if it is well-formed as a file path.
            If it is,
                try it as such.
            If it's not, or it fails,
                see if it starts with a registered scheme name and if so,
                    try that.
It would be easier to check if a string is a well-formed file path than it is
to check if it's a valid URL according to some arbitrary scheme (impossible in
general).

Assuming no-one tries to register a one-letter scheme, the Windows build can
get away with seeing if the "scheme" is only one letter long, and if it is,
assume that it's a drive letter.
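
The one-letter rule is simple enough to sketch in userland. split_scheme() is a hypothetical helper written for illustration, not a proposal for the actual C implementation; the scheme pattern follows RFC 3986.

```php
<?php
// Sketch: split_scheme() and its one-letter rule are illustrative only.
// Returns [scheme, rest], or [null, $name] when $name is treated as a path.
function split_scheme($name, $driveLetters = true)
{
    // RFC 3986 scheme: a letter, then letters, digits, '+', '-' or '.'.
    if (!preg_match('/^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$/s', $name, $m)) {
        return [null, $name];
    }
    if ($driveLetters && strlen($m[1]) === 1) {
        return [null, $name];   // "C:\..." is a drive letter, not a scheme
    }
    return [strtolower($m[1]), $m[2]];
}

$mail  = split_scheme('mailto:someone@example.com');
$drive = split_scheme('C:\\temp\\a.txt');
$var   = split_scheme('var://myvar');
$plain = split_scheme('plain/file.txt');
```

Whether the candidate scheme is actually registered would still have to be checked afterwards; this only covers telling a drive letter from a scheme.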

I dunno; Windows users use '/' as the directory separator in file:// URLs,
since it's supposed to be up to the application to map a URL to the actual
resource ('\' is frequently tolerated however) in whatever platform-specific
manner is appropriate; does the same hold for platforms that use ':' as the
directory separator in their file: URLs? Currently the standard for file URLs
is still RFC1738; this is supposed to be updated at some stage, however. It
explicitly gives the VMS example of mapping
    DISK$USER:[MY.NOTES]NOTE123456.TXT
on vms.host.edu to the URL
    file://vms.host.edu/disk$user/my/notes/note12345.txt
and noting that to refer to the local machine the host part can be either
'localhost' or the empty string.



[PHP-DEV] Scheme identification in stream wrappers

2005-04-05 Thread Morgan L. Owens
It was suggested I post this here.
In PHP, the character sequence "://" separates the protocol name from 
the protocol-specific part of a stream name. Clearly, the intention is 
that these stream names are URLs (i.e., URIs that actually provide a 
location for the identified resource). However, the URI specification 
(RFC 3986) states that the scheme delimiter is merely ":", and that 
"://" is only applicable for _some_ URI formats. There are URLs in 
common use that do not use "://" (e.g., mailto:), and in fact support 
for "zlib:" is a hardwired exception in the present code.

May I propose that the parser which parses out the scheme from the rest 
of the URL look only for the initial ":" in the stream name, rather than 
"://".

Existing uses of stream wrappers will continue to function, since the 
name of the scheme won't actually change, and it's the wrapper author's 
responsibility to parse the rest of the URL anyway; but it will become 
possible to correctly write, e.g., "mailto:[EMAIL PROTECTED]" instead of 
"mailto://[EMAIL PROTECTED]", or (to use the example used in the manual 
to describe stream_wrapper_register()) "var:myvar" instead of "var://myvar".

This would also make use of parse_url() more consistent as, for example, 
parse_url('var:myvar') will put the name of myvar into the path element 
of the returned array, instead of mistakenly putting it in the host element.
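
The parse_url() difference can be seen directly. This is a quick check rather than a specification; the results noted in the closing remark are what I would expect from current builds, and someone@example.com is a placeholder address.

```php
<?php
// Same scheme and name, with and without the "//" after the colon.
$short = parse_url('var:myvar');
$long  = parse_url('var://myvar');

// A scheme that never uses "//" in the first place.
$mail  = parse_url('mailto:someone@example.com');
```

In the "var:myvar" form the name lands in the path element; with "var://myvar" it is reported as a host, which it is not.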

MLO