>Hi list,
>
>I'm considering writing an RFC to add a 3rd parameter to fgets which
>accepts a user-defined function. If we had this today we wouldn't need
>fgetcsv, and we would get fgetcsv-style support for other data
>packaging formats that would otherwise each require their own one-off
>function. For example, if we decided to support reading JSON from
>files in the same manner as our current fgetcsv functionality, we
>would create an fgetjson function.
>
>This change unifies the way we support native conversion of data
>packaging formats from files into PHP data structures behind a single
>interface. The other major design benefit, from my point of view, is
>that userland conversion functions/libraries can follow the same
>modality as our native support for these kinds of use cases. I
>believe this will ultimately result in more intuitive userland code
>around this type of functionality.

It's an interesting idea, but I can't immediately picture how it would work - 
what would the callback be given, and what would it return? Would it somehow be 
able to manipulate the number of characters read from the stream?

For any variant of CSV, reading a line at a time is what you want anyway, and 
you can easily build an Iterator which post-processes each line as it is read, 
giving the memory efficiency of fgetcsv() but much more flexibility.
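
A generator makes that pattern quite compact, for example (the names
here are only illustrative, and a generator stands in for a full
Iterator class):

// Lazily read a file line by line and post-process each line with an
// arbitrary callable; memory usage stays flat regardless of file size.
function lines_with($path, callable $decode)
{
    $handle = fopen($path, 'r');
    if ($handle === FALSE) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== FALSE) {
            yield $decode($line);
        }
    } finally {
        fclose($handle);
    }
}

// fgetcsv()-style usage, but the per-line processing is pluggable:
foreach (lines_with('data.csv', 'str_getcsv') as $row) {
    // ...other stuff...
}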

For JSON, newlines aren't the delimiter you want, but with nested structures, 
I'm not sure how you'd parse a partial structure anyway. Are there JSON 
equivalents of SAX (event-based) parsers?

The callback would be given the string as returned by fgets today, and 
whatever the callback returns would be handed back to the caller in 
place of that string. The functional equivalent of an fgetjson today is 
handled by something like
$handle = fopen('some_file', 'r');
while (($data = fgets($handle)) !== FALSE) {
    $data = json_decode($data, true);
    // ...other stuff...
}
and would change to
$handle = fopen('some_file', 'r');
$decode = function ($line) { return json_decode($line, true); };
while (($data = fgets($handle, 0, $decode)) !== FALSE) {
    // ...other stuff...
}
The fgetcsv equivalent would be
$handle = fopen('some_file', 'r');
$decode = function ($line) { return str_getcsv($line /* ...options... */); };
while (($data = fgets($handle, 0, $decode)) !== FALSE) {
    // ...other stuff...
}
Userland benefits from having an API that promotes consistency through 
a flexible interface:
$handle = fopen('some_file', 'r');
$decode = function ($line) { /* ...do stuff... */ return $bar; };
while (($data = fgets($handle, 0, $decode)) !== FALSE) {
    // ...other stuff...
}
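
To make the proposed semantics concrete, a rough userland shim could 
look like this (fgets_with is only an illustrative name; the actual 
proposal is to extend fgets itself):

// Sketch only: emulate the proposed 3rd parameter on top of today's fgets().
function fgets_with($handle, $length = 0, $decode = null)
{
    $line = $length > 0 ? fgets($handle, $length) : fgets($handle);
    if ($line === FALSE || $decode === null) {
        return $line;
    }
    return $decode($line);
}

// Any callable works, including built-ins:
$handle = fopen('some_file', 'r');
while (($row = fgets_with($handle, 0, 'str_getcsv')) !== FALSE) {
    // ...other stuff...
}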

On a side note, small JSON data packages, delimited by newlines and 
stored on cheap disk, are an increasingly popular (in my circles) way 
to store raw data that may be subject to later scrutiny or processing. 
In those cases, parsing the JSON just like a CSV file and converting it 
into a native format is necessary. I've found JSON packages in files to 
have huge advantages over storing the equivalent data and relationships 
in CSV format. That said, JSON isn't the be-all and end-all; we need to 
anticipate this model evolving into any existing or future data format. 
Providing a clean and consistent way to handle all of the existing and 
yet-to-be-determined use cases around fgets and different data 
packaging formats is the primary purpose of my proposal.
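
For context, that storage pattern looks roughly like this (the file 
name and record shape are made up):

// Append one record per line ("JSON lines" style).
$record = array('ts' => time(), 'event' => 'signup', 'user_id' => 42);
file_put_contents('events.log', json_encode($record) . "\n", FILE_APPEND);

// Later, stream the records back one at a time.
$handle = fopen('events.log', 'r');
while (($line = fgets($handle)) !== FALSE) {
    $record = json_decode($line, true);
    // ...later scrutiny or processing...
}
fclose($handle);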


