Hi Marc

On 29/09/2023 09:39, Marc Bennewitz wrote:
> Hi Niels,
> 
> On 29.09.23 09:07, Niels Dossche wrote:
>> Hi internals
>>
>> Discussion seems to have died down.
>> Today, it's been 14 days since the last major change was done to the RFC 
>> (i.e. the class hierarchy update).
>> And it's also been close to 4 weeks since I first announced the RFC on 
>> the mailing list.
>> I'd like to start the vote on Monday (20:00 GMT+2) and I intend to let it 
>> run for 2 weeks.
>> Any final complaints should be raised now.
> 
> Not much to complain about, but a question - not sure if it was discussed before.
> 
> Naming: `XMLDocument::fromEmpty` vs. `HTMLDocument::createEmpty` in the PHP 
> code section.

Oops. Well spotted! This should be createEmpty everywhere.
I just checked, and I accidentally used fromEmpty only in that class definition.
I have now fixed this in the RFC text.
This happened when updating the method names; the emails from back then do 
refer to the right name though.

> 
> For both, `XMLDocument::fromEmpty` and `HTMLDocument::createEmpty` there is 
> an argument available to define the encoding but none of the other 
> `createFrom*` methods have this argument.
> 
> As far as I understand, in these other cases the encoding gets detected 
> from the content of the passed source, but what happens if the source does not 
> contain any information about the encoding? E.g. you load an XML/HTML 
> document over HTTP, the encoding is defined via HTTP header but the content 
> itself doesn't contain it.
> 

Right, we follow the HTML spec in this regard. Roughly speaking, we determine 
the charset in the following order of priority.
If one option fails, we fall through to the next one.
1. The Content-Type HTTP header of the response the document was loaded from.
2. BOM sniffing in the content, i.e. UTF-8 with BOM and UTF-16 LE/BE prepend 
the content with a byte order mark, which is used to detect the encoding.
3. A meta tag in the content.

If it could not be determined at all, UTF-8 will be assumed as it's the default 
in HTML.
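
To illustrate the fall-through order described above, here is a rough sketch 
in Python. It is not the actual implementation and not the full algorithm from 
the HTML spec: the function name `detect_charset`, the regular expressions, and 
the 1024-byte scan window for the meta tag are assumptions made purely for 
illustration.

```python
import re
from typing import Optional

# Byte order marks checked in step 2 (illustrative subset).
BOMS = (
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xff\xfe", "UTF-16LE"),
    (b"\xfe\xff", "UTF-16BE"),
)

# Hypothetical, simplified patterns; real parsers are stricter than this.
CHARSET_PARAM = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)
META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def detect_charset(content: bytes, content_type: Optional[str] = None) -> str:
    """Return a charset label using the priority order from the message."""
    # 1. charset parameter of the Content-Type HTTP header, if present.
    if content_type:
        match = CHARSET_PARAM.search(content_type)
        if match:
            return match.group(1).upper()

    # 2. BOM sniffing at the very start of the content.
    for bom, name in BOMS:
        if content.startswith(bom):
            return name

    # 3. Meta tag near the start of the content (window size is an assumption).
    match = META_CHARSET.search(content[:1024])
    if match:
        return match.group(1).decode("ascii").upper()

    # Nothing found: assume UTF-8.
    return "UTF-8"
```

For example, `detect_charset(b"<p>hi</p>", "text/html; charset=ISO-8859-1")` 
takes the charset from the header, while calling it with no header, no BOM, and 
no meta tag falls through all three steps and ends at the UTF-8 default.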

>> Kind regards
>> Niels
>>
> Best,
> Marc

Kind regards
Niels

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php