Re: The HTML module design

Bo Yang Thu, 13 Aug 2009 01:19:23 -0700

On Wed, Aug 12, 2009 at 10:11 PM, John-Mark Bell<[email protected]> 
wrote:
> On Mon, 2009-08-10 at 11:33 +0800, Bo Yang wrote:
>> On Mon, Aug 10, 2009 at 4:06 AM, John-Mark Bell<[email protected]> 
>> wrote:
>> > On Sun, 2009-08-09 at 22:02 +0800, Bo Yang wrote:
>> >> 1. Change the parser wrapper structure now. HTMLDocument has methods
>> >> such as HTMLDocument.open and HTMLDocument.write, which write a
>> >> string into the DOM. This requires the HTMLDocument to know its
>> >> parser. But in the current structure of libDOM, the parser is created
>> >> before the Document, and it is the parser that creates the document.
>> >> I think this should be changed.
>> >>
>> >>     The HTMLDocument will be created first and will hold a parser
>> >> within it. The client of HTMLDocument is responsible for passing the
>> >> corresponding parser to HTMLDocument. This means we will have a
>> >> function like:
>> >>
>> >> dom_html_document_create(dom_alloc, void *, lwc_context *, parser *, 
>> >> ....);
>> >>
>> >> In the future, NetSurf will create a parser according to the HTTP
>> >> response header's content type (text/html to create a hubbub parser,
>> >> and text/xml to create a libxml parser), and pass it to the
>> >> HTMLDocument to create an instance of it. Then the loading and
>> >> parsing start.
>> >
>> > Please flesh this proposal out more, with specific APIs etc. Then we'll
>> > have more idea if it's sane.
>>
>> The main API:
>>
>> 1. Change the parser wrappers' API. The current API is:
>> parser_create
>> parser_destroy
>> parser_parse_chunk
>> parser_complete
>> parser_get_document
>>
>> It should be changed to:
>> parser_create(const char *aliases,  const char *enc, bool fix_enc,
>> dom_alloc alloc, void *pw, dom_msg msg, void *mctx, struct
>> lwc_context_s *ctx, struct dom_document *doc);
>>
>> parser_destroy
>> parser_parse_chunk
>> parser_complete
>>
>>
>> That is, remove parser_get_document and add a new parameter to
>> parser_create to pass the document in.
>
> That sounds OK, _providing_ that the correct kind of Document is
> created. Now, we can either have it so that the Document object supports
> all type-specific methods or require that the correct kind of Document
> is created.
>
> Consider the case of mixed DOMs (e.g. XHTML+SVG), where the Document
> object does not just have methods for one particular vocabulary.
>
>> 2. The dom_html_document is like:
>>
>> struct dom_html_document {
>>     struct dom_document base;
>>     dom_hubbub_parser *hp;
>>     dom_xml_parser  *xp;
>>     ....
>> };
>
> Why does the document need a handle for the parser? You can stick the
> parser pointers into a union as there'll only ever be one of them.


Yeah, a union is better. Thanks.
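
A minimal sketch of that union, assuming a parser_type tag is stored next to
it so the document knows which member is valid. The names
html_document_sketch, sketch_set_hubbub and sketch_get_hubbub are
hypothetical and only for illustration; they are not libdom API, and the real
struct would also embed struct dom_document as its base.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque stand-ins for the real parser wrapper types. */
typedef struct dom_hubbub_parser dom_hubbub_parser;
typedef struct dom_xml_parser dom_xml_parser;

typedef enum {
    DOM_HTML,
    DOM_XML
} parser_type;

/* Hypothetical layout: only the parser handle part of the document. */
struct html_document_sketch {
    parser_type type;              /* says which union member is valid */
    union {
        dom_hubbub_parser *hp;     /* valid when type == DOM_HTML */
        dom_xml_parser *xp;        /* valid when type == DOM_XML */
    } parser;
};

/* Attach a hubbub parser and record that the HTML member is in use. */
static void sketch_set_hubbub(struct html_document_sketch *doc,
                              dom_hubbub_parser *hp)
{
    doc->type = DOM_HTML;
    doc->parser.hp = hp;
}

/* Read the hubbub parser back; asserts that the tag matches. */
static dom_hubbub_parser *sketch_get_hubbub(const struct html_document_sketch *doc)
{
    assert(doc->type == DOM_HTML);
    return doc->parser.hp;
}
```

Since only one parser ever exists per document, the union saves a pointer, and
the tag keeps a reader from touching the wrong member.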

>> typedef enum {
>>     DOM_HTML,
>>     DOM_XML
>> } parser_type;
>>
>> And we provide corresponding API to dom_html_document.
>>
>> /* Create the HTMLDocument */
>> dom_exception dom_html_document_create(dom_alloc, void *, dom_msg,
>> void *, lwc_context *, parser_type, ui_handler);
>>
>> /* Parse data chunk */
>> dom_exception dom_html_document_write_data(uint8_t *, size_t);
>>
>> /* Tell the document that the data is complete */
>> dom_exception dom_html_document_data_complete(void);
>
> This should be ok.
>
>> 3. Bootstrap consideration.
>> The Core spec 
>> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Level-2-Core-DOM-createDocument
>> says that a specific Document such as HTMLDocument can be created using
>> an API like createHTMLDocument, but I found no such API definition in
>> HTML Level 2 at all. I think we can just ignore it.
>
> The reason for this is so that the correct type of Document object is
> created. What the specification calls createHTMLDocument, you've called
> dom_html_document_create.
>

I will start a branch for this. Thanks!
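
As a rough sketch of the flow discussed above (pick a parser type from the
Content-Type, create the document, feed it chunks, then mark the data
complete), something like the following could work. Everything prefixed with
sketch_ is a hypothetical stand-in, not the real API: the stub document only
counts bytes instead of driving a parser, and it takes an explicit document
argument, which the prototypes quoted above do not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { DOM_HTML, DOM_XML } parser_type;

/* Hypothetical stand-in for dom_html_document; it only counts bytes. */
struct sketch_document {
    parser_type type;
    size_t bytes_seen;
    bool complete;
};

/* Pick a parser type from a Content-Type value. This is a plain prefix
 * match, so parameters such as "; charset=utf-8" are ignored.
 * Returns false for types this sketch does not handle. */
static bool sketch_parser_type_for(const char *content_type, parser_type *out)
{
    if (strncmp(content_type, "text/html", 9) == 0) {
        *out = DOM_HTML;
        return true;
    }
    if (strncmp(content_type, "text/xml", 8) == 0) {
        *out = DOM_XML;
        return true;
    }
    return false;
}

/* Create the document first; it owns the choice of parser. */
static void sketch_document_create(struct sketch_document *doc, parser_type type)
{
    doc->type = type;
    doc->bytes_seen = 0;
    doc->complete = false;
}

/* Feed one chunk; a real implementation would pass it to the parser. */
static void sketch_document_write_data(struct sketch_document *doc,
                                       const uint8_t *data, size_t len)
{
    assert(!doc->complete);   /* no more data after completion */
    (void)data;               /* the stub does not inspect the bytes */
    doc->bytes_seen += len;
}

/* Signal that no more data will arrive. */
static void sketch_document_data_complete(struct sketch_document *doc)
{
    doc->complete = true;
}
```

The point of the ordering is that the client never touches the parser
directly; it only talks to the document.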

Regards!
Bo
