On Tue, Oct 31, 2017, at 05:46 AM, Henri Sivonen wrote:
> (Context: I'm trying to understand the requirements for our
> serializers in case we rewrite them [in Rust].)
> 
> The HTML fragment parsing algorithm can have only one context node.
> The context is never a chain of nodes toward the root, since such
> a thing wouldn't affect the result per the HTML parsing algorithm.
> 
> However, when the HTML parsing algorithm is in the non-fragment mode,
> some tags get ignored without an appropriate parent, so e.g. to represent
> <td> in the non-fragment mode, you need to include <table>, etc. But
> that's about it.
> 
> The Windows CF_HTML clipboard format,
> https://msdn.microsoft.com/en-us/library/windows/desktop/ms649015(v=vs.85).aspx
> , represents fragments by designating them in a full HTML document, so
> what are logically fragments have to work with non-fragment parsing.
> 
> This indicates that when we export a fragment to the clipboard, we
> should serialize its parent if not table-related or reconstruct a full
> table if table-related.
> 
> Yet, it seems that we serialize much more ancestor context.
> 
> Is there a good reason to? For example, does Microsoft Office (our old
> bugs suggest that Excel is the pickiest consumer) or other CF_HTML
> consumers on Windows care about more context than the standard HTML
> parsing algorithm? What could consumers possibly do with knowledge
> about ancestors beyond parent or the nearest <table>? (I'm ignoring
> SVG and MathML for the moment.)
> 
> OTOH, it seems that we include only some element types in the context
> (https://searchfox.org/mozilla-central/source/dom/base/nsDocumentEncoder.cpp#1540).
> It's unclear to me why. The first revision of the list came from jst
> during the Netscape 6 crunch without an explanation either in Bugzilla
> or code comments. (https://bugzilla.mozilla.org/show_bug.cgi?id=50742)
> 
> Does anyone know why?

I don't know exactly why, but I did try to fix pasting table cells into
Excel a long time ago (someone else eventually fixed it), and it was
definitely tricky and underspecified:
https://bugzilla.mozilla.org/show_bug.cgi?id=137450

Comments on the bug indicate that there are non-table cases where the
context is important, like `<ol><li>` to ensure you wind up pasting
numbered list items.
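
To make the `<ol><li>` case concrete, here is a minimal Python sketch of
how a copied fragment might be wrapped in its parent context inside a
CF_HTML payload. It assumes the header layout described on the MSDN page
linked above (fixed-width byte offsets plus StartFragment/EndFragment
comment markers). `build_cf_html` and its parameters are hypothetical
names for illustration, not Gecko code, and this is not how
nsDocumentEncoder chooses its context.

```python
def build_cf_html(fragment: str, context_open: str = "", context_close: str = "") -> bytes:
    """Wrap `fragment` in a full HTML document and prepend a CF_HTML header.

    `context_open` / `context_close` are the ancestor tags to keep around
    the fragment (e.g. "<ol>" and "</ol>"). Hypothetical helper.
    """
    # Offsets are zero-padded to a fixed width so the header length is
    # known before the real values are filled in.
    header_template = (
        "Version:0.9\r\n"
        "StartHTML:{:010d}\r\n"
        "EndHTML:{:010d}\r\n"
        "StartFragment:{:010d}\r\n"
        "EndFragment:{:010d}\r\n"
    )
    prefix = ("<html><body>" + context_open + "<!--StartFragment-->").encode("utf-8")
    body = fragment.encode("utf-8")
    suffix = ("<!--EndFragment-->" + context_close + "</body></html>").encode("utf-8")

    # All offsets are byte offsets from the start of the payload.
    header_len = len(header_template.format(0, 0, 0, 0).encode("ascii"))
    start_html = header_len
    start_fragment = start_html + len(prefix)
    end_fragment = start_fragment + len(body)
    end_html = end_fragment + len(suffix)

    header = header_template.format(start_html, end_html, start_fragment, end_fragment)
    return header.encode("ascii") + prefix + body + suffix


# Copying one list item while keeping its <ol> parent as context.
payload = build_cf_html("<li>two</li>", context_open="<ol>", context_close="</ol>")
```

With the `<ol>` kept as context, a consumer that parses the whole document
in non-fragment mode still sees the item inside an ordered list; without
it, the `<li>` would arrive with no indication of the list type.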

-Ted
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
