Any answer on the other question regarding what the expected outcome of a call like below would be?

Currently we're throwing a JS exception<https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/clipboard/clipboard_promise.cc;drc=c5ac981ddffb22c613baf38bf69f3554f51894d0;l=248> if the unsanitized list contains a format other than `text/html`. In theory we could also add other built-in formats in the future where sanitization is needed by default on read(), but unsanitized content is returned if the author explicitly opts into it. E.g., for `image/svg+xml`, we could return sanitized content by default<https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/clipboard/clipboard_writer.cc;drc=f5bdc89c7395ed24f1b8d196a3bdd6232d5bf771;l=225>, where styles would be inlined and <meta> tags would be stripped out by the sanitizer; authors who want unsanitized content could then explicitly opt in by adding this format to the unsanitized list.
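To make the behavior above concrete, here is a minimal sketch of how a page might guard against that exception by filtering the list before calling read(). The helper names (`partitionUnsanitized`, `readWithUnsanitized`) and the `SUPPORTED_UNSANITIZED` set are hypothetical and only for illustration. The sketch assumes `text/html` is the only accepted entry, as described above, and takes the clipboard object as a parameter so it can be tried outside a browser.

```javascript
// Assumption: per the thread, "text/html" is the only format currently
// accepted in the `unsanitized` list; anything else makes read() throw.
const SUPPORTED_UNSANITIZED = new Set(["text/html"]);

// Hypothetical helper: split a requested list into formats the browser is
// expected to accept and formats that would trigger the exception.
function partitionUnsanitized(requested) {
  const accepted = [];
  const rejected = [];
  for (const format of requested) {
    (SUPPORTED_UNSANITIZED.has(format) ? accepted : rejected).push(format);
  }
  return { accepted, rejected };
}

// Hypothetical wrapper: `clipboard` is expected to behave like
// navigator.clipboard. Unsupported formats are dropped with a warning
// instead of being passed through to read().
async function readWithUnsanitized(clipboard, requested) {
  const { accepted, rejected } = partitionUnsanitized(requested);
  if (rejected.length > 0) {
    console.warn(`Ignoring unsupported unsanitized formats: ${rejected.join(", ")}`);
  }
  if (accepted.length === 0) {
    return clipboard.read(); // default, sanitized read
  }
  return clipboard.read({ unsanitized: accepted });
}
```

In a page this would be called as `readWithUnsanitized(navigator.clipboard, ["hahaha/lol", "text/html", "application/json", "text/plain"])`, which would forward only `text/html`. Whether silently dropping entries is preferable to surfacing the exception is a design choice for the site.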
Probably you could even remove the "hello" in `<div id="logDiv">hello</div>` so the DIV is entirely empty to avoid any and all misunderstandings.

Done.

________________________________
From: Thomas Steiner <[email protected]>
Sent: Monday, October 9, 2023 8:56 AM
To: Anupam Snigdha <[email protected]>
Cc: [email protected] <[email protected]>; Sanket Joshi (EDGE) <[email protected]>; Evan Stade <[email protected]>; [email protected] <[email protected]>; Ana Sollano Kim <[email protected]>
Subject: [EXTERNAL] Re: [blink-dev] Intent to Ship: Async Clipboard API: Read unsanitized HTML and write well-formed HTML format.

As the author of the web custom formats article<https://developer.chrome.com/blog/web-custom-formats-for-the-async-clipboard-api/>, just for me to better understand: the problem is that the clipboard gets populated with `text/html` by random (web or native) apps. If the clipboard were populated from the start with `web text/html`, the contents could be read unsanitized, even without this new parameter. So this new parameter is the escape hatch that developers can use via `navigator.clipboard.read({unsanitized: ["text/html"]})`.

So, the problem is that, for sites like Excel Online, they aren't sure where the user is going to paste, so they always have to produce both 'web text/html' and 'text/html'. That way if an app doesn't have support for web custom format, then they can use the native HTML format. Same thing for native apps that produce a web custom format. There are also legacy native apps (old Office versions that are used by Enterprises) that don't have support for the new web custom format, so the site has to produce the standard HTML format for those apps as well. But you are right that if both source and target apps support web custom format, then it can be used to access unsanitized HTML content.

Crystal-clear now, thanks for confirming my theory.
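As a rough illustration of the exchange above, a reading site could prefer the web custom format when the source provided it and fall back to the standard HTML format otherwise. This is only a sketch: `pickHtmlFormat` and `readRichHtml` are hypothetical names, the clipboard object is passed in so the logic can be exercised with a stand-in, and it assumes the items behave like `ClipboardItem` (a `types` list and a `getType()` that resolves to a Blob).

```javascript
const WEB_HTML = "web text/html"; // web custom format, as named in the thread
const NATIVE_HTML = "text/html";  // standard HTML format

// Hypothetical helper: choose which representation to read from one item.
// Prefers the web custom format, then the standard HTML format.
function pickHtmlFormat(types) {
  if (types.includes(WEB_HTML)) return WEB_HTML;
  if (types.includes(NATIVE_HTML)) return NATIVE_HTML;
  return null;
}

// Hypothetical reader: `clipboard` is expected to behave like
// navigator.clipboard. The `unsanitized` option covers the case where only
// the standard format is present (e.g. content copied from a legacy app).
async function readRichHtml(clipboard) {
  const items = await clipboard.read({ unsanitized: [NATIVE_HTML] });
  for (const item of items) {
    const format = pickHtmlFormat(item.types);
    if (format !== null) {
      const blob = await item.getType(format);
      return { format, html: await blob.text() };
    }
  }
  return null; // no HTML representation on the clipboard
}
```

A site in the Excel Online situation would still write both formats on copy; the sketch only covers the paste side.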
An immediate question that I ask myself is whether this mechanism could be expanded to other values than just `"text/html"`.

Currently we are focusing on the standard HTML format to better align with the DataTransfer APIs. In theory you could add support for other built-in formats as well, but the main intent here is to produce similar fidelity of HTML format so sites that use DataTransfer APIs to read HTML do not experience any regression when they move over to async clipboard API for copy-paste operations.

Here is a document where I described the regressions and impact on the apps when sanitization is performed: https://docs.google.com/document/d/1nLny6t3w0u9yxEzusgFJSj-D6DZmDIAzkr1DdgWcZXA/edit?usp=sharing

Some native apps that I surveyed for impact of this new proposal: https://docs.google.com/document/d/1O2vtCS23nB_6aJy7_xcdaWKw7TtqYm0fERzEjtLyv5M/edit?usp=sharing

I well understand the need for HTML. I'm just looking at this with the eyes of a developer new to this who might ask themselves whether they can just put something else there. It's a generic-sounding option, "unsanitized", but it is hardcoded to just "text/html", as per the spec. Maybe it could be renamed to something very specific like "unsanitizedHTML" and accept a boolean?

Any answer on the other question regarding what the expected outcome of a call like below would be?

`navigator.clipboard.read({unsanitized: ["hahaha/lol", "text/html", "application/json", "text/plain"]})`

FWIW, this demo was initially a bit misleading, since I expected "some text" to be on the clipboard, or whatever I put into the `contenteditable` box, but it's hardcoded. Maybe remove the box.

Oops, sorry about that. Copy-paste error 🙂 I fixed it now.

Probably you could even remove the "hello" in `<div id="logDiv">hello</div>` so the DIV is entirely empty to avoid any and all misunderstandings.
