On Tue, 03 May 2011 07:10:10 +0900, João Eiras <joao.ei...@gmail.com>
wrote:
event.clipboardData.getDocumentFragment()
which would return a parsed and when applicable sanitized view of any
markup the implementation supports from the clipboard.
This is already covered by doing
x=createElement;x.innerHTML=foo;traverse x
Of course it is. The point was simply to see if there was interest in
possibly optimising away an extra serialize->parse roundtrip, if
developers feel it would be more convenient to get the DOM right away
rather than the markup.
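
To make the roundtrip concrete, here is a minimal sketch of what a script has to do today to get a DOM out of pasted markup. The helper name fragmentFromPaste and its doc parameter are made up for illustration; getDocumentFragment() itself is only the proposal under discussion, not an existing API.

```javascript
// Sketch only: fragmentFromPaste is a hypothetical helper, not part of any spec.
// `event` is a paste event; `doc` is the document used to create nodes.
function fragmentFromPaste(event, doc) {
  // The implementation has already serialized the clipboard content to markup...
  const markup = event.clipboardData.getData('text/html');

  // ...and the script immediately parses it again.
  const container = doc.createElement('div');
  container.innerHTML = markup;

  // Move the parsed nodes into a fragment so the caller can traverse them.
  const fragment = doc.createDocumentFragment();
  while (container.firstChild) {
    fragment.appendChild(container.firstChild);
  }
  return fragment;
}
```

In a paste listener this would be called as fragmentFromPaste(event, document). A native method could presumably hand back the same fragment without the intermediate string.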
Regarding simplifying the pasted html to remove stuff that could be
malicious, consider a rogue app that injects a script in the clipboard
and expects the user to hit paste on his bank site.
Well, I've never seen a bank site with a rich text editor /
contentEditable-based feature customers are meant to use ;-)
Rogue scripts, and social engineering to get users to paste them into a comment
field on Facebook, are still a threat to worry about. If the implementation knows
that the content originates from another website it should definitely be
sanitized. I don't think it adds much security to sanitise content from a
local application though - an application running locally already has
quite a lot of possibilities, for example to tell the browser to launch a
javascript: URL directly or go into the DOM to modify things through the
browser's accessibility APIs. Using social engineering to make the user
paste something would be an awkward way to try to launch an exploit, no?
There is little the user agent can do except provide quick and easy
methods to sanitize this. There is already the toStaticHTML API that IE
implements.
I'm planning to *not* leave sanitization to the script author, but have it
as a default and (currently) non-overridable mode for cross-origin HTML
paste. So the user-agent will do it all behind the scenes before the
script event gets to see a single tag of the markup.
I would suggest supporting and implementing it. Or even add a sister
property of innerHTML, innerStaticHTML, which would not return scripts or
event handlers on reading, and would parse those out when setting.
That sounds like a good idea, but would be best followed up in a separate
context.
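
For reference, a rough sketch of the kind of filtering described for innerStaticHTML: dropping script elements and event handler attributes from an already parsed tree. The function name stripActiveContent is hypothetical, and this is deliberately incomplete (it does not touch javascript: URLs, for instance), so it should not be read as what a user agent would actually do.

```javascript
// Sketch only: stripActiveContent is a hypothetical helper, not a real API,
// and it is NOT a complete sanitizer.
// `root` is an element or document fragment that has already been parsed.
function stripActiveContent(root) {
  // Copy the child list first, since removing nodes changes a live collection.
  for (const el of Array.from(root.children)) {
    if (el.tagName.toLowerCase() === 'script') {
      el.remove();
      continue;
    }
    // Remove event handler attributes such as onclick or onload.
    for (const attr of Array.from(el.attributes)) {
      if (attr.name.toLowerCase().startsWith('on')) {
        el.removeAttribute(attr.name);
      }
    }
    stripActiveContent(el);
  }
  return root;
}
```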
--
Hallvord R. M. Steen, Core Tester, Opera Software
http://www.opera.com http://my.opera.com/hallvors/