Alright, I think we agree that <template> is as far as we can
get for now, and that we have to mind the specs!
Thanks Craig.
On 4/7/19 2:01 PM, Craig Francis wrote:
Hi Joris,
I suspect it's just how the web has developed, where the mixing of
JavaScript and imperfect HTML is normal.
I quite like this video as a demo:
https://www.youtube.com/watch?v=lG7U3fuNw3A
Where I think your point is raised when comparing the different
parsing of:
1) <div><script title="</div>">
2) <script><div title="</script">
My favourite exploit is very similar...
*<script>*
user_name = "Craig*</script>*Hello";
</script>
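A minimal sketch of one way to avoid this, assuming the value is written into an inline <script> as JSON: escape `<`, `>` and `&` as \u escapes, so the HTML parser never sees a literal `</script>` inside the string. The helper name `jsonForInlineScript` is hypothetical, not from any library.

```javascript
// Hypothetical helper: serialise a value so it can be placed inside an
// inline <script> without the HTML parser seeing "</script>" or "<!--".
function jsonForInlineScript(value) {
  return JSON.stringify(value)
    .replace(/</g, '\\u003c')   // hides </script> and <!-- from the HTML parser
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026');
}

var encoded = jsonForInlineScript('Craig</script>Hello');
console.log('user_name = ' + encoded + ';');
```

The JavaScript engine still decodes the \u escapes back to the original characters, so the value of user_name is unchanged; only the HTML parser's view of the source differs.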
Personally I'd like to say to the browser, similar to the old/obsolete
<plaintext> element, you won't find any JavaScript code after this
point (maybe it can block all scripts in the <body>?)... but this is
only because I load my JS files in the <head>, and attach event
listeners after DOMContentLoaded, but I know so few developers do
this, so it won't be useful to add.
I think this is the main reason Content Security Policy came into
existence, where I can skip "unsafe-inline" to block any inline
JavaScript, and limit the JavaScript files that can be included.
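As a rough sketch of such a policy, assuming scripts are only served from the site's own origin: the header simply leaves 'unsafe-inline' out of script-src. The helper `buildCsp` and the chosen directives are illustrative assumptions, not a recommended policy for any particular site.

```javascript
// Hypothetical helper: build a Content-Security-Policy header value from a
// map of directive name -> list of allowed sources.
function buildCsp(directives) {
  return Object.keys(directives)
    .map(function (name) {
      return [name].concat(directives[name]).join(' ');
    })
    .join('; ');
}

var csp = buildCsp({
  'default-src': ["'none'"],
  'script-src': ["'self'"],   // no 'unsafe-inline', so inline scripts are blocked
  'style-src': ["'self'"]
});
console.log('Content-Security-Policy: ' + csp);
```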
You can kind of get an idea of what happens with the browser parsing
by using JavaScript to load the HTML into a <template> element... but
that does raise the question on how you get the unsafe variables to
the JavaScript in the first place.
As an aside, I use <meta name="js_data" content="..." /> tags...
sometimes with JSON encoded data in the content attribute, where I'd
use something like the following to get the content:
    var my_data = document.querySelector('meta[name="js_data"]');
    if (my_data) {
        try {
            my_data = JSON.parse(my_data.getAttribute('content'));
        } catch (e) {
            my_data = null;
        }
    }
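For the other half (writing the tag), here is a sketch of how the content attribute could be produced, assuming the page is generated by JavaScript on the server. The helper names `escapeAttr` and `metaTag` are hypothetical; the point is only that the JSON is HTML-escaped and the attribute is always quoted.

```javascript
// Escape a string for use inside a double-quoted HTML attribute value.
// "&" is replaced first so the other entities are not double-escaped.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: build the <meta> tag that carries JSON encoded data.
function metaTag(name, data) {
  return '<meta name="' + escapeAttr(name) + '" content="' +
    escapeAttr(JSON.stringify(data)) + '" />';
}

var tag = metaTag('js_data', { user_name: 'Craig</script>"Hello' });
console.log(tag);
```

The browser decodes the entities when getAttribute('content') is called, so JSON.parse() receives the original JSON text.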
But going forwards, the HTML5 spec does cover how the browser (and
third-party libraries) should be parsing imperfect HTML, so hopefully
these differences will reduce (but I don't imagine they will all be
perfectly aligned, in the same way different browsers aren't).
Craig
On Sun, 7 Apr 2019 at 09:00, joris <joris.gutj...@gmail.com> wrote:
I agree, that would be a vulnerability.
But I think this is not the core of my question.
Why do Web developers have to guess what the Browser
considers to be JS and will execute, and what it won't?
Why can't they just ask the Web Browser to do that
for them?
That would be more secure, because
every third-party library parses somewhat differently
from the Web Browsers it is used with.
On 4/6/19 12:51 PM, Craig Francis wrote:
I quite like the simplicity of this idea; it kind of
reminds me of the @inert attribute.
My main concern is how it could be bypassed. Take the code:
<div noscripts="true"><?= $unsafe_user_name ?></div>
Where the attacker can set their username to
`X*</div>*<script>evil_code</script><div>`
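To make the bypass concrete, here is a small sketch that interpolates that username into the template and checks where the <script> ends up. It uses plain string searches rather than a real HTML parser, and the `render` function is a hypothetical stand-in for the PHP template above.

```javascript
// Hypothetical stand-in for the server-side template.
function render(unsafeUserName) {
  return '<div noscripts="true">' + unsafeUserName + '</div>';
}

var html = render('X</div><script>evil_code</script><div>');

// Position of the first closing </div> and of the injected <script>.
var firstClose = html.indexOf('</div>');
var scriptStart = html.indexOf('<script>');

// The <script> starts after the noscripts div has already been closed,
// so it would sit outside the protected region.
console.log(html);
console.log(scriptStart > firstClose);
```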
---
Unfortunately, I think this is why we need to work with more
complicated/advanced solutions...
We need to sanitise all strings that are included in the HTML on
the server side - e.g. using templating systems; or passing the
string through something like HTML Purifier:
http://htmlpurifier.org/
Or, and you have to be careful here... escaping all HTML output
through functions like htmlentities() / htmlencode(). This alone
does not fix `<a href=<?= htmlentities($unsafe_url) ?>>`, because
the url can start with `javascript:`, or can take advantage of the
missing quotation marks on the attribute via ` onclick=evil_code`.
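A sketch of what the extra checks for a URL attribute might look like, written in JavaScript rather than PHP: only http/https schemes are allowed, the value is escaped, and the attribute is always quoted. The names `escapeHtml` and `safeHrefAttr` are hypothetical, and the base URL is just a placeholder so that relative links can be parsed.

```javascript
// Escape the characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: return a quoted, escaped href attribute, or null
// when the URL does not parse or uses a scheme other than http/https.
function safeHrefAttr(unsafeUrl) {
  var parsed;
  try {
    // Placeholder base URL (an assumption) so relative links still parse.
    parsed = new URL(unsafeUrl, 'https://example.com/');
  } catch (e) {
    return null;
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return null;
  }
  return 'href="' + escapeHtml(unsafeUrl) + '"';
}

console.log(safeHrefAttr('javascript:alert(1)'));
console.log(safeHrefAttr('x onclick=evil_code'));
console.log(safeHrefAttr('/profile?id=1&x=2'));
```

With the quotes in place, the ` onclick=evil_code` value stays inside the href instead of becoming a second attribute.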
And when working with strings in JavaScript - you should use safe
methods like `element.textContent`, or pass them through something
to sanitise the HTML (both in removing the many ways JavaScript
can be included, but also just making sure the HTML is well formed):
https://github.com/google/closure-library/blob/master/closure/goog/html/sanitizer/htmlsanitizer.js
https://github.com/punkave/sanitize-html
Then you would ideally add a Content Security Policy to limit the
scripts on the page, just in case you miss something.
https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
And as an extra bonus, start playing with the (currently in
development) Trusted Types, to make sure you aren't using unsafe
things like element.innerHTML.
https://developers.google.com/web/updates/2019/02/trusted-types
Or for even more fun (pain), on your local development server,
try setting the header:
Content-Type: application/xhtml+xml; charset=UTF-8
Do not do this on live, as any bad formatting of your HTML will
break the page - but this ensures all of your attributes are
quoted, and all of your tags are perfectly nested (this includes
`<br>` needing to be `<br />`, the attribute `selected` needing
to be `selected="selected"`, etc).
Craig
On Fri, 5 Apr 2019 at 23:47, Yog Bii <joris.gutj...@gmail.com> wrote:
XSS prevention is a very important and costly part of a
website's security.
Currently, XSS is prevented by matching for JS in user input,
which is then either blocked or masked by the Web Developer,
each on their own site. XSS attacks exploit differences between
the matching done by the Web Developer and the Browser, such
that the Web Developer's matching doesn't recognize JS as JS,
but the Browser executes it.
This is a constant fight between the Web Developer and the
XSS attacker, and it costs many resources that are needed
elsewhere. This fight also favors larger businesses over small
Web developers.
I think this fight can be ended by no longer making the
Web Developer guess what the Browser may consider to be JS,
and instead letting them tell the Browser explicitly that a
given region shouldn't contain any code.
The Browser then behaves in that region as if
JS were disabled.
I would do that with a new attribute, called noscripts.
Inside an HTML element with noscripts="true",
the Browser handles everything inside that element as if
JS were disabled globally.
An example HTML would look like this:
<!doctype html>
<html>
...
<div noscripts="true">
  <script>
    // No danger from unescaped <script> tags
  </script>
  <button onclick="nor by Event listeners">Click me</button>
</div>
...
</html>
If you know a way to do this without any differences between
what the Browser executes and whatever that mechanism lets
pass, let me know, and let me know why it isn't taught in every
HTML/JS tutorial and in all documentation about Web Development.
_______________________________________________
dev-security mailing list
dev-security@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-security