Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Sun, Jun 1, 2014 at 8:58 AM, Glenn Maynard <gl...@zewt.org> wrote:

> But again, image decoding *can't* be done efficiently in script:
> platform-independent code with performance competitive with native SIMD
> assembly is a thing of myth. (People have been trying unsuccessfully to
> do that since day one of MMX, so it's irrelevant until the day it
> actually happens.) Anyhow, I think I'll stop helping to derail this
> thread and return to the subject.

I believe that a spec-conforming canvas implementation must support PNG, so a PNG encoder/decoder is required. If others want to replace their native libs (libpng, libjpeg-turbo, and so on) with JS implementations of the same, well, that's up to them. It won't be happening in Chrome anytime soon for several reasons: speed, memory use, and security come to mind. But agreed, let's return to the subject :)

> Noel, if you're still around, I'd suggest fleshing out your suggestion
> by providing some real-world benchmarks that compare the PNG compression
> rates against the relative time it takes to compress. If spending 10x
> the compression time gains you a 50% improvement in compression, that's
> a lot more compelling than if it only gains you 10%. I don't know what
> the numbers are myself.
For the test case attached, and https://codereview.chromium.org/290893002

  compression 0.0, time 0.230500 ms, toDataURL length 2122
  compression 0.1, time 0.209900 ms, toDataURL length 1854
  compression 0.2, time 0.215200 ms, toDataURL length 1850
  compression 0.3, time 0.231100 ms, toDataURL length 1774
  compression 0.4, time 0.518100 ms, toDataURL length 1498
  compression 0.5, time 0.532000 ms, toDataURL length 1494
  compression 0.6, time 0.612600 ms, toDataURL length 1474
  compression 0.7, time 0.727750 ms, toDataURL length 1470
  compression 0.8, time 1.511150 ms, toDataURL length 1334
  compression 0.9, time 3.138100 ms, toDataURL length 1298
  compression 1.0, time 3.182050 ms, toDataURL length 1298

I'd be careful using compression rates / encoding times as figures of merit, though -- those depend on the source material (the input to the PNG encoder). Given incompressible source material, PNG encoding cannot gain any compression at all.

The question (for me) is whether developers should be allowed to control the compression using a pre-existing API. The browser has a default compression value; it's a compromise that ... surprise, surprise ... doesn't always meet developer expectations [1].

~noel

[1] https://bugs.webkit.org/show_bug.cgi?id=54256
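(A sketch of how such a benchmark could be written. Note that the second argument to toDataURL("image/png", ...) is the proposal under discussion, not shipped behavior, and the helper name dataURLByteLength is my own.)

```javascript
// Sketch only: benchmark the proposed compression argument to
// canvas.toDataURL("image/png", compression), 0.0..1.0. The "image/png"
// second argument is hypothetical -- it is the feature being proposed.

// Decoded payload size of a base64 data URL, in bytes.
function dataURLByteLength(url) {
  const comma = url.indexOf(",");
  const payload = url.slice(comma + 1);
  const padding = (payload.match(/=+$/) || [""])[0].length;
  return (payload.length * 3) / 4 - padding;
}

// Browser-only part: time each compression setting.
if (typeof document !== "undefined") {
  const canvas = document.createElement("canvas");
  canvas.width = canvas.height = 64;
  canvas.getContext("2d").fillRect(8, 8, 48, 48);
  for (let c = 0; c <= 1.0; c += 0.1) {
    const t0 = performance.now();
    const url = canvas.toDataURL("image/png", c); // proposed 2nd argument
    const ms = performance.now() - t0;
    console.log(c.toFixed(1), ms.toFixed(3) + " ms",
                dataURLByteLength(url) + " bytes");
  }
}
```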
[whatwg] Stricter data URL policy
At the moment data URLs inherit the origin of the context that fetches them. This is already not the case in Chrome, and we'd like it to no longer be the case in Gecko either. https://bugzilla.mozilla.org/show_bug.cgi?id=1018872 is tracking this.

The reasoning is that data URLs require being careful with a URL handed to you, whereas most other URLs do not: if you put one in an iframe or worker, it can leak information from your origin to a third party.

The proposal is to add a flag to Fetch with regards to origin inheritance: the "same-origin data URL flag". This is set by img and XMLHttpRequest, but not by iframe. For iframe we'd require an allowsameorigindataurl attribute, and even then the flag would only be set for the initial fetch, not after the iframe has been navigated.

Workers might be harder, as there might be content relying on workers working with data URLs. That needs to be investigated.

I'll be updating Fetch shortly with this new policy. I hope HTML can be similarly aligned, or at least that we come to an agreement here on the above plan (I can imagine HTML might want to wait until it integrates with Fetch in general).

-- 
http://annevankesteren.nl/
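(For illustration, the proposed flag could be modeled as a small predicate. All names here -- inheritsOrigin, the initiator values, allowSameOriginDataURL, initialFetch -- are my own shorthand for the plan above, not spec text.)

```javascript
// Illustrative model of the proposed "same-origin data URL flag":
// data URLs only inherit the fetching context's origin when the fetch
// explicitly opts in, and never after a redirect.
function inheritsOrigin(request) {
  if (!request.url.startsWith("data:")) return true; // normal rules elsewhere
  if (request.redirected) return false;              // always unset after any redirect
  switch (request.initiator) {
    case "img":
    case "xhr":
      return true;  // flag set by img and XMLHttpRequest
    case "iframe":
      // only the initial src fetch, and only with the proposed opt-in attribute
      return request.allowSameOriginDataURL === true &&
             request.initialFetch === true;
    default:
      return false; // e.g. workers: still under investigation
  }
}
```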
Re: [whatwg] Stricter data URL policy
On Mon, Jun 2, 2014 at 11:19 AM, Anne van Kesteren <ann...@annevk.nl> wrote:

> Workers might be harder as there might be content relying on workers
> working with data URLs. That needs to be investigated.

Per http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=3046 (thanks Simon), workers throw in Chrome when constructed from a data URL, so it seems like we have an opportunity there.

-- 
http://annevankesteren.nl/
Re: [whatwg] Stricter data URL policy
On 6/2/14, 5:19 AM, Anne van Kesteren wrote:

> This is not the case in Chrome and we'd like this to be no longer the
> case in Gecko.

Note that it's not clear to me what "we" means in this case. For example, I'm unconvinced we want to change Gecko behavior here.

> And then it would only be set for the initial fetch, not after the
> iframe has been navigated.

More precisely, would it be set for loads due to the iframe's src changing but not for ones due to link clicks and location changes? Or do you really mean only for the initial fetch and not later changes to @src?

> I'll be updating Fetch shortly with this new policy

This seems fine, since we want it no matter what; the only disagreement is about when that flag should be set, right?

-Boris
Re: [whatwg] Stricter data URL policy
On Mon, Jun 2, 2014 at 2:48 PM, Boris Zbarsky <bzbar...@mit.edu> wrote:
> On 6/2/14, 5:19 AM, Anne van Kesteren wrote:
> > This is not the case in Chrome and we'd like this to be no longer the
> > case in Gecko.
>
> Note that it's not clear to me what "we" means in this case. For
> example, I'm unconvinced we want to change Gecko behavior here.

You're not persuaded by the attack scenario?

> > And then it would only be set for the initial fetch, not after the
> > iframe has been navigated.
>
> More precisely, would it be set for loads due to the iframe's src
> changing but not for ones due to link clicks and location changes? Or do
> you really mean only for the initial fetch and not later changes to
> @src?

Actual changes to @src seem fine, since they are under the control of the page. (At least as much as the allowsameorigindataurl attribute.)

> > I'll be updating Fetch shortly with this new policy
>
> This seems fine, since we want it no matter what; the only disagreement
> is about when that flag should be set, right?

Provided we agree that it is always unset after any redirect, yes.

-- 
http://annevankesteren.nl/
Re: [whatwg] Stricter data URL policy
On 6/2/14, 9:00 AM, Anne van Kesteren wrote:

> You're not persuaded by the attack scenario?

Correct. I mean, the same scenario applies to srcdoc, document.write() into an iframe, etc. Why are data URLs special?

> Provided we agree that it is always unset after any redirect, yes.

We agree on that.

-Boris
Re: [whatwg] Stricter data URL policy
On Mon, Jun 2, 2014 at 3:03 PM, Boris Zbarsky <bzbar...@mit.edu> wrote:
> On 6/2/14, 9:00 AM, Anne van Kesteren wrote:
> > You're not persuaded by the attack scenario?
>
> Correct. I mean, the same scenario applies to srcdoc, document.write()
> into an iframe, etc. Why are data URLs special?

The attack is the URL itself. A developer has to specifically consider data URLs and realize their implications. Other URLs will do the right thing and not run potentially hostile code that steals same-origin data.

> > Provided we agree that it is always unset after any redirect, yes.
>
> We agree on that.

Great!

-- 
http://annevankesteren.nl/
Re: [whatwg] Stricter data URL policy
On 6/2/14, 10:15 AM, Anne van Kesteren wrote:

> The attack is the URL. A developer has to specifically consider data
> URLs and realize their implications.

Note that this is already true for javascript: URLs in @src values (but not in location sets and the like, I agree).

-Boris
Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Sat, May 31, 2014 at 8:44 AM, Robert O'Callahan <rob...@ocallahan.org> wrote:
> On Sat, May 31, 2014 at 3:44 AM, Justin Novosad <ju...@google.com> wrote:
> > My point is, we need a proper litmus test for the "just do it in
> > script" argument because, let's be honest, a lot of new features being
> > added to the Web platform could be scripted efficiently, and that does
> > not necessarily make them bad features.
>
> Which ones?
>
> Rob

The examples I had in mind when I wrote that were Path2D and HitRegions.
Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Sat, May 31, 2014 at 1:46 PM, Glenn Maynard <gl...@zewt.org> wrote:
> On Fri, May 30, 2014 at 1:25 PM, Justin Novosad <ju...@google.com> wrote:
> > I think this proposal falls short of enshrining. The cost of adding
> > this feature is minuscule.
>
> I don't think the cost is ever really minuscule.

https://codereview.chromium.org/290893002

> True, you'd never want to use toDataURL with a compression operation
> that takes many seconds (or even tenths of a second) to complete, and
> data URLs don't make sense for large images in the first place. It makes
> sense for toBlob(), though, and having the arguments to toBlob and
> toDataURL be different seems like gratuitous inconsistency. Yes, toBlob
> is async, but it can still be polyfilled.
>
> (I'm not sure how this replies to what I said--this feature makes more
> sense for toBlob than toDataURL, but I wouldn't add it to toBlob and not
> toDataURL.)

What I meant is that I agree that adding the compression argument to toBlob answers the need for an async API (being synchronous was one of the criticisms of the original proposal, which only mentioned toDataURL). However, this does not address the other criticism that we should not add features to toDataURL (and by extension to toBlob) because the new functionality could be implemented more or less efficiently in JS.
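(Since the thread notes that toBlob can be polyfilled on top of toDataURL, here is a rough sketch of such a polyfill. The helper name dataURLToBlob and the setTimeout deferral are my own choices, not anything the thread specifies.)

```javascript
// Sketch: polyfill canvas.toBlob() using the synchronous toDataURL().
// The data-URL-to-Blob conversion is the interesting part.
function dataURLToBlob(dataURL) {
  const comma = dataURL.indexOf(",");
  const meta = dataURL.slice(5, comma);           // e.g. "image/png;base64"
  const type = meta.split(";")[0] || "text/plain";
  const binary = atob(dataURL.slice(comma + 1));  // base64 payload -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new Blob([bytes], { type });
}

// Defer the callback so the polyfilled API stays observably async,
// matching the real toBlob's asynchronous contract.
if (typeof HTMLCanvasElement !== "undefined" &&
    !HTMLCanvasElement.prototype.toBlob) {
  HTMLCanvasElement.prototype.toBlob = function (callback, type, quality) {
    const url = this.toDataURL(type, quality);
    setTimeout(() => callback(dataURLToBlob(url)), 0);
  };
}
```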
Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Mon, Jun 2, 2014 at 10:05 AM, Justin Novosad <ju...@google.com> wrote:
> On Sat, May 31, 2014 at 8:44 AM, Robert O'Callahan <rob...@ocallahan.org> wrote:
> > On Sat, May 31, 2014 at 3:44 AM, Justin Novosad <ju...@google.com> wrote:
> > > My point is, we need a proper litmus test for the "just do it in
> > > script" argument because, let's be honest, a lot of new features
> > > being added to the Web platform could be scripted efficiently, and
> > > that does not necessarily make them bad features.
> >
> > Which ones?
>
> The examples I had in mind when I wrote that were Path2D

Crossing the JS boundary is still an issue, so implementing this in pure JS would be too slow. Path2D is only there to minimize DOM calls.

> and HitRegions.

I agree that most of hit regions can be implemented using JS. The reason for hit regions is a11y, and people felt a feature that just does accessibility would end up unused or unimplemented.
Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Mon, Jun 2, 2014 at 10:16 AM, Justin Novosad <ju...@google.com> wrote:
> On Sat, May 31, 2014 at 1:46 PM, Glenn Maynard <gl...@zewt.org> wrote:
> > On Fri, May 30, 2014 at 1:25 PM, Justin Novosad <ju...@google.com> wrote:
> > > I think this proposal falls short of enshrining. The cost of adding
> > > this feature is minuscule.
> >
> > I don't think the cost is ever really minuscule.
>
> https://codereview.chromium.org/290893002

That's implementation cost to you :-) Now we need to convince the other vendors. Do they want it, want more, or want it in a different way? Then it needs to be documented. How can authors discover that this is supported? How can it be polyfilled?

> > True, you'd never want to use toDataURL with a compression operation
> > that takes many seconds (or even tenths of a second) to complete, and
> > data URLs don't make sense for large images in the first place. It
> > makes sense for toBlob(), though, and having the arguments to toBlob
> > and toDataURL be different seems like gratuitous inconsistency. Yes,
> > toBlob is async, but it can still be polyfilled.
> >
> > (I'm not sure how this replies to what I said--this feature makes more
> > sense for toBlob than toDataURL, but I wouldn't add it to toBlob and
> > not toDataURL.)
>
> What I meant is that I agree that adding the compression argument to
> toBlob answers the need for an async API (being synchronous was one of
> the criticisms of the original proposal, which only mentioned
> toDataURL). However, this does not address the other criticism that we
> should not add features to toDataURL (and by extension to toBlob)
> because the new functionality could be implemented more or less
> efficiently in JS.
Re: [whatwg] Stricter data URL policy
On Mon, 2 Jun 2014, Anne van Kesteren wrote:

> At the moment data URLs inherit the origin of the context that fetches
> them.

To be precise, the origin of data: URLs themselves is the unique origin. It's the origin of resources that come from data: URLs that's different.

-- 
Ian Hickson
http://ln.hixie.ch/
"Things that are impossible just take longer."
Re: [whatwg] Proposal: toDataURL “image/png” compression control
On Mon, Jun 2, 2014 at 12:49 PM, Rik Cabanier <caban...@gmail.com> wrote:

> That's implementation cost to you :-) Now we need to convince the other
> vendors. Do they want it, want more, want it in a different way? Then it
> needs to be documented. How can authors discover that this is supported?
> How can it be polyfilled?

Polyfill isn't really an issue, since this is just a browser hint. We definitely need a way to feature-test option arguments, but we should start another thread for that.

This needs a bit more guidance in the spec as far as what different numbers mean. A quality number of 0-1 with JPEG is fairly well understood -- you won't always get the same result, but nobody interprets 1 as "spend 90 seconds trying as hard as you possibly can to make the image smaller". There's no common understanding for PNG compression levels, though, and there's a wide variety of ways you can try harder to compress a PNG, with wildly different space/time tradeoffs. In order of cost:

- Does 0 mean "output a PNG as quickly as possible, even if it results in zero compression"?
- What number means "be quick, but don't turn off compression entirely"?
- What number means "use a reasonable tradeoff", e.g. the default today?
- What number means "prefer smaller file sizes, but I'm expecting on the order of 25% extra time cost, not 1500%"?
- Does 1 mean "spend two minutes if you want, make the image as small as you can"? (pngcrush does this, and Photoshop in some versions does this -- which is incredibly annoying, by the way.)

If there's no guidance given at all, 0 might mean either of the first two, and 1 might mean either of the last two.

My suggestion is an enum with three values -- "fast", "normal", "small" -- with non-normative spec guidance suggesting that "fast" means "make the compression faster if possible at the cost of file size, but don't go overboard and turn compression off entirely", and "small" means "spend a bit more time if it helps create a smaller file, but don't go overboard and spend 15x as long".
If we want to support the other two, they can be added later (e.g. "uncompressed" and "crush"). Since this is only a hint, implementations can choose which ones to implement; if the given value isn't known, fall back on the default.

A normative requirement for all PNG compression is that it should always round-trip the RGBA value for each pixel. That means that -- regardless of this option -- a UA can use paletted output only if the image's colors fit in a palette, and it prohibits things like clamping pixels with a zero alpha to #00, which is probably one strategy for improving compression (but if you're compressing non-image data, like helper textures for WebGL, you don't want that).

On Mon, Jun 2, 2014 at 1:23 PM, Nils Dagsson Moskopp <n...@dieweltistgarnichtso.net> wrote:

> As an author, I do not see why I should ever want to tell a browser
> losslessly encoding an image any quality argument other than "maximum
> speed" or "minimum size" -- on a cursory look, anything else would
> probably not be interoperable. Also, is 0.5 the default value?

Image compression is uninteroperable from the start, in the sense that each UA can always come up with a different output file. This feature (and the JPEG quality level feature) doesn't make it worse.

-- 
Glenn Maynard
Re: [whatwg] Stricter data URL policy
On Mon, Jun 2, 2014 at 6:03 AM, Boris Zbarsky <bzbar...@mit.edu> wrote:
> On 6/2/14, 9:00 AM, Anne van Kesteren wrote:
> > You're not persuaded by the attack scenario?
>
> Correct. I mean, the same scenario applies to srcdoc, document.write()
> into an iframe, etc. Why are data URLs special?

srcdoc is like eval(). Yes, it's definitely a tool that enables you to run 3rd-party code in your own context and with your own principal. However, whenever you use the feature you (should) know that it's running code in your own context and with your own principal. So hopefully pages will make sure not to pass untrusted 3rd-party code to either srcdoc or eval().

We've seen this happen internally in Gecko, where chrome code gets XSSed by being tricked into loading data URLs. And I've been trying to move us towards only allowing data: to run with a chrome principal if chrome code explicitly opts in to that. I don't see why websites wouldn't face the same challenges, or why the same solution wouldn't work there.

/ Jonas