Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Friday, January 23, 2015, Darin Adler da...@apple.com wrote: On Jan 22, 2015, at 10:36 PM, Alexey Proskuryakov a...@webkit.org javascript:; wrote: 22 янв. 2015 г., в 17:57, Darin Adler da...@apple.com javascript:; написал(а): What about the test I cited? svg/css/svg-resource-fragment-identifier-img-src.html This particular test is buggy - it is a hidpi test, so it dumps results as a 1600x1200 image, but its -expected.html is not hidpi, and is dumped as 800x600, so hashes are obviously different. I now filed https://bugs.webkit.org/show_bug.cgi?id=140815 about this test. When we compare pixels, we draw both images into bitmaps of the same size, so they become similar enough for ImageDiff to consider them identical. Earlier today, Simon and I agreed that we should just silence the error message, because it only tells us about minor color differences that are inevitable when comparing compositing vs. non-compositing. Looks like it tells us about more actionable issues, too. Also, I just found https://bugs.webkit.org/show_bug.cgi?id=69444, and I think that its rationale applies in this case, too. So we should probably keep this error message. I'm not sure whether we should make it a hard error though. I think that to improve things we should make an informative message for this particular mistake where the expected file has the wrong size or resolution. For me, the bad thing about the current message is not simply that it’s a false positive, but that it’s an unclear error message that covers too many different possibilities. I’m not sure that keeping things as is will be a good strategy. A message that is often expected but sometimes indicates a real problem is not good for the project. The average engineer has no idea whether to ignore these or act on them! We could add a new test expectation like ImageDiff to suppress these or we could expose a new internals or testRunner methods to mark a test as such. - R. Niwa -- - R. Niwa ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Jan 23, 2015, at 10:01 AM, Ryosuke Niwa rn...@webkit.org wrote: On Friday, January 23, 2015, Darin Adler da...@apple.com mailto:da...@apple.com wrote: On Jan 22, 2015, at 10:36 PM, Alexey Proskuryakov a...@webkit.org javascript:; wrote: 22 янв. 2015 г., в 17:57, Darin Adler da...@apple.com javascript:; написал(а): What about the test I cited? svg/css/svg-resource-fragment-identifier-img-src.html This particular test is buggy - it is a hidpi test, so it dumps results as a 1600x1200 image, but its -expected.html is not hidpi, and is dumped as 800x600, so hashes are obviously different. I now filed https://bugs.webkit.org/show_bug.cgi?id=140815 https://bugs.webkit.org/show_bug.cgi?id=140815 about this test. When we compare pixels, we draw both images into bitmaps of the same size, so they become similar enough for ImageDiff to consider them identical. Earlier today, Simon and I agreed that we should just silence the error message, because it only tells us about minor color differences that are inevitable when comparing compositing vs. non-compositing. Looks like it tells us about more actionable issues, too. Also, I just found https://bugs.webkit.org/show_bug.cgi?id=69444 https://bugs.webkit.org/show_bug.cgi?id=69444, and I think that its rationale applies in this case, too. So we should probably keep this error message. I'm not sure whether we should make it a hard error though. I think that to improve things we should make an informative message for this particular mistake where the expected file has the wrong size or resolution. For me, the bad thing about the current message is not simply that it’s a false positive, but that it’s an unclear error message that covers too many different possibilities. I’m not sure that keeping things as is will be a good strategy. A message that is often expected but sometimes indicates a real problem is not good for the project. The average engineer has no idea whether to ignore these or act on them! Agreed. The size difference certainly needs to be a failure. We could add a new test expectation like ImageDiff to suppress these or we could expose a new internals or testRunner methods to mark a test as such. I don’t think that addresses the issue. Darin’s point is that if a developer sees this output (possibly when running tests after making a change), it needs to describe something actionable. Just suppressing this output for existing tests doesn’t help when the output crops up unexpectedly. Simon ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
23 янв. 2015 г., в 10:01, Ryosuke Niwa rn...@webkit.org написал(а): We could add a new test expectation like ImageDiff to suppress these or we could expose a new internals or testRunner methods to mark a test as such. Good idea! An ImageHashMismatch expectation seems like a reasonable way to silence the noise. Also, I am surprised that we weren't getting a specific message about size mismatch here, as Darin suggested. In fact, we used to have this very output on other tests until very recently (see https://bugs.webkit.org/show_bug.cgi?id=139884). - Alexey ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Jan 22, 2015, at 10:36 PM, Alexey Proskuryakov a...@webkit.org wrote: 22 янв. 2015 г., в 17:57, Darin Adler da...@apple.com написал(а): What about the test I cited? svg/css/svg-resource-fragment-identifier-img-src.html This particular test is buggy - it is a hidpi test, so it dumps results as a 1600x1200 image, but its -expected.html is not hidpi, and is dumped as 800x600, so hashes are obviously different. I now filed https://bugs.webkit.org/show_bug.cgi?id=140815 about this test. When we compare pixels, we draw both images into bitmaps of the same size, so they become similar enough for ImageDiff to consider them identical. Earlier today, Simon and I agreed that we should just silence the error message, because it only tells us about minor color differences that are inevitable when comparing compositing vs. non-compositing. Looks like it tells us about more actionable issues, too. Also, I just found https://bugs.webkit.org/show_bug.cgi?id=69444, and I think that its rationale applies in this case, too. So we should probably keep this error message. I'm not sure whether we should make it a hard error though. I think that to improve things we should make an informative message for this particular mistake where the expected file has the wrong size or resolution. For me, the bad thing about the current message is not simply that it’s a false positive, but that it’s an unclear error message that covers too many different possibilities. I’m not sure that keeping things as is will be a good strategy. A message that is often expected but sometimes indicates a real problem is not good for the project. The average engineer has no idea whether to ignore these or act on them! — Darin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Thu, Jan 22, 2015 at 1:56 PM, Darin Adler da...@apple.com wrote: On Jan 22, 2015, at 12:58 PM, Dirk Pranke dpra...@chromium.org wrote: Of course, Chromium was more willing to incur the cost of keeping all of the pixel tests up to date when trivial things changed the output This comment, and much of the discussion, seems to be about pixel tests. But what does this have to do with reference tests? The message has nothing to do with whether the test is a pixel test or a ref test. My reply was intended to explain why we had the hashes didn't match but diff passed message, and the discrepancy around fuzzy matching; on the Chromium ports, if run-webkit-tests had printed this message, it would've served as a warning to the user/developer that there was something weird going on. If you all decide you wanted to keep fuzzy matching enabled (and hence that message would happen regularly and could be safely ignored), I would either delete the message outright or at least make it be a debug message. You could also decide you wanted to do fuzzy matching only on ref tests, or only on pixel tests, and change the logic accordingly, of course. -- Dirk ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Jan 22, 2015, at 11:30 AM, Simon Fraser simon.fra...@apple.com wrote: For example, some of the tests in question render a green box either via CALayers, or by painting, and there’s a slight color difference between the two code paths. What about the test I cited? svg/css/svg-resource-fragment-identifier-img-src.html I don’t think the layer vs. painting explains the difference. Is there any reason to expect that one to have a “close but not identical” result? — Darin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
On Jan 22, 2015, at 8:44 AM, Darin Adler da...@apple.com wrote: I noticed that when running tests I get messages like this: svg/css/svg-resource-fragment-identifier-img-src.html - ref test hashes didn't match but diff passed I understand that we can have hash mismatches in actual checked in images (expected.png files), and there, at least, I believe the fix might be to regenerate the image. But for a ref test I am really unclear on this. And further, I think it’s distracting to get messages like this unless you are specifically working on the testing infrastructure. Why should the average WebKit engineer care about this? Does this reflect a real performance problem when running tests? If so, who would fix it and what would that person do? Alexey, Tim and I looked at this yesterday. This happens when the expected and actual images are very close, but not identical. ImageDiff has some built-in rounding that effectively acts as a small tolerance, so the hashes are different, but ImageDiff incorrectly says the images are the same. For example, some of the tests in question render a green box either via CALayers, or by painting, and there’s a slight color difference between the two code paths. My preference for how to fix this would be to fix ImageDiff to remove its slight built-in tolerance, and then to expose testRunner API to allow a test to set an explicit tolerance. There are many cases where we’d like to use ref tests, but are unable to because of slight, justifiable rendering differences, and having an explicit tolerance would permit the use of ref tests in these cases. Simon ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
22 янв. 2015 г., в 11:30, Simon Fraser simon.fra...@apple.com написал(а): This happens when the expected and actual images are very close, but not identical. ImageDiff has some built-in rounding that effectively acts as a small tolerance, so the hashes are different, but ImageDiff incorrectly says the images are the same. For example, some of the tests in question render a green box either via CALayers, or by painting, and there’s a slight color difference between the two code paths. My preference for how to fix this would be to fix ImageDiff to remove its slight built-in tolerance, and then to expose testRunner API to allow a test to set an explicit tolerance. There are many cases where we’d like to use ref tests, but are unable to because of slight, justifiable rendering differences, and having an explicit tolerance would permit the use of ref tests in these cases. One thing about tolerance is that it is super confusing - are we talking about the number of pixels that are different, or about how different the pixels are? Also, a lot of failures only cause small differences in pixel results. Even a 100x100 box that becomes red instead of green is only a small portion of the 800x600 image, and it's even more the case for tests that check e.g. text rendering. It is not currently known what the root causes are for the tests that say ref test hashes didn't match but diff passed. Given that the differences are very tiny, one guess is that even though compositing and non-compositing code paths are mathematically equivalent, there are different rendering steps taken, and rounding at each step adds up to slight differences. Another theory is that we have actual bugs, such as with color management. If it's just rounding differences, then the right thing to do is probably to silence the console output, keeping behavior the same otherwise. - Alexey ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
[webkit-dev] run-webkit-tests question; hashes when comparing ref test output
I noticed that when running tests I get messages like this: svg/css/svg-resource-fragment-identifier-img-src.html - ref test hashes didn't match but diff passed I understand that we can have hash mismatches in actual checked in images (expected.png files), and there, at least, I believe the fix might be to regenerate the image. But for a ref test I am really unclear on this. And further, I think it’s distracting to get messages like this unless you are specifically working on the testing infrastructure. Why should the average WebKit engineer care about this? Does this reflect a real performance problem when running tests? If so, who would fix it and what would that person do? — Darin ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev
Re: [webkit-dev] run-webkit-tests question; hashes when comparing ref test output
22 янв. 2015 г., в 17:57, Darin Adler da...@apple.com написал(а): What about the test I cited? svg/css/svg-resource-fragment-identifier-img-src.html This particular test is buggy - it is a hidpi test, so it dumps results as a 1600x1200 image, but its -expected.html is not hidpi, and is dumped as 800x600, so hashes are obviously different. I now filed https://bugs.webkit.org/show_bug.cgi?id=140815 about this test. When we compare pixels, we draw both images into bitmaps of the same size, so they become similar enough for ImageDiff to consider them identical. Earlier today, Simon and I agreed that we should just silence the error message, because it only tells us about minor color differences that are inevitable when comparing compositing vs. non-compositing. Looks like it tells us about more actionable issues, too. Also, I just found https://bugs.webkit.org/show_bug.cgi?id=69444, and I think that its rationale applies in this case, too. So we should probably keep this error message. I'm not sure whether we should make it a hard error though. - Alexey ___ webkit-dev mailing list webkit-dev@lists.webkit.org https://lists.webkit.org/mailman/listinfo/webkit-dev