Hi,

Can you provide some data on how well these metrics correlate with subjective measurements? I expect each will be more suitable for different types of content, but it would be interesting to know how they perform.
BR,

On Thu, Feb 26, 2015 at 12:11 AM, Thomas Daede <[email protected]> wrote:
> To start the discussion, here is a brief overview of the four metrics we
> currently use in Daala. The reference code is in the tools/dump_*.c
> files in the Daala repository. Note that all of these metrics are
> applied to the luma plane only.
>
> ## PSNR
>
> PSNR is a traditional signal quality metric, measured in decibels. It is
> directly derived from the mean squared error (MSE), or its square root
> (RMSE). The formula used is:
>
>     20 * log10(MAX / RMSE)
>
> or, equivalently:
>
>     10 * log10(MAX^2 / MSE)
>
> which is the method used in the dump_psnr.c reference implementation.
>
> ## PSNR-HVS-M
>
> The PSNR-HVS metric performs a DCT transform of 8x8 blocks of the image,
> weights the coefficients, and then calculates the PSNR of those
> coefficients. Several different sets of weights have been considered.
> The weights used by the dump_pnsrhvs.c tool have been found to be the
> best match to real MOS scores.
>
> ## SSIM
>
> SSIM (Structural Similarity Image Metric) is a still-image quality
> metric introduced in 2004. It computes a score for each individual
> pixel, using a window of neighboring pixels. These scores can then be
> averaged to produce a global score for the entire image. The original
> paper produces scores ranging between 0 and 1.
>
> For the metric to appear more linear on BD-rate curves, the score is
> converted into a nonlinear decibel scale:
>
>     -10 * log10(1 - SSIM)
>
> ## Fast Multi-Scale SSIM
>
> Multi-Scale SSIM is SSIM extended to multiple window sizes. This is
> implemented by downscaling the image a number of times, and computing
> SSIM over the same number of pixels, then averaging the SSIM scores
> together. The final score is converted to decibels in the same manner
> as SSIM.
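As a quick sanity check on the two dB conversions quoted above, here is a minimal Python sketch. The function names `psnr` and `ssim_db` are mine for illustration; this is not the Daala tools' code, just the formulas from the overview applied directly (assuming 8-bit samples, so MAX = 255).

```python
import math

def psnr(mse, peak=255.0):
    """PSNR in dB from mean squared error: 10 * log10(MAX^2 / MSE).
    peak is the maximum sample value (255 for 8-bit luma)."""
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)

def ssim_db(ssim):
    """Map an SSIM score in [0, 1) onto the nonlinear decibel scale
    used for BD-rate curves: -10 * log10(1 - SSIM)."""
    return -10.0 * math.log10(1.0 - ssim)

print(round(psnr(25.0), 2))      # MSE of 25 on 8-bit samples -> 34.15 dB
print(round(ssim_db(0.99), 1))   # SSIM of 0.99 -> 20.0 dB
```

Note how the dB mapping stretches the top of the SSIM range: going from 0.99 to 0.999 adds another 10 dB, which is what makes near-transparent results distinguishable on a plot.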
>
> On 02/25/2015 01:39 PM, Mo Zanaty (mzanaty) wrote:
> > This is perhaps getting into charter bashing, but I think we will need
> > some early milestone (close to requirements) for an evaluation criteria
> > document that represents the workgroup consensus on comparative testing
> > methodology and selection of solution candidates or specific tools. The
> > set of test sequences will only be one small part of that. Metrics will
> > be a very important part of that. While I agree designing new metrics
> > should probably be beyond the scope of proposed deliverables, I think we
> > likely need a thorough evaluation and discussion of various metrics and
> > reach some consensus on how proposed solutions and tools will be
> > measured and adopted.
> >
> > Mo
> >
> > On 2/25/15, 3:05 PM, Timothy B. Terriberry <[email protected]> wrote:
> >
> > Harald Alvestrand wrote:
> >> psnr values of 35 dB where x264 achieves 40 dB - it seems psnr isn't
> >> particularly sensitive to the resulting blurriness).
> >
> > Yes, it's well-known that PSNR loves low-passing. It's not the only
> > metric that's going to have these problems. FastSSIM will probably be
> > similarly blind. Fixable problems, maybe, but I don't want to get in the
> > business of designing my own metrics. I'm not even sure there's good
> > data on human preferences for when one should downsample, but I haven't
> > spent any time looking.
> >
> > _______________________________________________
> > video-codec mailing list
> > [email protected]
> > https://www.ietf.org/mailman/listinfo/video-codec
>
> _______________________________________________
> video-codec mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/video-codec

--
Mohammed Raad, PhD.
Partner
RAADTECH CONSULTING
P.O. Box 113
Warrawong NSW 2502
Australia
Phone: +61 414451478
Email: [email protected]
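Tim's point that "PSNR loves low-passing" is easy to illustrate numerically: PSNR depends only on the MSE, so a blurred signal and a noisy signal with the same squared-error budget score identically, no matter how differently they look. The toy 1-D example below is my own construction (not from the thread): both distortions of a sharp step edge have the same three error magnitudes, just placed differently, so their PSNRs are equal by definition.

```python
import math

def psnr(ref, deg, peak=255.0):
    """PSNR in dB between two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, deg)) / len(ref)
    return 10.0 * math.log10(peak * peak / mse)

ref     = [0] * 8 + [255] * 8                       # sharp step edge
blurred = [0] * 6 + [64, 128, 191] + [255] * 7      # low-passed edge
noisy   = [0, 64, 0, 128, 0, 0, 0, 0] + [255] * 4 + [191] + [255] * 3

# Same error magnitudes {64, 128, 64}, so identical MSE and PSNR,
# even though one distortion is smooth blur and the other is noise.
print(psnr(ref, blurred))
print(psnr(ref, noisy))
```

This is exactly why a metric like PSNR-HVS weights DCT coefficients by frequency: it can charge low-frequency blur and high-frequency noise different perceptual prices where plain PSNR cannot.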
_______________________________________________
video-codec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/video-codec
