Please test with an image that is closer in size to the one you intend to compress in your application. With the 227x149 test image, libjpeg-turbo's performance is going to depend too heavily on overhead to be a good comparison. You want a much larger image so that you can really test the maximum throughput.
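To put rough numbers on that, here is a small arithmetic sketch in Python that uses only figures quoted in this thread. The helper name `megapixels_per_sec` is made up for illustration. The application timings include file output, and the Ubuntu build links against the distribution's libjpeg-dev, so treat the last two figures as ballpark only, not as a like-for-like comparison with tjbench.

```python
# Back-of-the-envelope throughput arithmetic from the figures in this thread.
# megapixels_per_sec is a hypothetical helper, not part of libjpeg-turbo.

def megapixels_per_sec(width, height, seconds_per_frame):
    """Pixels processed per second, in millions."""
    return width * height / seconds_per_frame / 1e6

SMALL = (227, 149)      # tjbench test image
LARGE = (14849, 7444)   # image the application actually writes

# Compression frame rates reported by tjbench for the small image.
tjbench_fps = {"Ubuntu (WSL)": 7421.929573, "Windows 10 x64": 2274.411861}

# Wall-clock times reported for writing the large JPEG from the application.
app_seconds = {"Ubuntu (WSL)": 0.72, "Windows 10 x64": 1.47}

for platform, fps in tjbench_fps.items():
    per_frame_us = 1e6 / fps  # time spent on one tiny frame, in microseconds
    rate = megapixels_per_sec(*SMALL, 1.0 / fps)
    print(f"{platform}: {per_frame_us:.1f} us per frame, {rate:.2f} Mpixels/sec")

print(f"Large image: {LARGE[0] * LARGE[1] / 1e6:.1f} Mpixels")
for platform, seconds in app_seconds.items():
    rate = megapixels_per_sec(*LARGE, seconds)
    print(f"{platform}: {rate:.2f} Mpixels/sec including I/O")
```

Each frame of the small image takes well under a millisecond, so any fixed per-frame cost makes up a large share of the measurement. The large image is about 3,000 times bigger, which is why it gives a better picture of sustained throughput.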

On 2/16/21 2:54 PM, David Horman wrote:
>
> Thanks for the suggestion. Here are my results (apt helpfully
> suggested that I install libjpeg-turbo-test):
>
> *_Ubuntu (Windows Subsystem for Linux):_*
>
> >>>>> RGB (Top-down) <--> JPEG 4:2:0 Q80 <<<<<
>
> Image size: 227 x 149
> Compress --> Frame rate: 7421.929573 fps
> Output image size: 6068 bytes
> Compression ratio: 16.721984:1
> Throughput: 251.031924 Megapixels/sec
> Output bit stream: 360.290149 Megabits/sec
> Decompress --> Frame rate: 9198.674991 fps
> Throughput: 311.126784 Megapixels/sec
>
> *_Windows 10 (same computer), x64:_*
>
> >>>>> RGB (Top-down) <--> JPEG 4:2:0 Q80 <<<<<
>
> Image size: 227 x 149
> Compress --> Frame rate: 2274.411861 fps
> Output image size: 6068 bytes
> Compression ratio: 16.721984:1
> Throughput: 76.927432 Megapixels/sec
> Output bit stream: 110.409049 Megabits/sec
> Decompress --> Frame rate: 3659.631437 fps
> Throughput: 123.779714 Megapixels/sec
>
> As you can see, still quite a big difference! x86 tjbench.exe was even
> slower at 1660fps.
>
> I probably should have mentioned before, I'm using version 2.0.4.
>
> As for my code, it prepares and writes 64 rows at a time using
> jpeg_write_scanlines. All the image data is already in RAM, I just
> prepare it in strips because that's what libtiff expects you to do (it
> outputs TIFF, PNG, or JPEG using the same code, varying only in the
> call to the appropriate library once each strip is complete, and as
> noted before, PNG speed is the same on both Ubuntu and Windows). I also
> tried 1, 16, 512, and the full 7444 rows at a time, but it didn't make
> any difference.
>
> David
>
> On 16/02/2021 20:33, DRC wrote:
>> The quickest way to know whether libjpeg-turbo is at fault for the
>> performance difference is to run tjbench with the same input image and
>> settings on both machines.
>> For instance:
>>
>> /opt/libjpeg-turbo/bin/tjbench image.ppm 80 -rgb -subsamp 420 -nowrite
>> or
>> c:\libjpeg-turbo64\bin\tjbench image.ppm 80 -rgb -subsamp 420 -nowrite
>>
>> will test the raw compute performance of compressing the contents of
>> image.ppm from an RGB pixel buffer into a JPEG image with quality 80 and
>> 4:2:0 subsampling.
>>
>> That will also give you an idea of the performance ceiling, excluding
>> I/O time. I suspect that the difference you're observing is due to I/O
>> time, which is out of libjpeg-turbo's control (Windows I/O is just
>> slower than Linux I/O.) However, here are some possible areas for
>> optimization:
>>
>> -- If you can spare the memory, the most efficient way to compress a
>> JPEG image is to load the entire source image into memory and use the
>> in-memory destination manager. (That's what tjbench does.) However,
>> it's understandable if this is an untenable proposition for a
>> 110-megapixel image.
>>
>> -- If you have to use buffered I/O, then try increasing the size of your
>> buffer.
>>
>> -- Check for any costly and unnecessary Extended-RGB-to-RGB color
>> conversion algorithms that could be replaced with the use of the
>> libjpeg-turbo colorspace extensions. I've seen older code, which was
>> written for libjpeg, perform really inefficient per-pixel RGBA-to-RGB or
>> BGRA-to-RGB conversion, and these algorithms are so slow that they
>> effectively hide any speedup from libjpeg-turbo.
>>
>> I'm happy to review your JPEG compression kernel if you'll post a
>> snippet of code.
>>
>> On 2/16/21 1:11 PM, David H wrote:
>>> Hi all,
>>>
>>> I'm writing some software which uses libjpeg-turbo to write its output
>>> file. I managed to build the turbojpeg-static project with Visual Studio
>>> C++ to create the turbojpeg-static.lib file and linked it to my program,
>>> also built with Visual Studio C++. So far so good.
>>>
>>> In testing, writing a 14849 x 7444 JPEG takes 1.47 seconds.
>>>
>>> However, when I compile the same program in my WSL Ubuntu environment
>>> running on the same laptop, linking to libjpeg (apt-get install
>>> libjpeg-dev), writing the JPEG only takes 0.72 seconds. The other parts
>>> of the program all vary slightly in speed, as you'd expect with
>>> different compilers, but none show such a huge disparity as JPEG output.
>>> PNG output is the same speed from both builds.
>>>
>>> This seems like a pretty big difference to me, but I'm not sure where to
>>> start figuring it out. I'm pretty sure I have all the good optimisations
>>> turned on in the VC project, and I've tried it with /fp:fast, but it
>>> doesn't seem to make a difference.
>>>
>>> Are there any known speed issues with libjpeg-turbo on Windows that
>>> would explain this, or can anyone suggest some things for me to check?

--
You received this message because you are subscribed to the Google Groups "libjpeg-turbo User Discussion/Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to libjpeg-turbo-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/libjpeg-turbo-users/e779b384-6df4-627e-ea61-d7eb099269fa%40virtualgl.org.