Re: Speed of horizontal flip
On Wednesday, 1 April 2015 at 13:52:06 UTC, tchaloupka wrote:
> C#:
> PNG load - 90ms
> PNG flip - 10ms
> PNG save - 380ms
>
> D using dlib (http://code.dlang.org/packages/dlib):
> PNG load - 500ms
> PNG flip - 30ms
> PNG save - 950ms
>
> D using imageformats (http://code.dlang.org/packages/imageformats):
> PNG load - 230ms
> PNG flip - 30ms
> PNG save - 1100ms

My implementation of flip takes 0ms ;)

http://blog.thecybershadow.net/2014/03/21/functional-image-processing-in-d/
Re: Speed of horizontal flip
On 3/04/2015 4:27 a.m., John Colvin wrote:
> On Thursday, 2 April 2015 at 11:49:44 UTC, Rikki Cattermole wrote:
>> I've got it down to ~ 12ms
Re: Speed of horizontal flip
On Wednesday, 1 April 2015 at 14:00:52 UTC, bearophile wrote:
> If you have to perform performance benchmarks then use ldc or gdc.
>
> Also disable bound tests with your compilation switches.
>
> Add the usual pure/nothrow/@nogc/@safe annotations where you can (they don't increase speed much, usually).
>
> if you are using classes don't forget to make the method final.
>
> Profile the code and look for the performance bottlenecks.

This very text should be placed somewhere prominent on the D homepage if we don't want to constantly disappoint people who come with the impression that D should be at the same speed level as C/C++ but whose test programs aren't.
Re: Speed of horizontal flip
On Thursday, 2 April 2015 at 11:49:44 UTC, Rikki Cattermole wrote:
> I've got it down to ~ 12ms using dmd now.
Re: Speed of horizontal flip
On 3/04/2015 12:29 a.m., John Colvin wrote:
> Have you considered just being able to grab an object with changed iteration order instead of actually doing the flip? The same goes for transposes and 90º rotations.

I've got it down to ~ 12ms using dmd now. But if the image was much bigger (let's say a height of
Re: Speed of horizontal flip
On Thursday, 2 April 2015 at 09:55:15 UTC, Rikki Cattermole wrote:
> My bad, forgot I decreased test image resolution to 256x256. I'm totally out of the running. I have some serious work to do by the looks.

Have you considered just being able to grab an object with changed iteration order instead of actually doing the flip? The same goes for transposes and 90º rotations.

Sure, sometimes you do need to actually rearrange the memory, and in a subset of those cases you need it to be done fast, but a lot of the time you're better off* just using a different iteration scheme (which, for ranges, should probably be part of the type to avoid checking the scheme every iteration).

*for speed and memory reasons. Need to keep the original and the transpose? No need for any duplicates.

Note that this is what numpy does with transposes. The .T and .transpose methods of ndarray don't actually modify the data, they just set the memory order**, whereas the transpose function actually moves memory around.

**using a runtime flag, which is ok for them because internal iteration lets you only branch once on it.
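A minimal sketch of the "changed iteration order" idea in D, assuming a row-major pixel array whose length is a multiple of the row width. The name hFlipView is hypothetical and is not taken from any library mentioned in this thread; the sketch only combines chunks, map and retro from Phobos to read each row backwards without moving any pixels.

```d
import std.algorithm : equal, map;
import std.range : chunks, retro;

/// Returns a lazy view of a row-major image in which every row is read
/// right to left. No pixel is moved or copied.
/// `hFlipView` is a hypothetical name used only for this sketch.
auto hFlipView(T)(T[] data, size_t w)
{
    // chunks splits the flat array into rows of w elements;
    // retro walks each row from its last element to its first.
    return data.chunks(w).map!(row => row.retro);
}

unittest
{
    int[] img = [1, 2, 3,
                 4, 5, 6];
    auto flipped = hFlipView(img, 3);
    assert(flipped.equal!equal([[3, 2, 1], [6, 5, 4]]));
    assert(img == [1, 2, 3, 4, 5, 6]); // the original data is untouched
}
```

Because the reversal is part of the view's type, consumers pay no per-element branch to find out which order to use. The trade-off is that code which needs the flipped pixels laid out in memory (for example a PNG encoder that takes a raw buffer) still has to copy them out.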
Re: Speed of horizontal flip
On Thursday, 2 April 2015 at 05:21:08 UTC, thedeemon wrote:
> std.algorithm.reverse uses ranges, and shamefully DMD is really bad at optimizing away range-induced costs.

The specialisation of reverse selected for slices does not use the range interface, it's all just indexing. The only overheads come from:

a) function calls, if the inliner isn't doing its job (which it really should be in these cases).

b) a check for aliasing in swapAt, which is only done for ranges of static arrays. Again, should be optimised away in this case, but it's possible DMD doesn't manage it. Either way, it's a trivially predictable branch and should be effectively free at the CPU level.

Once you've got past those, it's just the straight loop I posted before.
Re: Speed of horizontal flip
On 2/04/2015 10:47 p.m., Rikki Cattermole wrote:
> Assuming I've done it correctly, Devisualization.Image takes around 8ms in debug mode to flip horizontally using dmd. But 3ms for release.

My bad, forgot I decreased test image resolution to 256x256. I'm totally out of the running. I have some serious work to do by the looks.
Re: Speed of horizontal flip
On 2/04/2015 2:52 a.m., tchaloupka wrote:
> Hi,
> I have a bunch of square r16 and png images which I need to flip horizontally.
>
> Am I doing something utterly wrong?

Assuming I've done it correctly, Devisualization.Image takes around 8ms in debug mode to flip horizontally using dmd. But 3ms for release.

    module test;

    void main() {
        import devisualization.image;
        import devisualization.image.mutable;
        import devisualization.util.core.linegraph;
        import std.stdio;

        writeln("===\nREAD\n===");
        Image img = imageFromFile("test/large.png");
        img = new MutableImage(img);

        import std.datetime : StopWatch;

        StopWatch sw;
        sw.start();
        foreach(i; 0 .. 1000) {
            img.flipHorizontal;
        }
        sw.stop();

        writeln("Img flipped in: ", sw.peek().msecs / 1000, "[ms]");
    }

I was planning on doing this earlier. But I discovered a PR I pulled which fixed for 2.067 broke chunk types reading.
Re: Speed of horizontal flip
std.algorithm.reverse uses ranges, and shamefully DMD is really bad at optimizing away range-induced costs.
Re: Speed of horizontal flip
On Wednesday, 1 April 2015 at 16:08:14 UTC, John Colvin wrote:
> I'm pretty sure that the flipping happens in GDI+ as well. You might be writing C#, but the code you're calling that's doing all the work is C and/or C++, quite possibly carefully optimised over many years by Microsoft.

Yes, that's right: load, flip and save are all performed by GDI+, so it's just a P/Invoke to optimised code from C#.
Re: Speed of horizontal flip
On Wednesday, 1 April 2015 at 14:00:52 UTC, bearophile wrote:
> tchaloupka:
>
>> Am I doing something utterly wrong?
>
> If you have to perform performance benchmarks then use ldc or gdc.

I tried it on my slower linux box (i5-2500K vs i7-2600K) without change with these results:

C# (mono with its own GDI+ library):
Img loaded in 108[ms]
Img flipped in 22[ms]
Img saved in 492[ms]

dmd-2.067:
Png loaded in: 150[ms]
Img flipped in: 28[ms]
Png saved in: 765[ms]

gdc-4.8.3:
Png loaded in: 121[ms]
Img flipped in: 4[ms]
Png saved in: 686[ms]

ldc2-0_15:
Png loaded in: 106[ms]
Img flipped in: 4[ms]
Png saved in: 610[ms]

I'm ok with that, thx.
Re: Speed of horizontal flip
On Wednesday, 1 April 2015 at 13:52:06 UTC, tchaloupka wrote:

I'm pretty sure that the flipping happens in GDI+ as well. You might be writing C#, but the code you're calling that's doing all the work is C and/or C++, quite possibly carefully optimised over many years by Microsoft.

Are you even sure that your C# code truly performs a flip? It could easily just set the iteration scheme and return (like numpy.ndarray.T does, if you're familiar with python).

dmd does not produce particularly fast code. ldc and gdc are much better at that.

Sadly, std.algorithm.reverse isn't perhaps as fast as it could be for arrays of static arrays, at least in theory. Try this, but I hope that with a properly optimised build from ldc/gdc it won't make any difference:

    void reverse(ubyte[3][] r)
    {
        immutable last = r.length-1;
        immutable steps = r.length/2;
        foreach(immutable i; 0 .. steps)
        {
            immutable tmp = r[i];
            r[i] = r[last - i];
            r[last - i] = tmp;
        }
    }

    unittest
    {
        ubyte[3] a = [1,2,3];
        ubyte[3] b = [7,6,5];

        auto c = [a,b];
        c.reverse();
        assert(c == [b,a]);

        ubyte[3] d = [9,4,6];

        auto e = [a,b,d];
        e.reverse();
        assert(e == [d,b,a]);

        auto f = e.dup;
        e.reverse;
        e.reverse;
        assert(f == e);
    }
Re: Speed of horizontal flip
tchaloupka:

> Am I doing something utterly wrong?

If you have to perform performance benchmarks then use ldc or gdc.

Also disable bound tests with your compilation switches.

Sometimes reverse() is not efficient, I think, it should be improved. Try to replace it with a little function written by you.

Add the usual pure/nothrow/@nogc/@safe annotations where you can (they don't increase speed much, usually).

And you refer to flip as "method", so if you are using classes don't forget to make the method final.

Profile the code and look for the performance bottlenecks.

You can even replace the *w multiplications with an increment of an index each loop, but this time saving is dwarfed by the reverse().

Bye,
bearophile
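The suggestions above (a hand-written swap loop instead of reverse(), explicit attributes, and stepping an index instead of multiplying by w) could be combined roughly as follows. This is only a sketch under assumptions: hFlipManual is a hypothetical name, the image is assumed to be stored row-major, and the explicit attributes assume T is plain data such as ubyte[3] whose copy cannot throw or allocate. Whether it is actually faster than reverse() depends on the compiler and has to be measured.

```d
/// Reverses every row of a row-major image in place.
/// `hFlipManual` is a hypothetical name used only for this sketch.
void hFlipManual(T)(T[] data, size_t w) pure nothrow @nogc @safe
{
    if (w == 0)
        return;

    // Step the row start by w instead of computing i * w on every iteration.
    for (size_t start = 0; start + w <= data.length; start += w)
    {
        auto row = data[start .. start + w];

        // Swap the i-th element from the left with the i-th from the right.
        foreach (i; 0 .. w / 2)
        {
            auto tmp = row[i];
            row[i] = row[w - 1 - i];
            row[w - 1 - i] = tmp;
        }
    }
}

unittest
{
    int[] img = [1, 2, 3,
                 4, 5, 6];
    hFlipManual(img, 3);
    assert(img == [3, 2, 1, 6, 5, 4]);
}
```

Unlike the original hFlip, the loop bound comes from data.length rather than from w, so the sketch does not rely on the image being square.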
Speed of horizontal flip
Hi,
I have a bunch of square r16 and png images which I need to flip horizontally.

My flip method looks like this:

    void hFlip(T)(T[] data, int w)
    {
        import std.datetime : StopWatch;

        StopWatch sw;
        sw.start();
        foreach(int i; 0..w)
        {
            auto row = data[i*w..(i+1)*w];
            row.reverse();
        }
        sw.stop();
        writeln("Img flipped in: ", sw.peek().msecs, "[ms]");
    }

With the simple r16 file format it's pretty fast, but with RGB PNG files (2048x2048) I noticed it's somewhat slow, so I tried to compare it with C# and was pretty surprised by the results.

C#:
PNG load - 90ms
PNG flip - 10ms
PNG save - 380ms

D using dlib (http://code.dlang.org/packages/dlib):
PNG load - 500ms
PNG flip - 30ms
PNG save - 950ms

D using imageformats (http://code.dlang.org/packages/imageformats):
PNG load - 230ms
PNG flip - 30ms
PNG save - 1100ms

I used dmd-2.067 with -release -inline -O.
C# was just a debug build with Visual Studio attached to the process for debugging, and even with that it is much faster.

I know that System.Drawing is using Windows GDI+, which can be used with D too, but not on linux.
If we ignore the PNG loading and saving (I didn't try libpng yet), even the flip method itself is 3 times slower. I don't know D well enough to be sure there isn't some more efficient way to do the flip. I like how the slices can be used here.

For a C# user who expects things to just work as fast as possible from a system level programming language, it can be somewhat disappointing to see that the pure D version is about 3 times slower.

Am I doing something utterly wrong?
Note that this example is not critical for me, it's just a simple hobby script I use to move and flip some images - I can wait. But I post it to see if this can be taken somewhat closer to what can be expected from a system level programming language.

dlib:

    auto im = loadPNG(name);
    hFlip(cast(ubyte[3][])im.data, cast(int)im.width);
    savePNG(im, newName);

imageformats:

    auto im = read_image(name);
    hFlip(cast(ubyte[3][])im.pixels, cast(int)im.w);
    write_image(newName, im.w, im.h, im.pixels);

C# code:

    static void Main(string[] args)
    {
        var files = Directory.GetFiles(args[0]);

        foreach (var f in files)
        {
            var sw = Stopwatch.StartNew();
            var img = Image.FromFile(f);

            Debug.WriteLine("Img loaded in {0}[ms]", (int)sw.Elapsed.TotalMilliseconds);
            sw.Restart();

            img.RotateFlip(RotateFlipType.RotateNoneFlipX);
            Debug.WriteLine("Img flipped in {0}[ms]", (int)sw.Elapsed.TotalMilliseconds);
            sw.Restart();

            img.Save(Path.Combine(args[0], "test_" + Path.GetFileName(f)));
            Debug.WriteLine("Img saved in {0}[ms]", (int)sw.Elapsed.TotalMilliseconds);
            sw.Stop();
        }
    }