Re: [Image-SIG] PIL vs Numpy in raster calculations
> Script 3 is probably as fast as you can get with numpy -- it's probably slower than the PIL version because it does need to copy memory one way or the other.

After all this talk and these tests, I definitely agree with you. This is the only logical explanation of why numpy is faster than PIL when dealing with small images, but slower when calculating statistics of large images.

I also received a message from a user of the numpy list. He suggested that PIL stores its images as pointers-to-rows (or cols), i.e. to access a new row you need to dereference a pointer. NumPy, on the other hand, always stores its memory in blocks. When N grows larger, the N pointer lookups needed in PIL don't matter, but they do for low N. I couldn't find anything that supports this; do you know anything about it? If it is true, it might provide an additional explanation for the phenomenon, although I believe its impact should be limited.

To conclude, I believe that on the basis of speed, the following are important:

- Statistics calculations for images larger than 200x200 pixels? Go with PIL.
- Repeated statistics calculations for images smaller than 200x200 pixels? Go with numpy.

___
Image-SIG maillist - Image-SIG@python.org
http://mail.python.org/mailman/listinfo/image-sig
Re: [Image-SIG] PIL vs Numpy in raster calculations
> IIUC, PIL and numpy don't share exactly the same data model, so you may have to make a memory copy to go from one to the other -- that may be the source of your performance decrease. If you really want to know, you could profile the code doing one step at a time (not the mean, for instance) to see where the time is going.

Following your advice I wrote three different scripts and profiled them.

    #Script 1 - indexing
    for i in range(10):
        imarr[:,:,0].mean()
        imarr[:,:,1].mean()
        imarr[:,:,2].mean()

    #Script 2 - slicing
    for i in range(10):
        imarr[:,:,0:1].mean()
        imarr[:,:,1:2].mean()
        imarr[:,:,2:3].mean()

    #Script 3 - reshape
    for i in range(10):
        imarr.reshape(-1,3).mean(axis=0)

For a 2000x2000 RGB image of ~11 MB the times were:

    script 1:  5.432 sec
    script 2: 10.234 sec
    script 3:  4.980 sec

I tried the same without the mean(), but for 1000 loops, and the results were:

    script 1: 0.463 sec (~6 MB of RAM)
    script 2: 0.465 sec (~3 MB of RAM)
    script 3: 0.462 sec (~2 MB of RAM)

Script 3, which you proposed, has the best performance, while script 2 is very slow. I can't draw a conclusion, but I'll use the third approach. I'll post my results back to the numpy list to see if they have an idea.
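For anyone who wants to repeat the comparison, a minimal sketch of the three scripts timed with `timeit` might look like the following. The random `imarr` is only a stand-in for the real image (an assumption; the original was loaded from PIL), the function names are made up for illustration, and the image is smaller than in the post so the run stays short. Absolute times will differ by machine and numpy version.

```python
import timeit

import numpy as np

# Synthetic stand-in for the RGB image (assumption: uint8, shape (H, W, 3)).
rng = np.random.default_rng(0)
imarr = rng.integers(0, 256, size=(500, 500, 3), dtype=np.uint8)


def script1_indexing(a):
    # Integer index on the last axis: three 2-d strided views.
    return [a[:, :, c].mean() for c in range(3)]


def script2_slicing(a):
    # Length-1 slice on the last axis: three 3-d strided views.
    return [a[:, :, c:c + 1].mean() for c in range(3)]


def script3_reshape(a):
    # Flatten width and height, keep the channel axis, one mean() call.
    return a.reshape(-1, 3).mean(axis=0)


results = {}
for fn in (script1_indexing, script2_slicing, script3_reshape):
    # 10 repetitions, as in the loops from the post.
    results[fn.__name__] = timeit.timeit(lambda: fn(imarr), number=10)

for name, seconds in results.items():
    print(f"{name}: {seconds:.4f} sec")
```

All three functions compute the same per-channel means, so any difference in the timings comes from how the data is traversed rather than from what is computed.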
Re: [Image-SIG] PIL vs Numpy in raster calculations
cp wrote:
> Following your advice I wrote three different scripts and profiled them.
>
>     #Script 1 - indexing
>     for i in range(10):
>         imarr[:,:,0].mean()
>         imarr[:,:,1].mean()
>         imarr[:,:,2].mean()
>
>     #Script 2 - slicing
>     for i in range(10):
>         imarr[:,:,0:1].mean()
>         imarr[:,:,1:2].mean()
>         imarr[:,:,2:3].mean()
>
>     #Script 3 - reshape
>     for i in range(10):
>         imarr.reshape(-1,3).mean(axis=0)
>
> For a 2000x2000 RGB image of ~11 MB the times were:
>
>     script 1:  5.432 sec
>     script 2: 10.234 sec
>     script 3:  4.980 sec
>
> I tried the same without the mean(), but for 1000 loops, and the results were:
>
>     script 1: 0.463 sec (~6 MB of RAM)
>     script 2: 0.465 sec (~3 MB of RAM)
>     script 3: 0.462 sec (~2 MB of RAM)
>
> Script 3, which you proposed, has the best performance, while script 2 is very slow.

I think this is because with slicing, the resulting array is discontiguous in memory -- this may slow down the mean() computation. The fact that all three run at the same speed when not calculating the mean supports that idea. It shows that you need to profile.

Script 3 is probably as fast as you can get with numpy -- it's probably slower than the PIL version because it does need to copy memory one way or the other.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
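One way to check the contiguity explanation is to look at the `flags` and `strides` of each view. This is a short sketch, assuming `imarr` is a C-contiguous array of shape (H, W, 3); the tiny zero-filled array below is a placeholder, not the image from the thread.

```python
import numpy as np

# Placeholder array (assumption): C-contiguous, shape (H, W, 3).
imarr = np.zeros((4, 5, 3), dtype=np.uint8)

views = {
    "script 1, imarr[:, :, 0]": imarr[:, :, 0],
    "script 2, imarr[:, :, 0:1]": imarr[:, :, 0:1],
    "script 3, imarr.reshape(-1, 3)": imarr.reshape(-1, 3),
}

contiguous = {}
for label, view in views.items():
    contiguous[label] = bool(view.flags["C_CONTIGUOUS"])
    print(
        label,
        "shape:", view.shape,
        "strides:", view.strides,
        "contiguous:", contiguous[label],
        "shares memory:", np.shares_memory(view, imarr),
    )
```

On a C-contiguous input, none of the three expressions copies data. The views from scripts 1 and 2 are both strided (they step over the other two channels), and only the reshaped view is contiguous. That fits script 3 being the fastest, but contiguity alone would not separate script 1 from script 2, so that gap would need further profiling.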
Re: [Image-SIG] PIL vs Numpy in raster calculations
cp wrote:
>> image_array.reshape(-1,3).mean(axis=0)
>>
>> what that does is reshape (without copying) the array into WxHx3 array
>
> I have just one question concerning importing the PIL image with asarray. Suppose my initial image is an RGB image 1600 pixels high and 1900 pixels wide.
>
>     img.size
>     (1900,1600)
>     arr=asarray(img)
>     arr.shape
>     (1600,1900,3)
>
> In numpy syntax I would expect the last one to be (3,1900,1600). Now the returned array seems to have 1600 layers and not one for each color channel, leading to the reshape function you propose. Any idea why that is?

It looks like you've followed up on the numpy list, but:

I misspoke a bit above: image_array.reshape(-1,3) means: make this a 2-d array, making the first dimension whatever it needs to be so that the second dimension is three. In this case, that flattens out the width and height, while keeping the colors separate, which I think is what you wanted -- the mean of each of red, green and blue.

From the reshape doc string:

    reshaped_array : ndarray
        This will be a new view object if possible; otherwise, it will
        be a copy.

As your array was created from a PIL image, it may need to make a copy to do this -- I'm not sure. You might try the slice method I proposed -- if it can prevent a data copy, the extra loop may be worth it.

IIUC, PIL and numpy don't share exactly the same data model, so you may have to make a memory copy to go from one to the other -- that may be the source of your performance decrease. If you really want to know, you could profile the code doing one step at a time (not the mean, for instance) to see where the time is going.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR             (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

chris.bar...@noaa.gov
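The shape question and the reading of `reshape(-1, 3)` can be illustrated without PIL. The sketch assumes `arr` has the shape that `asarray(img)` returned for the 1900x1600 image in the question; here it is replaced by a constant-filled placeholder, and the channel values 10, 20, 30 are made up for illustration.

```python
import numpy as np

# Placeholder for asarray(img) (assumption): rows (height) first, then
# columns (width), then the color channel as the last axis.
arr = np.zeros((1600, 1900, 3), dtype=np.uint8)
arr[:, :, 0] = 10  # "red" channel, illustrative value
arr[:, :, 1] = 20  # "green" channel
arr[:, :, 2] = 30  # "blue" channel

# -1 lets numpy work out the first dimension (height * width) so that the
# second dimension is 3: one row per pixel, one column per channel.
flat = arr.reshape(-1, 3)
print(flat.shape)                   # (3040000, 3)
print(np.shares_memory(flat, arr))  # True: a view for this contiguous input

means = flat.mean(axis=0)
print(means)                        # [10. 20. 30.]

# A channel-first layout, if preferred, is a transposed view of the same data.
chan_first = arr.transpose(2, 0, 1)
print(chan_first.shape)             # (3, 1600, 1900)
```

Whether the reshape is a view or a copy for an array that really came from a PIL image depends on how that array is laid out in memory; `np.shares_memory` is one way to find out in a given case.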