Re: [Image-SIG] PIL vs Numpy in raster calculations

2009-05-29 Thread cp
 Script 3 is probably as fast as you can get with numpy -- it's probably 
 slower than the PIL version because it does need to copy memory one way 
 or the other.

After all this talk and tests I definitely agree with you. This is the only
logical explanation of why numpy is faster than PIL when dealing with small
images, but slower when trying to calculate statistics of large images. 

I also received a message from a user of the numpy list. He suggested that PIL
stores its images as pointers-to-rows (or cols), i.e. to access a new row you
need to dereference a pointer. NumPy, on the other hand, always stores its memory
in blocks. When N grows larger, the N pointer lookups needed in PIL don't
matter, but they do for low N.

I couldn't find anything that supports this. Do you know anything about it?
If it is true, then it might provide an additional explanation for the
phenomenon, although I believe that its impact should be limited.

To conclude, I believe that on the basis of speed, the following are important:
Statistics calculations for images larger than 200x200 pixels? Go with PIL.
Repeated statistics calculations for images smaller than 200x200 pixels? Go with
numpy.


___
Image-SIG maillist  -  Image-SIG@python.org
http://mail.python.org/mailman/listinfo/image-sig


Re: [Image-SIG] PIL vs Numpy in raster calculations

2009-05-28 Thread cp
 IIUC, PIL and numpy don't share exactly the same data model, so you may 
 have to make a memory copy to go from one to the other -- that may be the 
 source of your performance decrease.
 
 If you really want to know, you could profile the code doing one step at 
 a time (not the mean, for instance) to see where the time is going.

Following your advice I wrote three different scripts and profiled them.

# Script 1 - indexing
for i in range(10):
    imarr[:,:,0].mean()
    imarr[:,:,1].mean()
    imarr[:,:,2].mean()

# Script 2 - slicing
for i in range(10):
    imarr[:,:,0:1].mean()
    imarr[:,:,1:2].mean()
    imarr[:,:,2:3].mean()

# Script 3 - reshape
for i in range(10):
    imarr.reshape(-1,3).mean(axis=0)

For a 2000x2000 RGB image of ~11 MB the times were:
script 1: 5.432 sec
script 2: 10.234 sec
script 3: 4.980 sec

I tried the same without the mean(), but for 1000 loops, and the results were:
script 1: 0.463 sec (~6 MB of RAM)
script 2: 0.465 sec (~3 MB of RAM)
script 3: 0.462 sec (~2 MB of RAM)

Script 3, which you proposed, has the best performance, while script 2 is very
slow. I can't draw a conclusion yet, but I'll use the third approach. I'll post
my results back to the numpy list to see if they have an idea.
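
For anyone who wants to repeat the comparison, here is a minimal sketch that
times the three approaches with timeit. The random 500x500 array, the function
names and the repeat count are assumptions made for illustration, not the
original test setup, and absolute times will differ from machine to machine.

```python
import timeit

import numpy as np

# Synthetic stand-in for an RGB image converted with asarray(img); assumed size.
rng = np.random.default_rng(0)
imarr = rng.integers(0, 256, size=(500, 500, 3), dtype=np.uint8)


def by_index(a):
    # Script 1: integer index per channel, each result is a 2-d view
    return [a[:, :, c].mean() for c in range(3)]


def by_slice(a):
    # Script 2: length-1 slice per channel, each result is a 3-d view
    return [a[:, :, c:c + 1].mean() for c in range(3)]


def by_reshape(a):
    # Script 3: flatten width and height, then average over all pixels
    return a.reshape(-1, 3).mean(axis=0)


timings = {}
for func in (by_index, by_slice, by_reshape):
    # 10 repetitions, as in the scripts above
    timings[func.__name__] = timeit.timeit(lambda: func(imarr), number=10)
    print(f"{func.__name__}: {timings[func.__name__]:.4f} sec for 10 runs")
```

All three functions should return the same per-channel means, so any
difference is only in how long they take.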



Re: [Image-SIG] PIL vs Numpy in raster calculations

2009-05-28 Thread Christopher Barker

cp wrote:

Following your advice I wrote three different scripts and profiled them.

# Script 1 - indexing
for i in range(10):
    imarr[:,:,0].mean()
    imarr[:,:,1].mean()
    imarr[:,:,2].mean()

# Script 2 - slicing
for i in range(10):
    imarr[:,:,0:1].mean()
    imarr[:,:,1:2].mean()
    imarr[:,:,2:3].mean()

# Script 3 - reshape
for i in range(10):
    imarr.reshape(-1,3).mean(axis=0)

For a 2000x2000 RGB image of ~11 MB the times were:
script 1: 5.432 sec
script 2: 10.234 sec
script 3: 4.980 sec

I tried the same without the mean(), but for 1000 loops, and the results were:
script 1: 0.463 sec (~6 MB of RAM)
script 2: 0.465 sec (~3 MB of RAM)
script 3: 0.462 sec (~2 MB of RAM)

Script 3, which you proposed, has the best performance, while script 2 is very
slow.


I think this is because with slicing, the resulting array is 
discontiguous in memory -- this may slow down the mean() computation. 
The fact that all three run at the same speed when not calculating the 
mean supports that concept.
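
One way to check the contiguity idea is to look at the strides and flags of
each view. This is only a sketch: the small zero-filled uint8 array is an
assumed stand-in for the image array, laid out as (height, width, 3).

```python
import numpy as np

# Assumed stand-in for the image array: C-contiguous, (height, width, 3)
imarr = np.zeros((4, 5, 3), dtype=np.uint8)

indexed = imarr[:, :, 0]      # script 1: integer index
sliced = imarr[:, :, 0:1]     # script 2: length-1 slice
flat = imarr.reshape(-1, 3)   # script 3: reshape

for name, view in (("indexed", indexed), ("sliced", sliced), ("flat", flat)):
    # strides are in bytes; a step of 3 bytes between neighbouring
    # elements means the other two channels are being skipped over
    print(name, view.shape, view.strides, view.flags["C_CONTIGUOUS"])
```

Note that both per-channel views are strided here, so contiguity on its own
may not explain why script 2 was slower than script 1.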


Shows you that you need to profile.

Script 3 is probably as fast as you can get with numpy -- it's probably 
slower than the PIL version because it does need to copy memory one way 
or the other.


-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Image-SIG] PIL vs Numpy in raster calculations

2009-05-27 Thread Christopher Barker

cp wrote:

image_array.reshape(-1,3).mean(axis=0)
what that does is reshape (without copying) the array into WxHx3 array


I have just one question concerning importing the PIL image with asarray.
Suppose my initial image is an RGB image 1600 pixels high and 1900 pixels wide.


>>> img.size
(1900, 1600)
>>> arr = asarray(img)
>>> arr.shape
(1600, 1900, 3)

In numpy syntax I would expect the last one to be (3,1900,1600). Now the
returned array seems to have 1600 layers and not one for each color channel,
leading to the reshape function you propose. Any idea why that is?
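
For reference, the (1600, 1900, 3) shape reads as (height, width, channels):
rows first, then columns, with the color values of each pixel last. The sketch
below uses a zero-filled array as an assumed stand-in for asarray(img) and
shows one way to get a channels-first layout if that is preferred.

```python
import numpy as np

# Assumed stand-in for asarray(img) of a 1900x1600 (width x height) RGB image
arr = np.zeros((1600, 1900, 3), dtype=np.uint8)

# Move the channel axis to the front: (height, width, 3) -> (3, height, width)
channels_first = arr.transpose(2, 0, 1)
print(channels_first.shape)

# One 2-d layer per color channel
red = channels_first[0]
print(red.shape)
```

transpose only rearranges the axes of the same data, so the channels-first
array is a view and the pixel values are not copied.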


it looks like you've followed up on the numpy list, but:

I misspoke a bit above:

image_array.reshape(-1,3)

means: make this a 2-d array, making the first dimension whatever it 
needs to be so that the second dimension is three. In this case, that 
flattens out the width and height, while keeping the color separate, 
which I think is what you wanted -- the mean of each of red, green and blue.
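
A tiny worked example may make this concrete. It assumes a made-up 2x2 image
in which every pixel is (10, 20, 30), so the expected per-channel means are
easy to check by eye.

```python
import numpy as np

# Made-up 2x2 "image": every pixel is (R, G, B) = (10, 20, 30)
image_array = np.empty((2, 2, 3), dtype=np.uint8)
image_array[:] = (10, 20, 30)

# -1 lets numpy work out the first dimension: 2 * 2 = 4 pixels
pixels = image_array.reshape(-1, 3)

# Average down the pixel axis, leaving one value per color channel
channel_means = pixels.mean(axis=0)

print(pixels.shape)
print(channel_means)
```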


from the reshape doc string:

reshaped_array : ndarray
    This will be a new view object if possible; otherwise, it will
    be a copy.


as your array was created from a PIL image, it may need to make a copy 
to do this -- I'm not sure.
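
Whether a given reshape returned a view or a copy can be checked with
np.shares_memory. The sketch below uses plain numpy arrays as assumed
stand-ins; it does not say which case applies to an array that came from a
PIL image.

```python
import numpy as np

# A C-contiguous array: reshape can reuse the same memory
contiguous = np.zeros((4, 5, 3), dtype=np.uint8)
view = contiguous.reshape(-1, 3)
print(np.shares_memory(contiguous, view))

# Taking every other column leaves gaps between rows, so the rows and
# columns can no longer be merged into one axis without copying
every_other = contiguous[:, ::2, :]
copied = every_other.reshape(-1, 3)
print(np.shares_memory(every_other, copied))
```

Running the same check on the array returned by asarray(img) should settle
the question for a particular image.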


You might try the slice method I proposed -- if it can prevent a data 
copy, the extra loop may be worth it.


IIUC, PIL and numpy don't share exactly the same data model, so you may 
have to make a memory copy to go from one to the other -- that may be the 
source of your performance decrease.


If you really want to know, you could profile the code doing one step at 
a time (not the mean, for instance) to see where the time is going.
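
A rough sketch of timing one step at a time follows. The zero-filled byte
buffer is an assumed stand-in for decoded image data, and the explicit copy
only imitates a conversion that has to copy memory; the real PIL-to-numpy
step may behave differently.

```python
import time

import numpy as np

# Raw bytes standing in for decoded image data (assumed 500x500 RGB)
height, width = 500, 500
raw = bytes(height * width * 3)

steps = {}

# Step 1: build the array, forcing a memory copy
start = time.perf_counter()
imarr = np.frombuffer(raw, dtype=np.uint8).reshape(height, width, 3).copy()
steps["convert (with copy)"] = time.perf_counter() - start

# Step 2: reshape only, no statistics yet
start = time.perf_counter()
pixels = imarr.reshape(-1, 3)
steps["reshape"] = time.perf_counter() - start

# Step 3: the mean itself
start = time.perf_counter()
means = pixels.mean(axis=0)
steps["mean"] = time.perf_counter() - start

for name, seconds in steps.items():
    print(f"{name}: {seconds:.6f} sec")
```

Single measurements like these are noisy, so repeating each step a number of
times would give steadier figures.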


-Chris



