Hello,

Consider these two sample columns of data:

 999999.9999  999999.9999
 999999.9999  999999.9999
 999999.9999  999999.9999
 999999.9999    1693.9069
 999999.9999    1676.1059
 999999.9999    1621.5875
    651.8040    1542.1373
    691.0138    1650.4214
    678.5558    1710.7311
    621.5777  999999.9999
    644.8341  999999.9999
    696.2080  999999.9999

Put this data into a file, say "sample.data", and load it with:

a,b = np.loadtxt('sample.data', dtype="float").T

I[16]: a
O[16]:
array([  1.00000000e+06,   1.00000000e+06,   1.00000000e+06,
         1.00000000e+06,   1.00000000e+06,   1.00000000e+06,
         6.51804000e+02,   6.91013800e+02,   6.78555800e+02,
         6.21577700e+02,   6.44834100e+02,   6.96208000e+02])

I[17]: b
O[17]:
array([ 999999.9999,  999999.9999,  999999.9999,    1693.9069,
          1676.1059,    1621.5875,    1542.1373,    1650.4214,
          1710.7311,  999999.9999,  999999.9999,  999999.9999])

Interestingly, the second column is displayed as written in the file, but
the values of a are shown in scientific notation. Why could this be
happening? Any idea? Anyway, back to masked arrays:
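
On the display difference, a possible explanation (a sketch, not a confirmed
diagnosis): both columns hold ordinary float64 values and only the printed
form differs. As far as I can tell, NumPy's array printing switches to
scientific notation when the ratio of the largest to the smallest absolute
value in an array exceeds 1000. That ratio is about 1609 for a and about 648
for b. The snippet below reads the same rows from an in-memory string
(io.StringIO stands in for "sample.data", purely for illustration) and checks
this:

```python
import io
import numpy as np

# Same rows as sample.data; StringIO is an illustrative stand-in for the file.
text = """\
 999999.9999 999999.9999
 999999.9999 999999.9999
 999999.9999 999999.9999
 999999.9999   1693.9069
 999999.9999   1676.1059
 999999.9999   1621.5875
    651.8040   1542.1373
    691.0138   1650.4214
    678.5558   1710.7311
    621.5777 999999.9999
    644.8341 999999.9999
    696.2080 999999.9999
"""
a, b = np.loadtxt(io.StringIO(text), dtype=float).T

# The stored values are what the file says; nothing was altered on load.
print(a[0] == 999999.9999, b[0] == 999999.9999)

# Ratio of largest to smallest value in each column.
ratio_a = a.max() / a.min()
ratio_b = b.max() / b.min()
print(ratio_a, ratio_b)

# suppress=True asks NumPy to avoid scientific notation for ranges like this.
with np.printoptions(suppress=True):
    fixed_repr = repr(a)
print(fixed_repr)
```

If that is right, the difference is cosmetic and does not affect the masking
below. Back to the masked arrays: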

I[24]: am = ma.masked_values(a, value=999999.9999)

I[25]: am
O[25]:
masked_array(data = [-- -- -- -- -- -- 651.804 691.0138 678.5558 621.5777
644.8341 696.208],
             mask = [ True  True  True  True  True  True False False False
False False False],
       fill_value = 999999.9999)


I[30]: bm = ma.masked_values(b, value=999999.9999)

I[31]: am
O[31]:
masked_array(data = [-- -- -- -- -- -- 651.804 691.0138 678.5558 621.5777
644.8341 696.208],
             mask = [ True  True  True  True  True  True False False False
False False False],
       fill_value = 999999.9999)


So far so good. A few basic checks:

I[33]: am/bm
O[33]:
masked_array(data = [-- -- -- -- -- -- 0.422662755126 0.418689311712
0.39664667346 -- -- --],
             mask = [ True  True  True  True  True  True False False False
True  True  True],
       fill_value = 999999.9999)


I[34]: mean(am/bm)
O[34]: 0.41266624676580849

Unfortunately, matplotlib.mlab's prctile cannot handle this division:

I[54]: prctile(am/bm, p=[5,25,50,75,95])
O[54]:
array([  3.96646673e-01,   6.21577700e+02,   1.00000000e+06,
         1.00000000e+06,   1.00000000e+06])


This also produces wrong-looking box-and-whisker plots.
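
The prctile output above contains values such as 621.5777 and 1e+06, which
suggests (this is my guess, not something I have verified in the source) that
it works on the underlying data buffer and ignores the mask. A minimal
workaround sketch, assuming the percentiles should be taken over the unmasked
entries only, is to strip the masked values with compressed() and use
np.percentile in place of prctile:

```python
import numpy as np
import numpy.ma as ma

# The two columns from sample.data, typed in directly for a self-contained example.
a = np.array([999999.9999] * 6 +
             [651.8040, 691.0138, 678.5558, 621.5777, 644.8341, 696.2080])
b = np.array([999999.9999] * 3 +
             [1693.9069, 1676.1059, 1621.5875,
              1542.1373, 1650.4214, 1710.7311] +
             [999999.9999] * 3)

am = ma.masked_values(a, value=999999.9999)
bm = ma.masked_values(b, value=999999.9999)
ratio = am / bm

# compressed() returns a plain 1-D ndarray holding only the unmasked entries.
valid = ratio.compressed()

# Percentiles over the valid entries only (np.percentile used instead of prctile).
pcts = np.percentile(valid, [5, 25, 50, 75, 95])
print(valid)
print(pcts)
```

The same compressed array can be handed to boxplot so that the fill values do
not end up in the plot. Note that np.percentile interpolates linearly by
default, so its numbers need not match other percentile implementations
exactly.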


Further testing with scipy.stats functions yields the results I would expect:

I[55]: stats.scoreatpercentile(am/bm, per=5)
O[55]: 0.40877012449846228

I[49]: stats.scoreatpercentile(am/bm, per=25)
O[49]:
masked_array(data = --,
             mask = True,
       fill_value = 1e+20)

I[56]: stats.scoreatpercentile(am/bm, per=95)
O[56]:
masked_array(data = --,
             mask = True,
       fill_value = 1e+20)


Can anyone confirm this?







-- 
Gökhan
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
