Hello,
Consider these two sample columns of data:
999999.9999 999999.9999
999999.9999 999999.9999
999999.9999 999999.9999
999999.9999 1693.9069
999999.9999 1676.1059
999999.9999 1621.5875
651.8040 1542.1373
691.0138 1650.4214
678.5558 1710.7311
621.5777 999999.9999
644.8341 999999.9999
696.2080 999999.9999
Put this data into a file, say "sample.data", and load it with:
a,b = np.loadtxt('sample.data', dtype="float").T
I[16]: a
O[16]:
array([ 1.00000000e+06, 1.00000000e+06, 1.00000000e+06,
1.00000000e+06, 1.00000000e+06, 1.00000000e+06,
6.51804000e+02, 6.91013800e+02, 6.78555800e+02,
6.21577700e+02, 6.44834100e+02, 6.96208000e+02])
I[17]: b
O[17]:
array([ 999999.9999, 999999.9999, 999999.9999, 1693.9069,
1676.1059, 1621.5875, 1542.1373, 1650.4214,
1710.7311, 999999.9999, 999999.9999, 999999.9999])
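The two arrays above are printed in different notations, which may be a display effect only. As far as I know, NumPy's array printing switches to scientific notation when the ratio between the largest and smallest absolute values in an array exceeds 1000 (roughly 1609 for the first column, roughly 648 for the second). A minimal sketch to check this, using shortened stand-ins for the two columns typed in by hand rather than loaded from sample.data:

```python
import numpy as np

# Shortened stand-ins for the two columns (typed in by hand, not read from the file).
a = np.array([999999.9999, 651.8040, 621.5777])
b = np.array([999999.9999, 1693.9069, 1542.1373])

# The stored sentinel is the same in both arrays; only the printed form may differ.
same_sentinel = bool(a[0] == b[0] == 999999.9999)
print(same_sentinel)

# Ratio of largest to smallest value, which (as assumed above) drives the notation.
ratio_a = a.max() / a.min()
ratio_b = b.max() / b.min()
print(ratio_a, ratio_b)

# suppress=True asks NumPy to avoid scientific notation where it can.
with np.printoptions(suppress=True):
    print(a)
```

If the sentinel comparison holds, the stored values were not altered by loadtxt and the difference is purely in how the arrays are shown.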
Interestingly, the second column is displayed as it was loaded, but the values
of a are shown in a slightly different form. Why could this be happening? Any
ideas? Anyway, back to masked arrays:
I[24]: am = ma.masked_values(a, value=999999.9999)
I[25]: am
O[25]:
masked_array(data = [-- -- -- -- -- -- 651.804 691.0138 678.5558 621.5777
644.8341 696.208],
mask = [ True True True True True True False False False
False False False],
fill_value = 999999.9999)
I[30]: bm = ma.masked_values(b, value=999999.9999)
I[31]: am
O[31]:
masked_array(data = [-- -- -- -- -- -- 651.804 691.0138 678.5558 621.5777
644.8341 696.208],
mask = [ True True True True True True False False False
False False False],
fill_value = 999999.9999)
So far so good. A few basic checks:
I[33]: am/bm
O[33]:
masked_array(data = [-- -- -- -- -- -- 0.422662755126 0.418689311712
0.39664667346 -- -- --],
mask = [ True True True True True True False False False
True True True],
fill_value = 999999.9999)
I[34]: mean(am/bm)
O[34]: 0.41266624676580849
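For anyone who wants to reproduce the checks above without the data file, here is a self-contained sketch. The arrays are typed in from the sample columns; the names ratio, n_valid and ratio_mean are only illustrative.

```python
import numpy as np
import numpy.ma as ma

SENTINEL = 999999.9999

# The two sample columns, entered by hand instead of being read from sample.data.
a = np.array([SENTINEL] * 6
             + [651.8040, 691.0138, 678.5558, 621.5777, 644.8341, 696.2080])
b = np.array([SENTINEL] * 3
             + [1693.9069, 1676.1059, 1621.5875, 1542.1373, 1650.4214, 1710.7311]
             + [SENTINEL] * 3)

# masked_values compares approximately, so the float sentinel is matched reliably.
am = ma.masked_values(a, value=SENTINEL)
bm = ma.masked_values(b, value=SENTINEL)

# An element of the ratio is masked wherever either operand is masked.
ratio = am / bm
n_valid = ratio.count()     # number of unmasked elements
ratio_mean = ratio.mean()   # mean over the unmasked elements only
print(n_valid, ratio_mean)
```

Only the three rows where both columns hold real data survive the division, and the mean is taken over those three values.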
Unfortunately, matplotlib.mlab's prctile cannot handle this division:
I[54]: prctile(am/bm, p=[5,25,50,75,95])
O[54]:
array([ 3.96646673e-01, 6.21577700e+02, 1.00000000e+06,
1.00000000e+06, 1.00000000e+06])
This also results in wrong-looking box-and-whisker plots.
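The prctile output above contains values such as 6.21577700e+02 and 1.00000000e+06, which look like entries of a rather than of the ratio, so the mask was probably ignored. A possible workaround, assuming the masked entries should simply be dropped, is to pass only the unmasked values via compressed(). The sketch below uses np.percentile as a stand-in for prctile; its interpolation rule may differ, so the numbers need not match either prctile or scoreatpercentile exactly.

```python
import numpy as np
import numpy.ma as ma

SENTINEL = 999999.9999

# Same sample columns as above, entered by hand.
a = np.array([SENTINEL] * 6
             + [651.8040, 691.0138, 678.5558, 621.5777, 644.8341, 696.2080])
b = np.array([SENTINEL] * 3
             + [1693.9069, 1676.1059, 1621.5875, 1542.1373, 1650.4214, 1710.7311]
             + [SENTINEL] * 3)

ratio = ma.masked_values(a, value=SENTINEL) / ma.masked_values(b, value=SENTINEL)

# compressed() returns a plain 1-D ndarray holding only the unmasked values,
# so functions that are not mask-aware never see the sentinel data.
valid = ratio.compressed()

pcts = np.percentile(valid, [5, 25, 50, 75, 95])
print(pcts)
```

The same compressed array could presumably be handed to boxplot as well, since it no longer carries any masked entries.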
Testing further with scipy.stats functions yields the expected, correct results:
I[55]: stats.scoreatpercentile(am/bm, per=5)
O[55]: 0.40877012449846228
I[49]: stats.scoreatpercentile(am/bm, per=25)
O[49]:
masked_array(data = --,
mask = True,
fill_value = 1e+20)
I[56]: stats.scoreatpercentile(am/bm, per=95)
O[56]:
masked_array(data = --,
mask = True,
fill_value = 1e+20)
Any confirmation?
--
Gökhan
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users