On Thu, Sep 3, 2009 at 12:58 PM, Tim Michelsen <timmichel...@gmx-topmail.de> wrote:
>> My first stop is usually wikipedia:
> [...]
> Thanks.
> So if I had known that I have to call the beast an
> "empirical inverse survival function", Robert would
> also have found it easier to help.
> Anyway, step by step...
>
>> In the case of the weight of pigs, it would be the cumulative weight of
>> all pigs with a weight less than the given bin boundary weight.
>> If values were income, then it would be the aggregated income of all
>> individuals with an income below the bin boundary.
>> So it makes sense, given this is what you want (below).
> Exactly!
>
> Or for precipitation:
> a) count: number of precipitation events that
>    occurred up to a certain limit
> b) sum: precipitation total registered up to that limit
>
>> there might be a mistake in the treatment of a cell when
>> reversing, when I run your example the highest value is
>> not equal to values.sum()
> This has made me think again. Small point.
>
> See here:
> ecdf_sums = np.hstack([0.0, sums[0].cumsum() ])
> ecdf_sums = np.hstack([sums[0].cumsum() ])
>
> I had to adjust the classes in the spreadsheet by
> replacing the first class limit by 0.0.
> I had modified this yesterday to a different value
> (0.265152) as I was testing the code.
>
> from:
> 0.265152, 0.487273, 0.709394, 0.931515,
> 1.153636, 1.375758, 1.597879, 1.820000,
> 2.042121, 2.264242, 2.486364
>
> to:
> 0.0, 0.487273, 0.709394, 0.931515,
> 1.153636, 1.375758, 1.597879, 1.820000,
> 2.042121, 2.264242, 2.486364
>
> Now everything is fine. Results and curves match.
>
>> But I'm not sure yet, what's going on.
> 1) first I didn't know how to develop the code for an
>    "empirical inverse survival function" in numpy
> 2) I screwed my spreadsheet classes up while
>    testing and verifying my numpy code.
>
> Again, would a function for the
> "empirical inverse survival function" qualify for
> inclusion into numpy or scipy?
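
For reference, the count and sum variants described in the quoted message can be sketched roughly as below. This is only an illustration under assumptions: the function name, the sample values and the bin edges are made up and are not taken from Tim's script or spreadsheet. It also shows why a first class limit above the smallest value makes the last cumulative sum differ from values.sum().

```python
import numpy as np


def cumulative_counts_and_sums(values, edges):
    """Cumulative count and cumulative sum of `values` at each bin edge.

    Hypothetical helper for illustration only. Values that fall outside
    the range of `edges` are ignored by np.histogram.
    """
    values = np.asarray(values, dtype=float)
    # a) number of events per bin
    counts, _ = np.histogram(values, bins=edges)
    # b) sum of the values per bin (each value weighted by itself)
    sums, _ = np.histogram(values, bins=edges, weights=values)
    # Prepend 0 so that there is one cumulative entry per bin edge.
    cum_counts = np.hstack([0, counts.cumsum()])
    cum_sums = np.hstack([0.0, sums.cumsum()])
    return cum_counts, cum_sums


# Made-up sample data (assumption, not the data from the thread).
values = np.array([0.1, 0.3, 0.6, 0.9, 1.4, 2.0])

# First class limit at 0.0: every value falls into some bin.
edges_full = np.array([0.0, 0.5, 1.0, 1.5, 2.5])
counts_full, sums_full = cumulative_counts_and_sums(values, edges_full)

# First class limit above the smallest value: 0.1 is silently dropped.
edges_cut = np.array([0.25, 0.5, 1.0, 1.5, 2.5])
counts_cut, sums_cut = cumulative_counts_and_sums(values, edges_cut)

# Reversed ("survival"-style) version: total registered above each edge.
surv_sums = sums_full[-1] - sums_full

print(np.isclose(sums_full[-1], values.sum()))  # edges cover all data
print(np.isclose(sums_cut[-1], values.sum()))   # one value was dropped
```

Dividing the cumulative arrays by their last entry would give the normalized version, if an interpretation as a distribution is wanted.
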

Sorry, I'm too distracted, correcting myself a second time:
"this should *not* have inverse in it, using inverse was a cut and
paste error"
It's the empirical survival function.

If it's just a one-liner with cumsum, then I don't think it's necessary
to have a function for it. But following also the previous discussion,
it would be useful to have the combination of histogram and empirical
cdf, sf, and/or pdf to define an empirical distribution. For an
interpretation in terms of a distribution, normed=True would be
necessary, but it could also be an option.

One question about your application: in the plot you draw lines and not
histograms. Is there a reason to use histograms in the calculation
instead of the full ecdf (i.e. cumsum on the original values instead of
cumsum on the histogrammed values)?

Josef

>
> Thanks for the help.
>
> Best regards,
> Timmie
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>