Re: [R] Percentiles/Quantiles with Weighting
On Tue, 17 Feb 2009, Brigid Mooney wrote: Thanks for pointing me to the quantreg package as a resource. I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata - (1:10)^2/2 pctile - seq(.59, .99, .1) quantile(mydata, pctile) 59%69%79%89%99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients: tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)18 24.532 40.550 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles? If you try lapply(1:9, function(i)quantile(mydata, pctile,type=i)) the answers from type=1 or 2 agree with rq(). -thomas Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Percentiles/Quantiles with Weighting
Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
I do know that Harrell's Quantile function in the Hmisc package will allow quantile estimates from models. Whether it is general enough to extend to time series, I have no experience and cannot say. -- David Winsemius On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
Thanks for pointing me to the quantreg package as a resource. I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata - (1:10)^2/2 pctile - seq(.59, .99, .1) quantile(mydata, pctile) 59%69%79%89%99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients: tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)18 24.532 40.550 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles? If so, do you know where I would be able to see the algorithms used in these functions? I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be. On Tue, Feb 17, 2009 at 12:29 PM, roger koenker rkoen...@uiuc.edu wrote: http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869 gives one possibility... url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 1:58 PM, Brigid Mooney wrote: Thanks for pointing me to the quantreg package as a resource. I was hoping to ask be able to address one quick follow-up question... I get slightly different variants between using the rq funciton with formula = mydata ~ 1 as I would if I ran the same data using the quantile function. Example: mydata - (1:10)^2/2 pctile - seq(.59, .99, .1) quantile(mydata, pctile) 59%69%79%89%99% 20.015 26.075 32.935 40.595 49.145 rq(mydata~1, tau=pctile) Call: rq(formula = mydata ~ 1, tau = pctile) Coefficients: tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99 (Intercept)18 24.532 40.550 Degrees of freedom: 10 total; 9 residual Is it correct to assume this is due to the different accepted methods of calculating quantiles? If so, do you know where I would be able to see the algorithms used in these functions? I'm not finding it in the documentation for function rq, and am new enough to R that I don't know where those references would generally be. Yes, quantile() in base R documents 9 varieties of quantiles, 2 more than William Empson's famous 7 Types of Ambiguity. In quantreg the function rq() finds a solution to an underlying optimization problem and doesn't ask any further into the nature of the ambiguity -- it does often produce a warning indicating that there may be more than one solution. The default base R quantile is interpolated, while the default rq() with method = br using the simplex algorithm finds an order statistic, typically. If you prefer something more like interpolation, you can try rq() with method = fn which is using an interior point algorithm and when there are multiple solutions it tends to produce something more like the centroid of the solution set. I hope that this helps. On Tue, Feb 17, 2009 at 12:29 PM, roger koenker rkoen...@uiuc.edu wrote: http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869 gives one possibility... url:www.econ.uiuc.edu/~rogerRoger Koenker emailrkoen...@uiuc.eduDepartment of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Percentiles/Quantiles with Weighting
Here is one kind of weighted quantile function. The basic idea is very simple: wquantile - function( v, w, p ) { v - v[order(v)] w - w[order(v)] v [ which.max( cumsum(w) / sum(w) = p ) ] } With some more error-checking and general clean-up, it looks like this: # Simple weighted quantile # # v A numeric vector of observations # w A numeric vector of positive weights # p The probability 0=p=1 # # Nothing fancy: no interpolation etc. # Basic idea wquantile - function( v, w, p ) { v - v[order(v)] w - w[order(v)] v [ which.max( cumsum(w) / sum(w) = p ) ] } # Simple weighted quantile # # v A numeric vector of observations # w A numeric vector of positive weights # p The probability 0=p=1 # # Nothing fancy: no interpolation etc. wquantile - function(v,w=rep(1,length(v)),p=.5) { if (!is.numeric(v) || !is.numeric(w) || length(v) != length(w)) stop(Values and weights must be equal-length numeric vectors) if ( !is.numeric(p) || any( p0 | p1 ) ) stop(Quantiles must be 0=p=1) ranking - order(v) sumw - cumsum(w[ranking]) if ( is.na(w[1]) || w[1]0 ) stop(Weights must be non-negative numbers) plist - sumw/sumw[length(sumw)] sapply(p, function(p) v [ ranking [ which.max( plist = p ) ] ]) } I would appreciate any comments people have on this -- whether correctness, efficiency, style, -s On Tue, Feb 17, 2009 at 11:57 AM, Brigid Mooney bkmoo...@gmail.com wrote: Hi All, I am looking at applications of percentiles to time sequenced data. I had just been using the quantile function to get percentiles over various periods, but am more interested in if there is an accepted (and/or R-implemented) method to apply weighting to the data so as to weigh recent data more heavily. I wrote the following function, but it seems quite inefficient, and not really very flexible in its applications - so if anyone has any suggestions on how to look at quantiles/percentiles within R while also using a weighting schema, I would be very interested. Note - this function supposes the data in X is time-sequenced, with the most recent (and thus heaviest weighted) data at the end of the vector WtPercentile - function(X=rnorm(100), pctile=seq(.1,1,.1)) { Xprime - NA for(i in 1:length(X)) { Xprime - c(Xprime, rep(X[i], times=i)) } print(Percentiles:) print(quantile(X, pctile)) print(Weighted:) print(Xprime) print(Weighted Percentiles:) print(quantile(Xprime, pctile, na.rm=TRUE)) } WtPercentile(1:10) WtPercentile(rnorm(10)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.