Re: [R] Fitted Value Pareto Distribution
livia wrote: > I would like to fit a Pareto Distribution and I am using the following codes. > > I thought the fitted (fit1) should be the fitted value for the data, is it > correct? As the result of the "fitted" turns out to be a single value for > all. > > fit=vglm(ycf1 ~ 1, pareto1(location=alpha), trace=TRUE, crit="c") > fitted(fit) > > The result is > fitted(fit) > [,1] > [1,] 0.07752694 > [2,] 0.07752694 > [3,] 0.07752694 > [4,] 0.07752694 > [5,] 0.07752694 > [6,] 0.07752694 > [7,] 0.07752694 > [8,] 0.07752694 > [9,] 0.07752694 > [10,] 0.07752694 > [11,] 0.07752694 > [12,] 0.07752694 > [13,] 0.07752694 > > Could anybody give me some advice? > Your model only includes an intercept, so the fitted value is supposed to be the same for all units (there is nothing in your model that allows the fitted value to vary across units). markus -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] generalized moment method
Sebastian Kruk wrote: > Dear everyone: > > I have to finish my thesis to graduate as Bs. in Economics. > > I choose to estimate a New Keynesian Phillips Curve (NKPC) for Uruguay > using Generalized Moment Method (GMM). > > I do not know programming or R but I would like to use it. > > Should I use gee, geepack or gam? Dear Sebastián -- neither geepack nor gam provide GMM estimators. GMM -- or at least minimum distance estimation techniques -- rely on fitting by linear or more often non-linear least squares functions of smaller parameter vectors to the empirical moments of your problem. R is a suitable tool for this, but there is AFAIK know general GMM package. The details of our model would need to be known before any further advice can be given. Regards, Markus > > Thanks in advance, > > Sebastián. > > *** > > ¡Hola todos! > > Para terminiar mi licenciatura en Economía debo hacer un trabajo de > investigación monográfico. > > Elegi como tema la estimación de la curva de Phillips de los Nuevos > Keynesianos (CPNK). > > No se programar ni conosco el lenguaje R pero me gustaria usarlo para > estimar la CPNK usando el método generalizado de los momentos (MGM). > > ¿Debería usar el paquete gee, geepack o gam? > > Gracias a todos. > > Sebastián. > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with gls
On Sat, 2007-01-06 at 01:21 -0500, Zhenqiang Lu wrote: > Hello R-users, > > I am using gls function in R to fit a model with certain correlation > structure. > > The medol as: > fit.a<-gls(y~1,data=test.data,correlation=corAR1(form=~1|aa),method="ML") > mu<-summary(fit.a)$coefficient > > With the toy data I made to test, the estimate of mu is exactly equal to > the overall mean of y which can not be true. > On the contrary, this has to be true. Due to a well-known result (a recent is is Greene, 2003, sec 14.2.2, pp 343-344) when the covariates are the same in every equation, a "seemingly unrelated regression" of this sort has GLS and OLS produce identical results. The OLS estimate is exactly the arithmetic average. @Book{greene2003, Author = {William H Greene}, Title = {Econometric Analysis}, Publisher = {Prentice-Hall International Ltd}, Address= {London}, Edition= {Fifth}, year = 2003, } > But, if I make a toy data with y more than two replicates (for each level > of aa we have more than 2 abservations, for example N=4), the estimates of > mu will not be as the same as the overall mean of y. This is what is strange. You should probably provide example code of this too. > > Would you please tell me why this happens? > The following is my testing code. The code had some typos in it (and you did not load nlme). I believe this would be correct: require(mvtnorm) library(nlme) rho=-0.5 N=2 sigma.a=2 Rho.a<-matrix(c(1,rho^(1:(N-1)))[outer(X=1:N,Y=1:N,function(x,y) 1+abs(x-y))],ncol=N) Sigma.a<-sigma.a^2 * Rho.a/(1-rho^2) y<-rep(0,N*20) for (ii in 1:20) {y[((ii-1)*N+1):(ii*N)]<-rmvnorm(1, mean=rep(0,N), sigma=Sigma.a) } test.data<-data.frame(y=y,aa=rep(1:20,each=N,len=N*20)) fit.a<-gls(y~1,data=test.data,correlation=corAR1(form=~1| aa),method="ML") mu.a<-summary(fit.a)$coefficient rho.a<-getVarCov(fit.a)[1,2]/getVarCov(fit.a)[1,1] print(c(mean(y),mu.a)) Regards, Markus > Thanks. > James > > > > require(mvtnorm) > rho=-0.5 > N=2 > sigma.a=2 > > Rho.a<-matrix(c(1,rho^(1:(N-1)))[outer(X=1:N,Y=1:N,function(x,y) > 1+abs(x-y))],ncol=N) > Sigma.a<-sigma.a^2 * Rho/(1-rho.a^2) > y<-rep(0,N*20) > for (ii in 1:20) > {y[((ii-1)*N+1):(ii*N)]<-rmvnorm(1, mean=rep(0,N), sigma=Sigma.a) > } > test.data<-data.frame(y=y,aa=rep(1:20,each=N,len=N*20)) > fit.a<-gls(y~1,data=test.data,correlation=corAR1(form=~1|aa),method="ML") > mu.a<-summary(fit.a)$coefficient > rho.a<-getVarCov(fit.a)[1,2]/getVarCov(fit.a)[1,1] > print(c(mean(y),mu.a)) > > __ > Zhenqiang (James) Lu > > Department of Statistics > Purdue University > West Lafayette, IN 47906 > TEL: (765) 494-0027 > FAX: (765) 494-0558 > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using weighted.mean with tapply()
roger bos wrote: > I am trying to calculate the weighted mean for a of 10 deciles and I > get an error: > >>decile <- tapply(X=mat$trt1m, INDEX=mat$Rank, FUN=weighted.mean, w=mat$mcap) > > Error in FUN(X[[1]], ...) : 'x' and 'w' must have the same length > > All three of my inputs have the same length, as shown below, and the > weighted.mean calculation works by itself, just not in tapply() > Hi -- I asked pretty much this same question some years ago: http://www.r-project.org/nocvs/mail/r-help/1999/2160.html The answer turns out to be that you should pass the index to tapply rather than the data. In your case this would, I think, translate to decile <- tapply(seq(along=mat$trlm, mat$Rank, function(i, x=mat$trlm[i], w=mat$mcap[i]) weighted.mean(x[i], w[i])) hope this helps. regards, Markus > >>length(mat$Rank) > > [1] 1853 > >>length(mat$mcap) > > [1] 1853 > >>length(mat$trt1m) > > [1] 1853 > >>mean(mat$trt1m) > > [1] -0.04475397 > weighted.mean(mat$trt1m, w=mat$mcap) > [1] -0.04819243 > >>mat$mcap[is.na(mat$mcap)] <- min(mat$mcap, na.rm=TRUE) > > > I am probably making a simple error in how I pass the optional > parameter w. Any help would be greatly appreciated. > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] poly() in lm() leads to wrong coefficients (but correct residuals)
On Wed, 2005-06-29 at 18:19 +0200, Andreas Neumann wrote: > Dear all, > > I am using poly() in lm() in the following form. > > 1> DelsDPWOS.lm3 <- lm(DelsPDWOS[,1] ~ poly(DelsPDWOS[,4],3)) > > 2> DelsDPWOS.I.lm3 <- lm(DelsPDWOS[,1] ~ poly(I(DelsPDWOS[,4]),3)) > > 3> DelsDPWOS.2.lm3 <- >lm(DelsPDWOS[,1]~DelsPDWOS[,4]+I(DelsPDWOS[,4]^2)+I(DelsPDWOS[,4]^3)) > > 1 and 2 lead to identical but wrong results. 3 is correct. Surprisingly > (to me) the residuals are the same (correct) in all cases although the > coefficients of 1 (and 2) are wrong and do not in any way match the > stated regression polynomial. (summaries below) > > QUESTION: > Is there a correct way to use poly() in lm() since version 3 becomes quite > tedious if used more often especially with higher order polynomials? > The coefficients using 1 and 2 are not incorrect. poly() defines orthogonal polynomials, whereas your DelsPDWOS[,4]+I(DelsPDWOS[,4]^2)+I(DelsPDWOS[,4]^3 contruct defines an ordinary polynomial. You should be able to recover version 3 coefficients from 1 and 2. See help(poly) > x <- runif(10) > x [1] 0.1878 0.2415 0.5834 0.6556 0.4112 0.3399 0.8144 0.1134 0.7360 0.0463 > model.matrix(~ poly(x, 2)) (Intercept) poly(x, 2)1 poly(x, 2)2 11-0.27648 -0.0452 21-0.21052 -0.1899 31 0.20937 -0.2708 41 0.29799 -0.1021 51-0.00212 -0.4117 61-0.08970 -0.3621 71 0.49297 0.4968 81-0.36790 0.2148 91 0.39672 0.1620 10 1-0.45033 0.5082 attr(,"assign") [1] 0 1 1 > model.matrix(~ x + I(x^2)) (Intercept) x I(x^2) 11 0.1878 0.03528 21 0.2415 0.05834 31 0.5834 0.34040 41 0.6556 0.42982 51 0.4112 0.16911 61 0.3399 0.11554 71 0.8144 0.66320 81 0.1134 0.01286 91 0.7360 0.54169 10 1 0.0463 0.00214 attr(,"assign") [1] 0 1 2 > Regards, -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Problem loading library Design
Emmanuelle Comets wrote: I have used library Design (Frank Harrell) in the past but when I tried this week, I get : > library(Design) Error in eval(expr, envir, enclos) : Object "formula.default" not found I updated to the latest version of the library (2.0) with the same result. My version of R may not be the latest but it's recent (and I probably changed since using Design for the last time) : > version _ platform i686-pc-linux-gnu arch i686 os linux-gnu system i686, linux-gnu status major1 minor9.1 year 2004 month06 day 21 language R I just tried this in > version _ platform i386-pc-linux-gnu arch i386 os linux-gnu system i386, linux-gnu status major2 minor0.1 year 2004 month11 day 15 language R where it works. Your R is pretty old and there were many changes in moving to 2.0 (including the namespaces, which may be the reason you are getting that particular error). Upgrading R would appear to be the solution. Regards, markus Could anyone please point me in the right direction to solve my problem? Thank you in advance, Emmanuelle __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Quantiles of data in a contingency table
Matt Mohebbi wrote: Hello, I have data of the following form: data <- data.frame(type=c("c","d","e"), size=c(10,20,30), count=c(20,10,5)) data type size count 1c 1020 2d 2010 3e 30 5 I would like to compute the quantiles of size given the counts. For instance, in this example, the median size would be 10. Is there an easy way of doing this? One at least is to the function wtd.median [and wtd.quantile] in package Hmisc by Frank Harrell. install.packages("Hmisc") library(Hmisc) wtd.median(data$size, data$weights) is likely a route to get you what you want. regards, markus Is there a good way to deal with data in this format in general? Much of R seems to center around having an entry for each item. This question (http://www.r-project.org/nocvs/mail/r-help/2000/0102.html) seems to be related but no one provided an answer. Thanks, Matt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] using poly in a linear regression in the presence of NA fails (despite subsetting them out)
I ran into a to me surprising result on running lm with an orthogonal polynomial among the predictors. The lm command resulted in Error in qr(X) : NA/NaN/Inf in foreign function call (arg 1) Error during wrapup: despite my using a "subset" in the call to get rid of NA's. poly is apparently evaluated before any NA's are subsetted out of the data. Example code (attached to this e-mail as as a script): > ## generate some data > n <- 50 > x <- runif(n) > a0 <- 10 > a1 <- .5 > sigma.e <- 1 > y <- a0 + a1*x + rnorm(n)*sigma.e > tmp.d <- data.frame(y, x) > rm(list=c("n", "x", "a0", "a1", "sigma.e", "y")) > > print(lm.1 <- lm(y ~ poly(x, 2), data = tmp.d) + + ## now make a few NA's + + tmp.d$x[1:2] <- rep(NA, 2) Error: syntax error Error during wrapup: > > ## this fails, just as it should > print(lm.1 <- lm(y ~ poly(x, 2), data = tmp.d)) Call: lm(formula = y ~ poly(x, 2), data = tmp.d) Coefficients: (Intercept) poly(x, 2)1 poly(x, 2)2 10.380 -0.242 -1.441 > > ## these also fail, but should not? > > print(lm.2 <- lm(y ~ poly(x, 2), data = tmp.d, subset = !is.na(x))) Call: lm(formula = y ~ poly(x, 2), data = tmp.d, subset = !is.na(x)) Coefficients: (Intercept) poly(x, 2)1 poly(x, 2)2 10.380 -0.242 -1.441 > print(lm.3 <- lm(y ~ poly(x, 2), data = tmp.d, na.action = na.omit)) Call: lm(formula = y ~ poly(x, 2), data = tmp.d, na.action = na.omit) Coefficients: (Intercept) poly(x, 2)1 poly(x, 2)2 10.380 -0.242 -1.441 > > ## but this works > > print(lm.3 <- lm(y ~ poly(x, 2), data = subset(tmp.d, subset = !is.na(x Call: lm(formula = y ~ poly(x, 2), data = subset(tmp.d, subset = !is.na(x))) Coefficients: (Intercept) poly(x, 2)1 poly(x, 2)2 10.380 -0.242 -1.441 The documentation of lm is *not* misleading at this point, saying that subset an optional vector specifying a subset of observations to be used in the fitting process. which implies that data are subsetted once lm.fit is called. All the same, this behavior is a little unexpected to me. Is it to be considered a feature, that is, does it produce beneficial side effects which explain why it works as it does? Regards, Markus I am running R on a Debian testing system with kernel 2.6.10 and > version _ platform i386-pc-linux-gnu arch i386 os linux-gnu system i386, linux-gnu status major2 minor0.1 year 2004 month11 day 15 language R -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti ## generate some data n <- 50 x <- runif(n) a0 <- 10 a1 <- .5 sigma.e <- 1 y <- a0 + a1*x + rnorm(n)*sigma.e tmp.d <- data.frame(y, x) rm(list=c("n", "x", "a0", "a1", "sigma.e", "y")) print(lm.1 <- lm(y ~ poly(x, 2), data = tmp.d) ## now make a few NA's tmp.d$x[1:2] <- rep(NA, 2) ## this fails, just as it should print(lm.1 <- lm(y ~ poly(x, 2), data = tmp.d)) ## these also fail, but should not? print(lm.2 <- lm(y ~ poly(x, 2), data = tmp.d, subset = !is.na(x))) print(lm.3 <- lm(y ~ poly(x, 2), data = tmp.d, na.action = na.omit)) ## but this does not print(lm.3 <- lm(y ~ poly(x, 2), data = subset(tmp.d, subset = !is.na(x __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Std Err on Concentration measures
Angelo Secchi wrote: > Hi, > I'm using the ineq package to calculate some concentration measures (Gini, Herfindal, ...) and I was wondering if there's around also a function to calculate standard error on these measures. If not, is anybody aware of where I can find a reference on this point? > Thanks. > There are several suggestions for estimating the variance of the Gini coefficient:a literature search should reveal some suggestions. If the sampling design is complex, you may be best off using a resampling method, such as bootstrap. Here are a few references: @Article{nayakandgastwirth1989, Author = {Tapan K Nayak and Joseph L Gastwirth}, Title = {The use of diversity analysis to assess the relative influence of factors affecting the income distribution}, Journal= {Journal of Business \& Economic Statistics}, Volume = {7}, Pages = {453--460}, subject= {Asymptotic distribution Diversity measure Gini index U statistic Utility function}, year = 1989, month = oct, } @Article{sandstromwretmanandwalden1988a, Author = {Arne Sandström and Jan Wretman and Bertil Walden}, Title = {Variance estimators of the Gini coefficient - probability sampling}, Journal= {Journal of Business \& Economic Statistics}, Volume = {6}, Pages = {113--119}, subject= {Simulations}, year = 1988, month = jan, } @Article{davidsonandduclos1997, Author = {Russell Davidson and Jean-Yves Duclos}, Title = {Statistical inference for the measurement of the incidence of taxes and transfers}, Journal= {Econometrica}, Volume = {65}, Number = {6}, Pages = {1453--1465}, month = {November}, year = 1997, } cheers, markus -- Markus Jantti Abo Akademi University [EMAIL PROTECTED] http://www.iki.fi/~mjantti ### This message has been scanned by F-Secure Anti-Virus for Mic...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] weighted mean
On Wed, 2003-11-26 at 01:28, [EMAIL PROTECTED] wrote: > How do I go about generating a WEIGHTED mean (and standard error) of a > variable (e.g., expenditures) for each level of a categorical variable > (e.g., geographic region)? I'm looking for something comparable to PROC > MEANS in SAS with both a class and weight statement. > I asked this question a few years ago and this is the anwer I got works: see http://www.r-project.org/nocvs/mail/r-help/1999/2160.html tapply(seq(along=wage), list(Educc,Year), function(i, x=wage, w=weight) weighted.mean(x[i], w[i])) (wage and weight are the variable of interest and weight, while Educc and Year are the factors) Regards, Markus > > > Thanks. > > > > Marc > > > > > > > > > > > [[alternative HTML version deleted]] > > __ > [EMAIL PROTECTED] mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help -- Markus Jäntti <[EMAIL PROTECTED]> Abo Akademi University __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
[R] rbind.data.frame: character comverted to factor
Dear All, on rbind:ing together a number of data.frames, I found that character variables are converted into factors. Since this occurred for a data identifier, it was a little inconvenient and, to me, unexpected. (The help page explains the general procedure used. I also found that on forming a data frame, character variables are converted to factors. The help page on read.table has the 'as.is' argument, which I suppose kind of suggests that character variables tend to get converted into factors. Is there such a "preference" for factors and should this behaviour be expected? Example code d1 <- data.frame(id =letters[1:20], x = runif(20)) d2 <- data.frame(id =paste(letters[1:20],letters[1:20], sep = ""), x = rexp(20)) d3 <- rbind(d1, d2) str(d1) # <- id is factor str(d2) # <- id is factor str(d3) # <- id is factor d1[["id"]] <- as.character(d1[["id"]]) d2[["id"]] <- as.character(d2[["id"]]) d3 <- rbind(d1, d2) str(d1) # <- id is character str(d2) # <- id is character str(d3) # <- id is factor Regards, Markus -- Markus Jäntti <[EMAIL PROTECTED]> Statistics Finland __ [EMAIL PROTECTED] mailing list http://www.stat.math.ethz.ch/mailman/listinfo/r-help