Re: [R] glm with variance = mu+theta*mu^2?

2005-06-11 Thread Kjetil Brinchmann Halvorsen
Spencer Graves wrote:

>   How might you fit a generalized linear model (glm) with variance 
> = mu+theta*mu^2 (where mu = mean of the exponential family random 
> variable and theta is a parameter to be estimated)?
>
>   This appears in Table 2.7 of Fahrmeir and Tutz (2001) 
> Multivariate Statistical Modeling Based on Generalized Linear Models, 
> 2nd ed. (Springer, p. 60), where they compare "log-linear model fits 
> to cellular differentiation data based on quasi-likelihoods" between 
> variance = phi*mu (quasi-Poisson), variance = phi*mu^2 
> (quasi-exponential), and variance = mu+theta*mu^2.  The "quasi" 
> function accepted for the family argument in "glm" generates functions 
> "variance", "validmu", and "dev.resids".  I can probably write 
> functions  to mimic the "quasi" function.  However, I have two 
> questions in regard to this:
>
>   (1) I don't know what to use for "dev.resids".  This may not 
> matter for fitting.  I can try a couple of different things to see if 
> it matters.
>
>   (2) Might someone else suggest something different, e.g., using 
> something like optim to solve an appropriate quasi-score function?
>
>   Thanks,
>   spencer graves
>
Since nobody has answered this, I will try. The variance function
mu+theta*mu^2 is the variance function of the negative binomial family.
If this variance function is used to construct a quasi-likelihood, the
resulting quasi-likelihood is identical to the negative binomial
likelihood, so for fitting we can simply use glm.nb from MASS, which
will give the correct estimated values. However, in a quasi-likelihood
setting the (co)variance estimation from glm.nb is not appropriate, and
from the book (Fahrmeir and Tutz) it seems that the estimation method
used is a sandwich estimator, so we can try the sandwich package. This
works, but the numerical results are somewhat different from the book.
Any comments on this?
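
(As an aside: once such a fit is in hand, robust standard errors can be
read off the diagonal of the sandwich covariance matrix. A minimal sketch,
assuming the cells.negbin fit from the session below and the sandwich
package:

  library(sandwich)
  sqrt(diag(vcovHC(cells.negbin)))   # robust (sandwich) standard errors

These are the quantities to compare with the quasi-likelihood standard
errors reported in the book.)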

my code follows:

 > library(Fahrmeir)
 > library(help=Fahrmeir)
 > library(MASS)
 >  cells.negbin <- glm(y~TNF+IFN+TNF:IFN, data=cells,
 family=negative.binomial(1/0.215))
 > summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min   1Q   Median   3Q  Max 
-1.6714  -0.8301  -0.2153   0.4802   1.4282 

Coefficients:
   Estimate  Std. Error t value Pr(>|t|)   
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF  0.01616136  0.00360569   4.482  0.00075 ***
IFN  0.00935690  0.00359010   2.606  0.02296 * 
TNF:IFN -0.5910  0.7002  -0.844  0.41515   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156  on 15  degrees of freedom
Residual deviance: 12.661  on 12  degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 > confint(cells.negbin)
Waiting for profiling to be done...
2.5 %   97.5 %
(Intercept)  3.0383197319 3.7890206510
TNF  0.0091335087 0.0238915483
IFN  0.0023292566 0.0170195707
TNF:IFN -0.0001996824 0.960427
 > library(sandwich)
Loading required package: zoo
 > vcovHC( cells.negbin )
            (Intercept)            TNF              IFN            TNF:IFN
(Intercept)  0.01176249372 -0.0001279740135 -0.0001488223001  0.0212541999
TNF         -0.00012797401  0.039017282      0.021242875     -0.0019793137
IFN         -0.00014882230  0.021242875      0.054314079     -0.0013277626
TNF:IFN      0.0212542     -0.001979314     -0.001327763      0.0002370104
 > cov2cor(vcovHC( cells.negbin ))
            (Intercept)        TNF        IFN    TNF:IFN
(Intercept)   1.000 -0.5973702 -0.5887923  0.1272950
TNF  -0.5973702  1.000  0.4614542 -0.6508822
IFN  -0.5887923  0.4614542  1.000 -0.3700671
TNF:IFN   0.1272950 -0.6508822 -0.3700671  1.000
 > cells.negbin2 <- glm.nb( y~TNF+IFN+TNF:IFN, data=cells)
 > summary(cells.negbin)

Call:
glm(formula = y ~ TNF + IFN + TNF:IFN, family = negative.binomial(1/0.215),
data = cells)

Deviance Residuals:
Min   1Q   Median   3Q  Max 
-1.6714  -0.8301  -0.2153   0.4802   1.4282 

Coefficients:
   Estimate  Std. Error t value Pr(>|t|)   
(Intercept)  3.39874495  0.18791125  18.087  4.5e-10 ***
TNF  0.01616136  0.00360569   4.482  0.00075 ***
IFN  0.00935690  0.00359010   2.606  0.02296 * 
TNF:IFN -0.5910  0.7002  -0.844  0.41515   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(4.6512) family taken to be 
1.012271)

Null deviance: 46.156  on 15  degrees of freedom
Residual deviance: 12.661  on 12  degrees of freedom
AIC: 155.49

Number of Fisher Scoring iterations: 5

 > confint( cells.negbin2 )
Waiting for profiling to be done...
   

Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Liaw, Andy
> From: Peter Dalgaard
[snip]
> While your setup is in place, you might want to play around with the
> higher optimization levels. GCC on AMD64 sees a quite substantial
> speedup from -O2 to -O3.

On our SLES8 amd64 boxes, I had trouble with g77 -O3 (build failed).  Have
not tried with newer GCC.

Andy
 
> -- 
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph: 
> (+45) 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: 
> (+45) 35327907
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error with function lda in package MASS (dimnames not equal?)

2005-06-11 Thread Joshua Gilbert
I fully understand that this is a volunteer project, I'm a Debian user
(not a developer... yet).

I have read the posting guide, but I forgot the protocol. First
offence, won't happen again.

Thanks.
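
(For anyone hitting the same thing: a quick way to inspect the per-class
means that Prof Ripley points to below -- a sketch only, assuming 'data'
is a numeric matrix and 'classes' is the grouping vector from the original
post:

  apply(data, 2, function(col) tapply(col, classes, mean))

If the resulting rows, one per class, are identical to machine accuracy,
lda() has nothing to separate.)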

On 6/11/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> The code will be changed to give a more informative error message in a
> future release.  However, do remember that this is volunteer code, and it
> is not reasonable to expect volunteers to anticipate that a user will
> apply it in extremely unlikely circumstances.
> 
> If you have a suggestion about code in a contributed package, please send
> it to the package maintainer (as the posting guide asks).
> 
> On Sat, 11 Jun 2005, Joshua Gilbert wrote:
> 
> > This is true, they are equal. I hadn't noticed that. Thank you.
> >
> > Now, if lda fails on this given input (equal means), shouldn't we
> > catch it and give a slightly better error message? I've spent a good
> > while going through the debugging process with lda.default. From that
> > perspective it appears that there is a simple change to remove the
> > problem. I am not saying that it is correct in any shape or form, but
> > there is a point where a single transpose would silence the error.
> >
> > So, from a usability standpoint, could we add a check for equal means
> > between classes and throw an error for that specific condition? Yes,
> > the user should not do that. But users may become more interested in
> > making the code run than checking on whether it's doing anything sane.
> >
> > If this isn't the place to do so, tell me. But, I'd like to petition
> > to alter the code of lda.default.
> >
> > On 6/10/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> >> lda.default <- MASS:::lda.default and proceed.
> >>
> >> Look at the group means of your data: they are identical to machine
> >> accuracy.
> >>
> >> The question has to be `why are you trying to use lda to separate
> >> two groups with identical means'?  Lda is not protected against that
> >> and it is rather unlikely unless you failed to inspect your data in any
> >> way.
> >>
> >> On Fri, 10 Jun 2005, Joshua Gilbert wrote:
> >>
> >>> This question appears to have been asked previously, but not answered.
> >>> The last response I can find to this previous thread is here:
> >>> http://tolstoy.newcastle.edu.au/R/help/04/07/0126.html. The answer was
> >>> to provide debugging info, not an answer.
> >>>
> >>> So the problem is that I'm trying to use lda on my dataset. You can
> >>> download my data here:
> >>> http://northstar-www.dartmouth.edu/~jgilbert/nolda, I used R's save
> >>> function to save objects data and classes (yes, I realize that I name
> >>> stomped the data function in package utils). To replicate my results,
> >>> simply enter the following:
>  library(MASS)
>  load('nolda')
>  lda(data,classes)
> >>> Error in lda.default(x, grouping, ...) : length of 'dimnames' [2] not
> >>> equal to array extent
> >>>
> >>> Now, I don't know what that means.
>  dimnames(data)
> >>> NULL
>  dimnames(classes)
> >>> NULL
> >>>
> >>> As for debugging, I don't know how. I cannot debug lda.default as I
> >>> get the following:
>  debug(lda.default)
> >>> Error: Object "lda.default" not found
> >>>
> >>> I think that that's pretty much it. Can anyone help me?
> >>>
> >>> __
> >>> R-help@stat.math.ethz.ch mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide! 
> >>> http://www.R-project.org/posting-guide.html
> >>>
> >>
> >> --
> >> Brian D. Ripley,  [EMAIL PROTECTED]
> >> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> >> University of Oxford, Tel:  +44 1865 272861 (self)
> >> 1 South Parks Road, +44 1865 272866 (PA)
> >> Oxford OX1 3TG, UKFax:  +44 1865 272595
> >>
> >
> >
> 
> --
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Scott Gilpin
On 6/11/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> For the record, I timed 1000x1000 not 3000x3000 (and said so).  I was not
> proposing to spend several hours running timings at ca 2200s each, not
> least as I used a public machine with a ban on running long jobs (we have
> other much faster machines for that purpose).

My mistake. 

For the 1000x1000 problem on my system, I got the following:

32-bit
[1] 23.37  0.03 23.41  0.00  0.00
64-bit
[1] 80.99  0.02 81.03  0.00  0.00

So it looks like I've still got some work to do.  Thanks for the suggestions.
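
(The benchmark code itself is not shown in this excerpt; a rough sketch of
a 1000x1000 linear-algebra timing of the kind being compared here --
hypothetical, since the thread does not give the actual commands -- is:

  set.seed(1)
  a <- matrix(rnorm(1000 * 1000), 1000, 1000)
  system.time(solve(a))   # most of the time is spent in BLAS/LAPACK code

The five numbers quoted above match the format system.time() returned at
the time: user, system and elapsed times, plus child-process times.)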

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Prof Brian Ripley
On Sat, 11 Jun 2005, Peter Dalgaard wrote:

> Scott Gilpin <[EMAIL PROTECTED]> writes:
>
>> Andy, Prof. Ripley - thanks for your replies.
>>
>>> CFLAGS was not specified, so should default to "-g -O2".  It is definitely
>>> worth checking, although almost all the time in these tests will be spent
>>> in Fortran code.
>> Yes - I verified that's the default.
>>
> neither build uses a BLAS.
>>>
>>> Well, they do, the one supplied with R (which is in Fortran).
>>
>> I guess I should have said that neither build is using an optimized
>> BLAS.  (which I am planning to install - I just haven't had the chance
>> yet.)   I also get a message in config.log "ld: fatal: library -lblas:
>> not found".  I need to investigate this more.
>
> Actually, you did say so...
>
> Don't worry about error messages like that in config.log; they only
> mean that there is no system BLAS and the only way to find out is by
> trial and failure. The configure script is full of that sort of code.
>
>>
>>> and for gcc 3.4.3 and the internal BLAS (which I had to build for these
>>> tests) I get
>>>
>>> 32-bit
>>> [1]  9.96  0.09 10.12  0.00  0.00
>>> 64-bit
>>> [1]  9.93  0.04 10.04  0.00  0.00
>>>
>>> so I am not seeing anything like the same performance difference between
>>> 32- and 64-bit builds (but it could well depend on the particular Sparc
>>> chip).
>>
>> These timings are much, much less than what I reported (~700s and
>> 2200s for 32 bit and 64 bit).  I read the admin manual and didn't see
>> anything specifically that needs to be set to use the internal BLAS.
>> I guess I'll go back and do some more investigation.

For the record, I timed 1000x1000 not 3000x3000 (and said so).  I was not 
proposing to spend several hours running timings at ca 2200s each, not 
least as I used a public machine with a ban on running long jobs (we have 
other much faster machines for that purpose).

> Yes. If you have hardcore linear algebra needs, those fast-BLAS
> speedups can be impressive (Brian might also have a faster machine
> than you, mind you). For code that is mainly interpreter-bound, it is
> much less so.
>
> While your setup is in place, you might want to play around with the
> higher optimization levels. GCC on AMD64 sees a quite substantial
> speedup from -O2 to -O3.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] predict function for GLMM

2005-06-11 Thread weihong
I use "predict" for predictions from glm. I am wondering if there is a 
"predict" function for predictions from the results of GLMM model?

Thanks ahead!

Weihong Li
Undergraduate Student in Statistics
University of Alberta

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error with function lda in package MASS (dimnames not equal?)

2005-06-11 Thread Prof Brian Ripley
The code will be changed to give a more informative error message in a 
future release.  However, do remember that this is volunteer code, and it 
is not reasonable to expect volunteers to anticipate that a user will 
apply it in extremely unlikely circumstances.

If you have a suggestion about code in a contributed package, please send 
it to the package maintainer (as the posting guide asks).

On Sat, 11 Jun 2005, Joshua Gilbert wrote:

> This is true, they are equal. I hadn't noticed that. Thank you.
>
> Now, if lda fails on this given input (equal means), shouldn't we
> catch it and give a slightly better error message? I've spent a good
> while going through the debugging process with lda.default. From that
> perspective it appears that there is a simple change to remove the
> problem. I am not saying that it is correct in any shape or form, but
> there is a point where a single transpose would silence the error.
>
> So, from a usability standpoint, could we add a check for equal means
> between classes and throw an error for that specific condition? Yes,
> the user should not do that. But users may become more interested in
> making the code run than checking on whether it's doing anything sane.
>
> If this isn't the place to do so, tell me. But, I'd like to petition
> to alter the code of lda.default.
>
> On 6/10/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>> lda.default <- MASS:::lda.default and proceed.
>>
>> Look at the group means of your data: they are identical to machine
>> accuracy.
>>
>> The question has to be `why are you trying to use lda to separate
>> two groups with identical means'?  Lda is not protected against that
>> and it is rather unlikely unless you failed to inspect your data in any
>> way.
>>
>> On Fri, 10 Jun 2005, Joshua Gilbert wrote:
>>
>>> This question appears to have been asked previously, but not answered.
> >>> The last response I can find to this previous thread is here:
> >>> http://tolstoy.newcastle.edu.au/R/help/04/07/0126.html. The answer was
>>> to provide debugging info, not an answer.
>>>
>>> So the problem is that I'm trying to use lda on my dataset. You can
>>> download my data here:
>>> http://northstar-www.dartmouth.edu/~jgilbert/nolda, I used R's save
>>> function to save objects data and classes (yes, I realize that I name
>>> stomped the data function in package utils). To replicate my results,
>>> simply enter the following:
 library(MASS)
 load('nolda')
 lda(data,classes)
>>> Error in lda.default(x, grouping, ...) : length of 'dimnames' [2] not
>>> equal to array extent
>>>
>>> Now, I don't know what that means.
 dimnames(data)
>>> NULL
 dimnames(classes)
>>> NULL
>>>
>>> As for debugging, I don't know how. I cannot debug lda.default as I
>>> get the following:
 debug(lda.default)
>>> Error: Object "lda.default" not found
>>>
>>> I think that that's pretty much it. Can anyone help me?
>>>
>>> __
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide! 
>>> http://www.R-project.org/posting-guide.html
>>>
>>
>> --
>> Brian D. Ripley,  [EMAIL PROTECTED]
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread John Fox
Dear Marc,


> -Original Message-
> From: Marc Girondot [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, June 11, 2005 2:16 PM
> To: John Fox
> Cc: [EMAIL PROTECTED]
> Subject: RE: [R] Problem with multinom ?
> 
> >Dear Marc,
> >
> >I get the same results -- same coefficients, standard errors, and 
> >fitted probabilities -- from multinom() and glm(). It's true 
> that the 
> >deviances differ, but they, I believe, are defined only up 
> to an additive constant:
> >
> >>  predict(dt.b, type="response")
> > 1 2 3
> >0.4524946 0.5032227 0.5538845
> >
> >>  predict(dt.m, type="probs")
> > 1 2 3 4 5 6
> >0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846
> >
> >These fitted probabilities *are* correct: Since the members of each 
> >pair (1,2), (3,4), and (5,6) have identical values of factor 
> they are 
> >identical fitted probabilities.
> 
> I expected rather to obtain (note 1- before some terms):
> > 1 2 3 4 5 6
> 0.4524948 1-0.4524948 0.5032229 1-0.5032229 0.5538846 1-0.5538846
> 
> ...

But the fitted probabilities are for each observation in the data set; in
your data, these have identical values of factor for each pair and hence
identical fitted probabilities. When the response is dichotomous, you get
only one of the two fitted probabilities for each observation; for a
polytomous response, you get a matrix of fitted probabilities which sum to 1
row-wise.
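
A tiny illustration of the polytomous case (hypothetical data, not from
your example):

  library(nnet)
  set.seed(1)
  d <- data.frame(y = factor(sample(c("a", "b", "c"), 60, replace = TRUE)),
                  x = rnorm(60))
  fit <- multinom(y ~ x, data = d, trace = FALSE)
  p <- predict(fit, type = "probs")   # one row per observation, one column per level
  rowSums(head(p))                    # each row sums to 1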

Regards,
 John

> 
> Marc
> -- 
> 
> __
> Marc Girondot, Pr
> Laboratoire Ecologie, Systématique et Evolution Equipe de 
> Conservation des Populations et des Communautés CNRS, ENGREF 
> et Université Paris-Sud 11 , UMR 8079 Bâtiment 362
> 91405 Orsay Cedex, France
> 
> Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
> 15 56 96   e-mail: [EMAIL PROTECTED]
> Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
> Skype: girondot
> Fax in US: 1-425-732-6934

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Peter Dalgaard
Scott Gilpin <[EMAIL PROTECTED]> writes:

> Andy, Prof. Ripley - thanks for your replies.
> 
> > CFLAGS was not specified, so should default to "-g -O2".  It is definitely
> > worth checking, although almost all the time in these tests will be spent
> > in Fortran code.
> Yes - I verified that's the default.
> 
> > >> neither build uses a BLAS.
> > 
> > Well, they do, the one supplied with R (which is in Fortran).
> 
> I guess I should have said that neither build is using an optimized
> BLAS.  (which I am planning to install - I just haven't had the chance
> yet.)   I also get a message in config.log "ld: fatal: library -lblas:
> not found".  I need to investigate this more.

Actually, you did say so... 

Don't worry about error messages like that in config.log; they only
mean that there is no system BLAS and the only way to find out is by
trial and failure. The configure script is full of that sort of code.
 
> 
> > and for gcc 3.4.3 and the internal BLAS (which I had to build for these
> > tests) I get
> > 
> > 32-bit
> > [1]  9.96  0.09 10.12  0.00  0.00
> > 64-bit
> > [1]  9.93  0.04 10.04  0.00  0.00
> > 
> > so I am not seeing anything like the same performance difference between
> > 32- and 64-bit builds (but it could well depend on the particular Sparc
> > chip).
> 
> These timings are much, much less than what I reported (~700s and
> 2200s for 32 bit and 64 bit).  I read the admin manual and didn't see
> anything specifically that needs to be set to use the internal BLAS.  
> I guess I'll go back and do some more investigation.

Yes. If you have hardcore linear algebra needs, those fast-BLAS
speedups can be impressive (Brian might also have a faster machine
than you, mind you). For code that is mainly interpreter-bound, it is
much less so.

While your setup is in place, you might want to play around with the
higher optimization levels. GCC on AMD64 sees a quite substantial
speedup from -O2 to -O3.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] combination which limited

2005-06-11 Thread Gabor Grothendieck
On 6/11/05, Marc Schwartz <[EMAIL PROTECTED]> wrote:
> On Sat, 2005-06-11 at 20:44 +0200, Muhammad Subianto wrote:
> > Dear R-helpers,
> > I am learning about combination in R.
> > I want to combination all of
> > possible variable but it limited.
> > I am sorry I could not explain exactly.
> > For usefull I give an example
> >   interface <- c("usb","fireware","infra","bluetooth")
> >   screen<- c("lcd","cube")
> >   computer  <- c("pc","server","laptop")
> >   available <- c("yes","no")
> >
> > What the result I need, something like this below,
> >   usb  lcd pc  yes
> >   fireware lcd pc  yes
> >   infralcd pc  yes
> >   bluetoothlcd pc  yes
> >   usb  cubepc  yes
> >   usb  lcd server  yes
> >   usb  lcd laptop  yes
> >   usb  lcd pc  no
> >
> > How can I do that?
> > I was wondering if someone can help me.
> > Thanks you for your time and best regards,
> > Muhammad Subianto
> 
> Use:
> 
> > expand.grid(interface, screen, computer, available)
>Var1 Var2   Var3 Var4
> 1usb  lcd pc  yes
> 2   fireware  lcd pc  yes
> 3  infra  lcd pc  yes
> 4  bluetooth  lcd pc  yes
> 5usb cube pc  yes
> 6   fireware cube pc  yes
> 7  infra cube pc  yes
> 8  bluetooth cube pc  yes
> 9usb  lcd server  yes
> 10  fireware  lcd server  yes
> 11 infra  lcd server  yes
> 12 bluetooth  lcd server  yes
> 13   usb cube server  yes
> 14  fireware cube server  yes
> 15 infra cube server  yes
> 16 bluetooth cube server  yes
> 17   usb  lcd laptop  yes
> 18  fireware  lcd laptop  yes
> 19 infra  lcd laptop  yes
> 20 bluetooth  lcd laptop  yes
> 21   usb cube laptop  yes
> 22  fireware cube laptop  yes
> 23 infra cube laptop  yes
> 24 bluetooth cube laptop  yes
> 25   usb  lcd pc   no
> 26  fireware  lcd pc   no
> 27 infra  lcd pc   no
> 28 bluetooth  lcd pc   no
> 29   usb cube pc   no
> 30  fireware cube pc   no
> 31 infra cube pc   no
> 32 bluetooth cube pc   no
> 33   usb  lcd server   no
> 34  fireware  lcd server   no
> 35 infra  lcd server   no
> 36 bluetooth  lcd server   no
> 37   usb cube server   no
> 38  fireware cube server   no
> 39 infra cube server   no
> 40 bluetooth cube server   no
> 41   usb  lcd laptop   no
> 42  fireware  lcd laptop   no
> 43 infra  lcd laptop   no
> 44 bluetooth  lcd laptop   no
> 45   usb cube laptop   no
> 46  fireware cube laptop   no
> 47 infra cube laptop   no
> 48 bluetooth cube laptop   no
> 
> 
> See ?expand.grid for more information.
> 


After you do the above you will still want to cut it down to just
the rows you need.

As explained, use expand.grid.  Let's assume you used this statement:

dd <- expand.grid(interface = interface, screen = screen, 
   computer = computer, available = available)

There are several possibilities now:

1. you could list out dd on the console and note the number of the
rows you want to keep:

idx <- c(1,5,7)
dd2 <- dd[idx, ]

or if you want most of them it may be easier to record which ones
you do not want:

ndix <- c(2,4,7)
dd2 <- dd[-ndix, ]
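
Or, instead of row numbers, you can keep rows by a logical condition (the
condition here is only an illustration -- substitute whatever "limited"
set you actually need):

dd3 <- subset(dd, available == "yes" & screen == "lcd")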

2. Another possibility is to export it to a spreadsheet and visually 
delete the rows you don't want.  

3. A third possibility is to install JGR (which is a free Java GUI
front end to R).
First download and install JGR from: http://stats.math.uni-augsburg.de/JGR/
In JGR (I am using Windows and it is possible that the instructions vary
slightly on other platforms):

1. Create dd as explained.
2. Bring up the object browser using the menu Tools | Object Browser,
or just ctrl-B.
3. Select dd from the object browser.
4. This will put you into a spreadsheet in which you can select the
rows you want to delete (hold down ctrl for the 2nd and subsequent
selections to have a non-contiguous multi-row selection).
5. Select Tools | Remove Rows.
6. Click on Apply in the lower right of the spreadsheet.
7. Click on X on the upper right of the spreadsheet.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread Marc Girondot
>Dear Marc,
>
>I get the same results -- same coefficients, standard errors, and fitted
>probabilities -- from multinom() and glm(). It's true that the deviances
>differ, but they, I believe, are defined only up to an additive constant:
>
>>  predict(dt.b, type="response")
> 1 2 3
>0.4524946 0.5032227 0.5538845
>
>>  predict(dt.m, type="probs")
> 1 2 3 4 5 6
>0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846
>
>These fitted probabilities *are* correct: Since the members of each pair
>(1,2), (3,4), and (5,6) have identical values of factor they are identical
>fitted probabilities.

I expected rather to obtain (note 1- before some terms):
> 1 2 3 4 5 6
0.4524948 1-0.4524948 0.5032229 1-0.5032229 0.5538846 1-0.5538846

...

Marc
-- 

__
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
15 56 96   e-mail: [EMAIL PROTECTED]
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread Marc Girondot
>On Sat, 11 Jun 2005, John Fox wrote:
>
>>Dear Marc,
>>
>>I get the same results -- same coefficients, standard errors, and fitted
>>probabilities -- from multinom() and glm(). It's true that the deviances
>>differ, but they, I believe, are defined only up to an additive constant:
>
>Yes. There are many variations on the definition 
>of (residual) deviance, but it compares -2 log 
>likelihood with a `saturated' model.  For 
>grouped data you have a choice: a separate term 
>for each group or for each observation.  A 
>binomial GLM uses the first but the second is 
>more
>normal in logistic regression (since it has a 
>direct interpretation via log-probability 
>scoring).
>
>multinom() is support software for a book (which 
>the R posting guide does ask you to consult): 
>this is discussed with a worked example on pp 
>203-4.

Dear Prof. Ripley,

I have your book... but I don't find the answer to my questions...

You propose that the difference in residual deviance between the two
versions of the same model (0.001841823 for glm and 106.2304 for
multinom()) is due to a difference in the specification of the saturated
model. However, since RD = -2 Ln L(model) + 2 Ln L(saturated), and
-2 Ln L(model) = 11.1146..., it seems impossible to me that
RD > -2 Ln L(model) ...

Marc Girondot

Sorry to be so close-minded !
-- 

__
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
15 56 96   e-mail: [EMAIL PROTECTED]
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] combination which limited

2005-06-11 Thread Marc Schwartz
On Sat, 2005-06-11 at 20:44 +0200, Muhammad Subianto wrote:
> Dear R-helpers,
> I am learning about combination in R.
> I want to combination all of
> possible variable but it limited. 
> I am sorry I could not explain exactly.
> For usefull I give an example
>   interface <- c("usb","fireware","infra","bluetooth")
>   screen<- c("lcd","cube")
>   computer  <- c("pc","server","laptop")
>   available <- c("yes","no")
>   
> What the result I need, something like this below,
>   usb  lcd pc  yes
>   fireware lcd pc  yes
>   infralcd pc  yes
>   bluetoothlcd pc  yes
>   usb  cubepc  yes 
>   usb  lcd server  yes
>   usb  lcd laptop  yes
>   usb  lcd pc  no
>   
> How can I do that?
> I was wondering if someone can help me.
> Thanks you for your time and best regards,
> Muhammad Subianto

Use:

> expand.grid(interface, screen, computer, available)
Var1 Var2   Var3 Var4
1usb  lcd pc  yes
2   fireware  lcd pc  yes
3  infra  lcd pc  yes
4  bluetooth  lcd pc  yes
5usb cube pc  yes
6   fireware cube pc  yes
7  infra cube pc  yes
8  bluetooth cube pc  yes
9usb  lcd server  yes
10  fireware  lcd server  yes
11 infra  lcd server  yes
12 bluetooth  lcd server  yes
13   usb cube server  yes
14  fireware cube server  yes
15 infra cube server  yes
16 bluetooth cube server  yes
17   usb  lcd laptop  yes
18  fireware  lcd laptop  yes
19 infra  lcd laptop  yes
20 bluetooth  lcd laptop  yes
21   usb cube laptop  yes
22  fireware cube laptop  yes
23 infra cube laptop  yes
24 bluetooth cube laptop  yes
25   usb  lcd pc   no
26  fireware  lcd pc   no
27 infra  lcd pc   no
28 bluetooth  lcd pc   no
29   usb cube pc   no
30  fireware cube pc   no
31 infra cube pc   no
32 bluetooth cube pc   no
33   usb  lcd server   no
34  fireware  lcd server   no
35 infra  lcd server   no
36 bluetooth  lcd server   no
37   usb cube server   no
38  fireware cube server   no
39 infra cube server   no
40 bluetooth cube server   no
41   usb  lcd laptop   no
42  fireware  lcd laptop   no
43 infra  lcd laptop   no
44 bluetooth  lcd laptop   no
45   usb cube laptop   no
46  fireware cube laptop   no
47 infra cube laptop   no
48 bluetooth cube laptop   no


See ?expand.grid for more information.

Using:

> help.search("combinations")

would guide you to that function based upon a keyword search.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error with function lda in package MASS (dimnames not equal?)

2005-06-11 Thread Marc Feldesman


--- Joshua Gilbert <[EMAIL PROTECTED]> wrote:

> If this isn't the place to do so, tell me. But, I'd like to petition
> to alter the code of lda.default.

It seems to me that if you want to alter the code of lda.default, you
have everything you need to do so.  The code is there, it is GPL'd and
you can change it to your heart's content.  But if you are suggesting
that the lda function be changed for everyone, I'd be personally
opposed to doing so.  It behaves the way it should behave and if you're
looking for a way to save yourself from the way it works, then make the
change on a local copy of lda and you'll have solved the problem for
you.  For me, I vote for the status quo.
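
A minimal sketch of working on such a local copy (it follows Prof Ripley's
earlier MASS:::lda.default hint and does not touch MASS itself):

  library(MASS)
  my.lda.default <- MASS:::lda.default   # private copy of the unexported workhorse
  # edit my.lda.default with fix() or in a source file, and/or step
  # through it with debug(my.lda.default), then call it directly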


Dr. Marc R Feldesman
Professor & Chair Emeritus
Department of Anthropology
Portland State University
Portland, OR 97207

Please respond to all emails at:  [EMAIL PROTECTED]

"I've proven who am so many times, the magnetic strip is wearing thin"

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] combination which limited

2005-06-11 Thread Muhammad Subianto
Dear R-helpers,
I am learning about combination in R.
I want to combination all of
possible variable but it limited. 
I am sorry I could not explain exactly.
For usefull I give an example
  interface <- c("usb","fireware","infra","bluetooth")
  screen<- c("lcd","cube")
  computer  <- c("pc","server","laptop")
  available <- c("yes","no")
  
What the result I need, something like this below,
  usb  lcd pc  yes
  fireware lcd pc  yes
  infralcd pc  yes
  bluetoothlcd pc  yes
  usb  cubepc  yes 
  usb  lcd server  yes
  usb  lcd laptop  yes
  usb  lcd pc  no
  
How can I do that?
I was wondering if someone can help me.
Thanks you for your time and best regards,
Muhammad Subianto

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Error with function lda in package MASS (dimnames not equal?)

2005-06-11 Thread Joshua Gilbert
This is true, they are equal. I hadn't noticed that. Thank you.

Now, if lda fails on this given input (equal means), shouldn't we
catch it and give a slightly better error message? I've spent a good
while going through the debugging process with lda.default. From that
perspective it appears that there is a simple change to remove the
problem. I am not saying that it is correct in any shape or form, but
there is a point where a single transpose would silence the error.

So, from a usability standpoint, could we add a check for equal means
between classes and throw an error for that specific condition? Yes,
the user should not do that. But users may become more interested in
making the code run than checking on whether it's doing anything sane.

If this isn't the place to do so, tell me. But, I'd like to petition
to alter the code of lda.default.

On 6/10/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> lda.default <- MASS:::lda.default and proceed.
> 
> Look at the group means of your data: they are identical to machine
> accuracy.
> 
> The question has to be `why are you trying to use lda to separate
> two groups with identical means'?  Lda is not protected against that
> and it is rather unlikely unless you failed to inspect your data in any
> way.
> 
> On Fri, 10 Jun 2005, Joshua Gilbert wrote:
> 
> > This question appears to have been asked previously, but not answered.
> > The last response I can find to this previous thread is here:
> > http://tolstoy.newcastle.edu.au/R/help/04/07/0126.html. The answer was
> > to provide debugging info, not an answer.
> >
> > So the problem is that I'm trying to use lda on my dataset. You can
> > download my data here:
> > http://northstar-www.dartmouth.edu/~jgilbert/nolda, I used R's save
> > function to save objects data and classes (yes, I realize that I name
> > stomped the data function in package utils). To replicate my results,
> > simply enter the following:
> >> library(MASS)
> >> load('nolda')
> >> lda(data,classes)
> > Error in lda.default(x, grouping, ...) : length of 'dimnames' [2] not
> > equal to array extent
> >
> > Now, I don't know what that means.
> >> dimnames(data)
> > NULL
> >> dimnames(classes)
> > NULL
> >
> > As for debugging, I don't know how. I cannot debug lda.default as I
> > get the following:
> >> debug(lda.default)
> > Error: Object "lda.default" not found
> >
> > I think that that's pretty much it. Can anyone help me?
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> 
> --
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Scott Gilpin
Andy, Prof. Ripley - thanks for your replies.

> CFLAGS was not specified, so should default to "-g -O2".  It is definitely
> worth checking, although almost all the time in these tests will be spent
> in Fortran code.
Yes - I verified that's the default.

> >> neither build uses a BLAS.
> 
> Well, they do, the one supplied with R (which is in Fortran).

I guess I should have said that neither build is using an optimized
BLAS.  (which I am planning to install - I just haven't had the chance
yet.)   I also get a message in config.log "ld: fatal: library -lblas:
not found".  I need to investigate this more.


> and for gcc 3.4.3 and the internal BLAS (which I had to build for these
> tests) I get
> 
> 32-bit
> [1]  9.96  0.09 10.12  0.00  0.00
> 64-bit
> [1]  9.93  0.04 10.04  0.00  0.00
> 
> so I am not seeing anything like the same performance difference between
> 32- and 64-bit builds (but it could well depend on the particular Sparc
> chip).

These timings are much, much less than what I reported (~700s and
2200s for 32 bit and 64 bit).  I read the admin manual and didn't see
anything specifically that needs to be set to use the internal BLAS.  
I guess I'll go back and do some more investigation.

Thanks for your help.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Calculating moments of a distribution

2005-06-11 Thread Uwe Ligges
Mohammad Ehsanul Karim wrote:

> Dear All,
> 
> Is there any existing package or direct function in R
> / S-plus that calculates moments (raw or central); the
> usual measures of skewness and kurtosis, and/or
> pearsonian k criterion?


See package e1071.
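
For example (a small sketch; skewness(), kurtosis() and moment() are the
relevant e1071 functions):

  library(e1071)
  x <- rnorm(100)                       # example data
  skewness(x)
  kurtosis(x)
  moment(x, order = 3, center = TRUE)   # third central moment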

Uwe Ligges


> Thank you for your time. 
> 
> 
> --
> 
> Mohammad Ehsanul Karim 
> 
> Web: http://snipurl.com/ehsan 
> ISRT, University of Dhaka, BD
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Calculating moments of a distribution

2005-06-11 Thread Spencer Graves
  Have you tried RSiteSearch("skewness") and RSiteSearch("kurtosis")? 
The "fBasics" package has functions for that, as do other packages.

  spencer graves

Mohammad Ehsanul Karim wrote:

> Dear All,
> 
> Is there any existing package or direct function in R
> / S-plus that calculates moments (raw or central); the
> usual measures of skewness and kurtosis, and/or
> pearsonian k criterion?
> 
> Thank you for your time. 
> 
> 
> --
> 
> Mohammad Ehsanul Karim 
> 
> Web: http://snipurl.com/ehsan 
> ISRT, University of Dhaka, BD
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R-help Digest, Vol 28, Issue 11

2005-06-11 Thread Uwe Ligges
dwfu wrote:
> Dear all,
> I'm new using R and in (geo)statistics. I have a problem with solving
> my homework questions. We are working with variograms and trying to
> write down basic  equations for different models (spherical,
> exponential, Gaussian). I tried to use the 'gstat' and 'geoR' packages
> to solve the questions but as I said before I'm new in R and always
> encountered with some syntax errors (I can send some specific examples
> later). If one of you used this packages and could help me,  I will be
> very glad.

Please read the posting guide which tells you:
  - "Use an informative subject line."
  - "Basic statistics and classroom homework: R-help is not intended for 
these."
  - Provide small reproducible examples.

Uwe Ligges

> Best Wishes,
> Emre Duran
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Calculating moments of a distribution

2005-06-11 Thread Mohammad Ehsanul Karim
Dear All,

Is there any existing package or direct function in R
/ S-plus that calculates moments (raw or central); the
usual measures of skewness and kurtosis, and/or
pearsonian k criterion?

Thank you for your time. 


--

Mohammad Ehsanul Karim 

Web: http://snipurl.com/ehsan 
ISRT, University of Dhaka, BD

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R-help Digest, Vol 28, Issue 11

2005-06-11 Thread Ruben Roa
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of dwfu
> Sent: 11 June 2005 13:59
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] R-help Digest, Vol 28, Issue 11
> 
> 
> Dear all,
> I'm new using R and in (geo)statistics. I have a problem with solving
> my homework questions. We are working with variograms and trying to
> write down basic  equations for different models (spherical,
> exponential, Gaussian). I tried to use the 'gstat' and 'geoR' packages
> to solve the questions but as I said before I'm new in R and always
> encountered with some syntax errors (I can send some specific examples
> later). If one of you used this packages and could help me,  I will be
> very glad.
> Best Wishes,
> Emre Duran

For geoR, go here
http://www.est.ufpr.br/geoR/ 
and study the illustrative session.
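
Along the lines of that illustrative session, a rough sketch (the data set
and starting values are only for illustration):

  library(geoR)
  data(s100)                       # example data shipped with geoR
  v <- variog(s100, max.dist = 1)  # empirical variogram
  plot(v)
  fit.sph <- variofit(v, ini.cov.pars = c(1, 0.3), cov.model = "spherical")
  lines(fit.sph)                   # "exponential" and "gaussian" work the same way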
Ruben

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] R-help Digest, Vol 28, Issue 11

2005-06-11 Thread dwfu
Dear all,
I'm new using R and in (geo)statistics. I have a problem with solving
my homework questions. We are working with variograms and trying to
write down basic  equations for different models (spherical,
exponential, Gaussian). I tried to use the 'gstat' and 'geoR' packages
to solve the questions but as I said before I'm new in R and always
encountered with some syntax errors (I can send some specific examples
later). If one of you used this packages and could help me,  I will be
very glad.
Best Wishes,
Emre Duran

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Calling R from C/C

2005-06-11 Thread Prof Brian Ripley
This is discussed in detail in the R-admin manual for R 2.1.0 (the current 
version).  There is no point in repeating the whole discussion (several 
pages) here: that is the definitive account and you do need the whole 
picture.

On Sat, 11 Jun 2005, SUMIT MALHOTRA wrote:

> hi ppl,
> this somethin very urgent. plz anybody tell me for my problem:-
>
> how to make a module/lib that will allow to call easily R code/functions
> from C++ (C++ Builder 6). Is it possible without using any intermediate
> things.
>
> plz help
>
> srry no time for RTFM. Any idea or hint is appreciated.
>
> n I can help u a lot in projects sort of things.
>
> plz do reply.
>
>
> byee

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread Prof Brian Ripley

On Sat, 11 Jun 2005, John Fox wrote:


Dear Marc,

I get the same results -- same coefficients, standard errors, and fitted
probabilities -- from multinom() and glm(). It's true that the deviances
differ, but they, I believe, are defined only up to an additive constant:


Yes. There are many variations on the definition of (residual) deviance, 
but it compares -2 log likelihood with a `saturated' model.  For grouped 
data you have a choice: a separate term for each group or for each 
observation.  A binomial GLM uses the first but the second is more
normal in logistic regression (since it has a direct interpretation via 
log-probability scoring).
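
(To see the size of that constant for the data in the quoted session below:
the grouped binomial density carries the binomial coefficients, so the two
-2*log-likelihoods differ by 2*sum(log(choose(n_i, m_i))). A quick sketch
with the group totals m+f = 22, 28, 27 and m = 10, 14, 15:

  2 * sum(lchoose(c(22, 28, 27), c(10, 14, 15)))   # roughly 95.12

and 11.11 + 95.12 gives the 106.23 reported by multinom().)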


multinom() is support software for a book (which the R posting guide does 
ask you to consult): this is discussed with a worked example on pp 203-4.



dt

 output factor  n
1  m1.2 10
2  f1.2 12
3  m1.3 14
4  f1.3 14
5  m1.4 15
6  f1.4 12


dt.m <- multinom(output ~ factor, data=dt, weights=n)

# weights:  3 (2 variable)
initial  value 53.372333
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208
converged


dt2

  m  f factor
1 10 121.2
2 14 141.3
3 15 121.4


dt.b <- glm(cbind(m,f) ~ factor, data=dt2, family=binomial)



summary(dt.m)

Call:
multinom(formula = output ~ factor, data = dt, weights = n)

Coefficients:
  Values Std. Err.
(Intercept) -2.632443  3.771265
factor   2.034873  2.881479

Residual Deviance: 106.2304
AIC: 110.2304

Correlation of Coefficients:
[1] -0.9981598


summary(dt.b)


Call:
glm(formula = cbind(m, f) ~ factor, family = binomial, data = dt2)

Deviance Residuals:
  1 2 3
0.01932  -0.03411   0.01747

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.632      3.771  -0.698    0.485
factor         2.035      2.881   0.706    0.480

(Dispersion parameter for binomial family taken to be 1)

   Null deviance: 0.5031047  on 2  degrees of freedom
Residual deviance: 0.0018418  on 1  degrees of freedom
AIC: 15.115

Number of Fisher Scoring iterations: 2


predict(dt.b, type="response")

   1 2 3
0.4524946 0.5032227 0.5538845


predict(dt.m, type="probs")

   1 2 3 4 5 6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

These fitted probabilities *are* correct: Since the members of each pair
(1,2), (3,4), and (5,6) have identical values of factor they are identical
fitted probabilities.

(Note, by the way, the large negative correlation between the coefficients,
produced by the configuration of factor values.)

I hope this helps,
John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
Sent: Saturday, June 11, 2005 3:06 AM
To: John Fox
Cc: r-help@stat.math.ethz.ch
Subject: [R] Problem with multinom ?

Thanks for your response.
OK, multinom() is a more logical in this context.

But similar problem occurs:

Let these data to be analyzed using classical glm with binomial error:

m   f   factor  m theo       f theo       -Ln L model  -Ln L full   intercept    f
10  12  1.2     0.452494473  0.547505527  1.778835688  1.778648963  2.632455675  -2.034882223
14  14  1.3     0.503222759  0.496777241  1.901401922  1.900820284
15  12  1.4     0.553884782  0.446115218  1.877062369  1.876909821

                             Sum -Ln L    5.557299979  5.556379068  Residual deviance
                             Deviance     11.11459996  11.11275814  0.001841823

If I try to use multinom() function to analyze these data,
the fitted parameters are correct but the residual deviance not.


 dt<-read.table('/try.txt', header=T)
 dt

   output factor  n
1  m1.2 10
2  f1.2 12
3  m1.3 14
4  f1.3 14
5  m1.4 15
6  f1.4 12

 dt.plr <- multinom(output ~ factor, data=dt, weights=n, maxit=1000)

# weights:  3 (2 variable)
initial  value 53.372333
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208
converged

 dt.plr

Call:
multinom(formula = output ~ factor, data = dt, weights = n,
maxit = 1000)

Coefficients:
(Intercept)  factor
   -2.6324432.034873

Residual Deviance: 106.2304
AIC: 110.2304


 dt.pr1<-predict(dt.plr, , type="probs")
 dt.pr1

 1 2 3 4 5 6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

The probabilities for 2, 4 and 6 are not correct, and this explains
the incorrect residual deviance obtained in R.
Probably the problem I have is due to an incorrect data
format... could someone help me...
Thanks

(I know there is a simple way to analyze binomial data, but
ultimately I want to use multinom() for 

[R] Calling R from C/C

2005-06-11 Thread SUMIT MALHOTRA
hi ppl,
this somethin very urgent. plz anybody tell me for my problem:-
 
 
 
how to make a module/lib that will allow to call easily R code/functions 
from C++ (C++ Builder 6). Is it possible without using any intermediate 
things.
 
plz help
 
srry no time for RTFM. Any idea or hint is appreciated.
 
n I can help u a lot in projects sort of things.
 
plz do reply.
 
 
byee


-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] TaskView for ecological and environmental data

2005-06-11 Thread Gavin Simpson
Dear list,

Following off-list discussion with Graham Smith (who initiated the 
initial request for a task view on this topic) and Philippe Grosjean I 
have produced a Task View for ecological and environmental data 
analysis. Graham and Philippe commented on a earlier draft and it is now 
in a shape suitable for public scrutiny.

I have placed a mock-up of the html version of a task view page at the 
URI below and would welcome your comments and suggestions before I 
submit it to Achim Zeileis for inclusion in the ctv package.

http://ecrc3.geog.ucl.ac.uk/download/taskview/

I will now be out of the office for a week and am not sure what email 
availability I will have so it may take a little while for me to reply 
to and act on any suggestions you might make.

I would like to thank Graham and Philippe for encouraging me to maintain 
this view and for their helpful suggestions.

All the best,

Gavin

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Problem with multinom ?

2005-06-11 Thread John Fox
Dear Marc,

I get the same results -- same coefficients, standard errors, and fitted
probabilities -- from multinom() and glm(). It's true that the deviances
differ, but they, I believe, are defined only up to an additive constant:

> dt
  output factor  n
1  m1.2 10
2  f1.2 12
3  m1.3 14
4  f1.3 14
5  m1.4 15
6  f1.4 12

> dt.m <- multinom(output ~ factor, data=dt, weights=n)
# weights:  3 (2 variable)
initial  value 53.372333 
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208 
converged

> dt2
   m  f factor
1 10 121.2
2 14 141.3
3 15 121.4

> dt.b <- glm(cbind(m,f) ~ factor, data=dt2, family=binomial)

> summary(dt.m)
Call:
multinom(formula = output ~ factor, data = dt, weights = n)

Coefficients:
   Values Std. Err.
(Intercept) -2.632443  3.771265
factor   2.034873  2.881479

Residual Deviance: 106.2304 
AIC: 110.2304 

Correlation of Coefficients:
[1] -0.9981598

> summary(dt.b)

Call:
glm(formula = cbind(m, f) ~ factor, family = binomial, data = dt2)

Deviance Residuals: 
   1 2 3  
 0.01932  -0.03411   0.01747  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   -2.632      3.771  -0.698    0.485
factor         2.035      2.881   0.706    0.480

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 0.5031047  on 2  degrees of freedom
Residual deviance: 0.0018418  on 1  degrees of freedom
AIC: 15.115

Number of Fisher Scoring iterations: 2

> predict(dt.b, type="response")
1 2 3 
0.4524946 0.5032227 0.5538845 
 
> predict(dt.m, type="probs")
1 2 3 4 5 6 
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846 

These fitted probabilities *are* correct: Since the members of each pair
(1,2), (3,4), and (5,6) have identical values of factor they are identical
fitted probabilities.

(Note, by the way, the large negative correlation between the coefficients,
produced by the configuration of factor values.)

I hope this helps,
 John


John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Marc Girondot
> Sent: Saturday, June 11, 2005 3:06 AM
> To: John Fox
> Cc: r-help@stat.math.ethz.ch
> Subject: [R] Problem with multinom ?
> 
> Thanks for your response.
> OK, multinom() is a more logical in this context.
> 
> But similar problem occurs:
> 
> Let these data to be analyzed using classical glm with binomial error:
> 
> m   f   factor  m theo       f theo       -Ln L model  -Ln L full   intercept    f
> 10  12  1.2     0.452494473  0.547505527  1.778835688  1.778648963  2.632455675  -2.034882223
> 14  14  1.3     0.503222759  0.496777241  1.901401922  1.900820284
> 15  12  1.4     0.553884782  0.446115218  1.877062369  1.876909821
> 
>                              Sum -Ln L    5.557299979  5.556379068  Residual deviance
>                              Deviance     11.11459996  11.11275814  0.001841823
> 
> If I try to use multinom() function to analyze these data, 
> the fitted parameters are correct but the residual deviance not.
> 
> >  dt<-read.table('/try.txt', header=T)
> >  dt
>output factor  n
> 1  m1.2 10
> 2  f1.2 12
> 3  m1.3 14
> 4  f1.3 14
> 5  m1.4 15
> 6  f1.4 12
> >  dt.plr <- multinom(output ~ factor, data=dt, weights=n, maxit=1000)
> # weights:  3 (2 variable)
> initial  value 53.372333
> iter  10 value 53.115208
> iter  10 value 53.115208
> iter  10 value 53.115208
> final  value 53.115208
> converged
> >  dt.plr
> Call:
> multinom(formula = output ~ factor, data = dt, weights = n, 
> maxit = 1000)
> 
> Coefficients:
> (Intercept)  factor
>-2.6324432.034873
> 
> Residual Deviance: 106.2304
> AIC: 110.2304
> 
> >  dt.pr1<-predict(dt.plr, , type="probs")
> >  dt.pr1
>  1 2 3 4 5 6
> 0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846
> 
> The probabilities for 2, 4 and 6 are not correct, and this explains
> the incorrect residual deviance obtained in R.
> Probably the problem I have is due to an incorrect data 
> format... could someone help me... 
> Thanks
> 
> (I know there is a simple way to analyze binomial data, but
> ultimately I want to use multinom() for 5 categories of outputs.)
> 
> 
> Thanks a lot
> 
> Marc
> -- 
> 
> __
> Marc Girondot, Pr
> Laboratoire Ecologie, Systématique et Evolution Equipe de 
> Conservation des Populations et des Communautés CNRS, ENGREF 
> et Université Paris-Sud 11 , UMR 8079 Bâtiment 362
> 91405 Orsay Cedex, France
> 
> Tel:  33 1 (0)1.69.15.72.30   

[R] Customer Service enquiry

2005-06-11 Thread A-4Service
Dear Valued Customer


Thank you for contacting Domainz. Your email enquiry has been received and
will be attended by a member of our Customer Service Team. 

Please note that this is a system-generated email response to confirm the
receipt of your request. If you need to contact us urgently, please do not
hesitate to call us on 0800 3662469, or try the 'Frequently asked questions'
on our website
http://www.domainz.net.nz/info.asp?menu=help&content=faq&page=FAQs

Thank you for choosing Domainz.


Domainz Customer Service Team

Domainz Limited - A Melbourne IT Company
[EMAIL PROTECTED]
p: +64 4 473 4567 f: +64 4 473 4569
Level 4, Greenock House, 102-112 Lambton Quay, Wellington, NZ   
www.domainz.net.nz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Sum up the values of a function

2005-06-11 Thread Petr Pikal
Hi

On 10 Jun 2005 at 13:58, Stefan Wagner wrote:

> I am really sorry that the code appears in a bad way, but it is really
> not my fault (at least I think so). I think my mail program is
> responsible for it, because I sent it in a "normal" way without having
> trouble with my space bar. Here are the missing values (but please
> remember, this is only a short example; normally there are more than
> 400 values):
> 
> x1 <- c(34.67,19.91,47.48,22.48,17.34,24.42)
> m1 <- c(1,0,0,0,0,0)
> n1 <- c(1,1,1,1,1,1)
> dec1 <- c(0,0,0,0,1,0)
> x2 <- c(50.98,30.63,31.37,21.17,21.49,39.41,46.85,38.08)
> m2 <- c(1,0,0,0,1,0,2,1)
> n2 <- c(3,3,3,3,3,3,3,3)
> dec2 <- c(0,0,0,0,0,0,0,0)
> 
> I hope this is all you need,

Well, actually I am not sure why you do all the computations in so 
many separate functions. It hides the logic, and one has to go through 
all of them step by step without much idea of why they are constructed 
this way.

Maybe some other list member can give you more useful advice.

Sorry I could not give you a more positive answer.
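
That said, the summing can be made to adapt automatically to the number 
of observations in each session by vectorising the likelihood. A sketch 
(untested; it assumes the vectors x1, m1, n1, dec1, x2, m2, n2, dec2 from 
your message, and negloglik is just an illustrative name):

negloglik <- function(z) {
  ## success probabilities for the two sessions, vectorised over observations
  p1 <- 1/(1 + exp(z[1]*(z[3] - x1 - z[4]*(m1/n1 - 0.5))))
  p2 <- 1/(1 + exp(z[2]*(z[3] - x2 - z[5]*(m2/n2 - 0.5))))
  ## log-likelihood contributions, summed over however many points there are
  ll1 <- log(p1*dec1 + (1 - p1)*(1 - dec1))
  ll2 <- log(p2*dec2 + (1 - p2)*(1 - dec2))
  -(sum(ll1) + sum(ll2))
}
SP  <- c(0.16, 0.10, 44, 0.80, 46)
out <- nlm(negloglik, p = SP)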

Petr


> 
> Stefan
> 
> Zitat von Petr Pikal <[EMAIL PROTECTED]>:
> 
> > Hi
> >
> > Your example is not reproducible as we do not know x1.
> > Do you by chance seek for something like aggregate? If yes see
> >
> > ?aggregate
> > or
> > ?by
> >
> > BTW, do you have some trouble with your space bar?
> >
> > HTH
> >
> > Petr
> >
> > On 10 Jun 2005 at 13:08, Stefan Wagner wrote:
> >
> >> Dear R-Users,
> >>
> >> I have to do a maximum-likelihood estimation and have now a problem
> >> concerning how to sum up my function values in a smart way. I don't
> >> know how to explain it easily, so I give you the code and show you
> >> where my problem is. I have shortened the code a little bit, so that
> >> you only get the necessary facts:
> >>
> >> ws12 <- function (z, i) (1/(1+exp(z[1]*(z[3]-x1[i]-z[4]*(m1[i]/n1[i]-0.5)))))
> >> ws37 <- function (z, i) (1/(1+exp(z[2]*(z[3]-x2[i]-z[5]*(m2[i]/n2[i]-0.5)))))
> >> wsAttack12 <- function (z,i) (ws12(z,i)*dec1[i]+(1-ws12(z,i))*(1-dec1[i]))
> >> wsAttack37 <- function (z,i) (ws37(z,i)*dec2[i]+(1-ws37(z,i))*(1-dec2[i]))
> >> logwsAttack12 <- function (z,i) (log(wsAttack12(z,i)))
> >> logwsAttack37 <- function (z,i) (log(wsAttack37(z,i)))
> >> ws12sum <- function (z)
> >>   (logwsAttack12(z,i=1)+logwsAttack12(z,i=2)+logwsAttack12(z,i=3)+
> >>logwsAttack12(z,i=4)+logwsAttack12(z,i=5)+logwsAttack12(z,i=6))
> >> ws37sum <- function (z)
> >>   (logwsAttack37(z,i=1)+logwsAttack37(z,i=2)+logwsAttack37(z,i=3)+
> >>logwsAttack37(z,i=4)+logwsAttack37(z,i=5)+logwsAttack37(z,i=6)+
> >>logwsAttack37(z,i=7)+logwsAttack37(z,i=8))
> >> wsLOG <- function (z) (ws12sum(z) + ws37sum(z))
> >> LogSum <- function (z) (-sum(wsLOG(z)))
> >> SP <- c(0.16, 0.10, 44, 0.80, 46)
> >> out <- nlm(LogSum, p=SP)
> >> out
> >>
> >> For explanation: x1[i], x2[i], m1[i], m2[i], n1[i], n2[i] are given
> >> data and z[1:5] are my estimates. My problem is that I have more
> >> than one session with diffent number of datas so that I am
> >> searching for a general way of summing up my logwsAttack12 and
> >> logwsAttack37. The program should recognize how many data are in my
> >> table concerning ws12 and how many concerning ws37 and should, in
> >> dependency of z, sum them up. I hope you understand my problem and
> >> hopefully someone is able to solve it. Many thanks for the moment.
> >>
> >> Best regards,
> >>
> >> Stefan Wagner
> >>
> >> __
> >> R-help@stat.math.ethz.ch mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide!
> >> http://www.R-project.org/posting-guide.html
> >
> > Petr Pikal
> > [EMAIL PROTECTED]
> >
> >
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] CRAN task view for genetics

2005-06-11 Thread Neil Shephard
Hi Gregor,

The compiled list of genetics packages is a great idea; I'll certainly find it 
useful, as it took me quite some time to track them all down through CRAN.

I've a couple of things that could be added...

qvalue - A procedure for false discovery rate control.  General, but developed 
for the area of genetics (see Storey & Tibshirani 2003, PNAS 100:9440-9445)

SimHap - still in development (beta testing), this is a Java GUI front end for 
performing a range of statistical genetics analyses, very useful for occasional 
users who cannot get their heads around CLIs.  Registration with the author is 
currently required.

Regards

Neil
-- 
Neil Shephard
Genetics Statistician, ARC Epidemiology Unit
[EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Replacing for loop with tapply!?

2005-06-11 Thread Petr Pikal
Hi

On 10 Jun 2005 at 20:05, Sander Oom wrote:

> Dear all,
> 
> Dimitris and Andy, thanks for your great help. I have progressed to
> the following code which runs very fast and effective:
> 
> mat <- matrix(sample(-15:50, 15 * 10, TRUE), 15, 10)
> mat[mat>45] <- NA

> mat<-NA

By this you redefine mat as 

> str(mat)
 logi NA
>

and your code then gives an error because mat no longer has two dimensions:

+  apply(mat, 1, max, na.rm=TRUE))
Error in rowSums(mat > temp, na.rm = TRUE) : 
'x' must be an array of at least two dimensions
>

If your matrix has one row consisting entirely of NA's, the code only 
issues a warning but still computes a value.

> mat[3,]<-NA
> temps <- c(35, 37, 39)
> ind <- rbind(
+  t(sapply(temps, function(temp)
+rowSums(mat > temp, na.rm=TRUE) )),
+  rowSums(!is.na(mat), na.rm=FALSE),
+  apply(mat, 1, max, na.rm=TRUE))
Warning message:
no finite arguments to max; returning -Inf 
> ind <- t(ind)
> ind
      [,1] [,2] [,3] [,4] [,5]
 [1,]    5    5    3    9   48
 [2,]    1    1    1    9   42
 [3,]    0    0    0    0 -Inf
 
> mat
> temps <- c(35, 37, 39)
> ind <- rbind(
>  t(sapply(temps, function(temp)
>rowSums(mat > temp, na.rm=TRUE) )),
>  rowSums(!is.na(mat), na.rm=FALSE),
>  apply(mat, 1, max, na.rm=TRUE))
> ind <- t(ind)
> ind
> 
> However, some weather stations have missing values for the whole year.
> Unfortunately, the code breaks down (when uncommenting mat<-NA).
> 
> I have tried 'ifelse' statements in the functions, but it becomes even
> more of a mess. I could subset the matrix before hand, but this would
> mean merging with a complete matrix afterwards to make it compatible
> with other years. That would slow things down.
> 
> How can I make the code robust for rows containing all missing values?


which(rowSums(!is.na(mat)) == 0)

This gives you the indices of the rows of your matrix that are entirely 
NA, and you can use it to fine-tune your code. What you need to do 
depends on what results you want, i.e. how the ind matrix should look 
after processing a mat with one or more rows full of NA's.
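
For example, a sketch (untested) of one way to keep the max step from 
returning -Inf for such rows; safe.rowmax is just an illustrative name:

safe.rowmax <- function(mat) {
  ## row maxima, but NA instead of -Inf for rows that are entirely NA
  ok  <- rowSums(!is.na(mat)) > 0
  out <- rep(NA, nrow(mat))
  out[ok] <- apply(mat[ok, , drop = FALSE], 1, max, na.rm = TRUE)
  out
}
ind <- rbind(
  t(sapply(temps, function(temp) rowSums(mat > temp, na.rm = TRUE))),
  rowSums(!is.na(mat)),
  safe.rowmax(mat))
ind <- t(ind)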

HTH
Petr


> 
> Thanks for your help,
> 
> Sander.
> 
> Dimitris Rizopoulos wrote:
> > for the maximum you could use something like:
> > 
> > ind[, 1] <- apply(mat, 2, max)
> > 
> > I hope it helps.
> > 
> > Best,
> > Dimitris
> > 
> > 
> > Dimitris Rizopoulos
> > Ph.D. Student
> > Biostatistical Centre
> > School of Public Health
> > Catholic University of Leuven
> > 
> > Address: Kapucijnenvoer 35, Leuven, Belgium
> > Tel: +32/16/336899
> > Fax: +32/16/337015
> > Web: http://www.med.kuleuven.ac.be/biostat/
> >  http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
> > 
> > 
> > 
> > - Original Message - 
> > From: "Sander Oom" <[EMAIL PROTECTED]>
> > To: "Dimitris Rizopoulos" <[EMAIL PROTECTED]> Cc:
> >  Sent: Friday, June 10, 2005 12:10 PM
> > Subject: Re: [R] Replacing for loop with tapply!?
> > 
> > 
> >>Thanks Dimitris,
> >>
> >>Very impressive! Much faster than before.
> >>
> >>Thanks to new found R.basic, I can simply rotate the result with
> >>rotate270{R.basic}:
> >>
> >>>mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
> >>>temps <- c(37, 39, 41)
> >>>#
> >>>#ind <- matrix(0, length(temps), ncol(mat))
> >>>ind <- matrix(0, 4, ncol(mat))
> >>>(startDate <- date())
> >>[1] "Fri Jun 10 12:08:01 2005"
> >>>for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
> >>>ind[4, ] <- colMeans(max(mat))
> >>Error in colMeans(max(mat)) : 'x' must be an array of at least two
> >>dimensions
> >>>(endDate <- date())
> >>[1] "Fri Jun 10 12:08:02 2005"
> >>>ind <- rotate270(ind)
> >>>ind[1:10,]
> >>   V4 V3 V2 V1
> >>1   0 56 75 80
> >>2   0 46 53 60
> >>3   0 50 58 67
> >>4   0 60 72 80
> >>5   0 59 68 76
> >>6   0 55 67 74
> >>7   0 62 77 93
> >>8   0 45 57 67
> >>9   0 57 68 75
> >>10  0 61 66 76
> >>
> >>However, I have not managed to get the row maximum using your 
> >>method? It
> >>should be 50 for most rows, but my first guess code gives an error!
> >>
> >>Any suggestions?
> >>
> >>Sander
> >>
> >>
> >>
> >>Dimitris Rizopoulos wrote:
> >>>maybe you are looking for something along these lines:
> >>>
> >>>mat <- matrix(sample(-15:50, 365 * 15000, TRUE), 365, 15000)
> >>>temps <- c(37, 39, 41)
> >>>#
> >>>ind <- matrix(0, length(temps), ncol(mat))
> >>>for(i in seq(along = temps)) ind[i, ] <- colSums(mat > temps[i])
> >>>ind
> >>>
> >>>
> >>>I hope it helps.
> >>>
> >>>Best,
> >>>Dimitris
> >>>
> >>>
> >>>Dimitris Rizopoulos
> >>>Ph.D. Student
> >>>Biostatistical Centre
> >>>School of Public Health
> >>>Catholic University of Leuven
> >>>
> >>>Address: Kapucijnenvoer 35, Leuven, Belgium
> >>>Tel: +32/16/336899
> >>>Fax: +32/16/337015
> >>>Web: http://www.med.kuleuven.ac.be/biostat/
> >>> http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm
> >>>
> >>>
> >>>- Original Message - 
> >>>From: "Sander Oom" <[EMAIL PROTECTED]>
> >>>To: 
> >>>Sent: Friday, June 10, 2005 10:50 AM
> >

Re: [R] us zipcode data map

2005-06-11 Thread Roger Bivand
On Fri, 10 Jun 2005, Mike R wrote:

> I've searched the email archives, searched the documentation 
> of various map packages and done an R-site search, but 
> have been unable to find direct resources for creating maps 
> of the US that are colored or annotated or ... by zipcode 
> data.  
> 
> For example, create a map of the US and color each zipcode 
> region by its population using two vectors z,p containing the 
> zipcode and population, respectively.  I'm not looking for data 
> to fill z, and p.  I'm looking for some highend functions to 
> display on a map the data that I already have.

I'm replying to this original message, because your followups have been 
dropping the original query (please don't assume that everybody uses the 
e-mail system you have chosen yourself).

With the points you have found, you can delineate surrounding polygons and 
clip to the coastline using functions in the tripack, gpclib, and maps 
packages (being very careful to find out how to match the output polygon 
IDs to your data).
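
A very rough sketch of the first step (untested; lon, lat and pop stand for 
your zipcode centroid coordinates and populations, and the clipping to the 
coastline with gpclib is omitted here):

library(maps)
library(tripack)
## lon, lat, pop: assumed vectors of zip centroid coordinates and populations
map("state")                    # state outlines as a backdrop
vm <- voronoi.mosaic(lon, lat)  # Voronoi polygons around the centroids
plot(vm, add = TRUE)
## a cruder but often more legible alternative: colour the centroids only
points(lon, lat, pch = 16, cex = 0.3, col = heat.colors(6)[cut(pop, 6)])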

But unless you are aiming for A0 postscript output, for the continental US
viewed as a whole, almost all raster image processors will merge the
interesting metropolitan zip polygons, leaving the viewer looking at a
mess with only the areas with lowest population density legible. Even zip
code "maps" of small states are very difficult to use for look-up (reading
values from the polygon fill), one of the main functions of thematic
cartography.

So like most tasks, it can be done (this _is_ R, after all), but do you 
really need to do it?

Roger

> 
> Cheers,
> Mike
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Problem with multinom ?

2005-06-11 Thread Marc Girondot
Thanks for your response.
OK, multinom() is more logical in this context.

But similar problem occurs:

Let these data be analyzed using a classical glm with binomial error:

m    f    factor   m theo       f theo       -Ln L model  -Ln L full
10   12   1.2      0.452494473  0.547505527  1.778835688  1.778648963
14   14   1.3      0.503222759  0.496777241  1.901401922  1.900820284
15   12   1.4      0.553884782  0.446115218  1.877062369  1.876909821

Fitted intercept 2.632455675, f -2.034882223
Sum -Ln L:  model 5.557299979, full 5.556379068
Deviance:   model 11.11459996, full 11.11275814
Residual deviance: 0.001841823

If I try to use the multinom() function to analyze these data, the 
fitted parameters are correct but the residual deviance is not.

>  dt<-read.table('/try.txt', header=T)
>  dt
  output factor  n
1      m    1.2 10
2      f    1.2 12
3      m    1.3 14
4      f    1.3 14
5      m    1.4 15
6      f    1.4 12
>  dt.plr <- multinom(output ~ factor, data=dt, weights=n, maxit=1000)
# weights:  3 (2 variable)
initial  value 53.372333
iter  10 value 53.115208
iter  10 value 53.115208
iter  10 value 53.115208
final  value 53.115208
converged
>  dt.plr
Call:
multinom(formula = output ~ factor, data = dt, weights = n, maxit = 1000)

Coefficients:
(Intercept)   factor
  -2.632443 2.034873

Residual Deviance: 106.2304
AIC: 110.2304

>  dt.pr1<-predict(dt.plr, type="probs")
>  dt.pr1
        1         2         3         4         5         6
0.4524948 0.4524948 0.5032229 0.5032229 0.5538846 0.5538846

Probabilities for 2, 4 and 6 are not correct, and this explains the 
incorrect residual deviance obtained in R.
Probably the problem I have is due to an incorrect data format... 
could someone help me?
Thanks

(I know there is a simple way to analyze binomial data, but ultimately 
I want to use multinom() for 5 categories of outputs.)


Thanks a lot

Marc
-- 

__
Marc Girondot, Pr
Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, ENGREF et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1 69 
15 56 96   e-mail: [EMAIL PROTECTED]
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot
Fax in US: 1-425-732-6934

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit bu ild on Solaris 8

2005-06-11 Thread Prof Brian Ripley
On Fri, 10 Jun 2005, Liaw, Andy wrote:

> I'm not familiar with Solaris, so take this with appropriate dose of NaCl...
>
> For the 64-bit build, why not have the -O2 for gcc, since you have it for
> g77 and g++?  If you just run vanilla configure for the 32-bit build, I
> believe it uses -O2 for all three compilers by default.  If that's the
> difference, perhaps it's sufficient to explain the performance difference.

CFLAGS was not specified, so should default to "-g -O2".  It is definitely 
worth checking, although almost all the time in these tests will be spent 
in Fortran code.

> The other thing is that solve(), and to some extent lm(), can benefit from
> an optimized BLAS.  You might want to check if the two builds are using
> equivalent BLAS.

He said:

>> neither build uses a BLAS.

Well, they do, the one supplied with R (which is in Fortran).

To supplement my timings earlier, with an optimized BLAS I gave

32-bit
[1] 4.99 0.03 5.02 0.00 0.00
64-bit
[1] 5.25 0.03 5.29 0.00 0.00

and for gcc 3.4.3 and the internal BLAS (which I had to build for these 
tests) I get

32-bit
[1]  9.96  0.09 10.12  0.00  0.00
64-bit
[1]  9.93  0.04 10.04  0.00  0.00

so I am not seeing anything like the same performance difference between 
32- and 64-bit builds (but it could well depend on the particular Sparc 
chip).

There are some problems in which the 64-bit builds _are_ much slower: 
complex matrix arithmetic is one (presumably because less effort has been 
spent on optimizing that).
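
For anyone who wants to check that on their own builds, one simple test is 
to time a complex matrix product of moderate size under each binary, e.g.

n <- 500
Z <- matrix(complex(real = rnorm(n^2), imaginary = rnorm(n^2)), n)
system.time(Z %*% Z)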

>
> Andy
>
>> From: Scott Gilpin
>>
>> Hi everyone -
>>
>> I'm seeing a 32-bit build perform significantly faster (up to 3x) than
>> a 64 bit build on Solaris 8.  I'm running R version 2.1.0.  Here are
>> some of my system details, and some resulting timings:
>>
>>> uname -a
>> SunOS lonetree 5.8 Generic_117350-16 sun4u sparc SUNW,Sun-Fire-V440
>>
>> lonetree /home/sgilpin >gcc -v
>> Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.8/3.4.2/specs
>> Configured with: ../configure --with-as=/usr/ccs/bin/as
>> --with-ld=/usr/ccs/bin/ld --disable-nls
>> Thread model: posix
>> gcc version 3.4.2
>>
>> I built the 32 bit version of R with no changes to config.site.  I
>> built the 64 bit version with the following in config.site:
>>
>> CC="gcc -m64"
>> FFLAGS="-m64 -g -02"
>> LDFLAGS="-L/usr/local/lib/sparcv9 -L/usr/local/lib"
>> CXXFLAGS="-m64 -g -02"
>>
>> neither build uses a BLAS.  Both builds are installed on the same
>> machine, and the same disk.  The machine has virtually no load; R is
>> one of the only processes running during these timings:

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Performance difference between 32-bit build and 64-bit build on Solaris 8

2005-06-11 Thread Prof Brian Ripley
Your tests are of problems where you really should be using an optimized 
BLAS.  But because those pointers are twice the size, the L1 cache will 
hold half as many and so I am not surprised at a factor of three on a 
naive implementation.

For linear algebra on large matrices the key to good performance is to 
keep L1 cache misses to a minimum.  Using SunPerf and a 1000x1000 problem 
I got

32-bit
[1] 4.99 0.03 5.02 0.00 0.00
64-bit
[1] 5.25 0.03 5.29 0.00 0.00

and for your regression problem
32-bit
[1] 24.97  0.96 26.15  0.00  0.00
64-bit
[1] 26.25  1.06 27.52  0.00  0.00

So the moral appears to be to take the advice in the R-admin manual and 
tune your linear algebra system.


On Fri, 10 Jun 2005, Scott Gilpin wrote:

> Hi everyone -
>
> I'm seeing a 32-bit build perform significantly faster (up to 3x) than
> a 64 bit build on Solaris 8.  I'm running R version 2.1.0.  Here are
> some of my system details, and some resulting timings:
>
>> uname -a
> SunOS lonetree 5.8 Generic_117350-16 sun4u sparc SUNW,Sun-Fire-V440
>
> lonetree /home/sgilpin >gcc -v
> Reading specs from /usr/local/lib/gcc/sparc-sun-solaris2.8/3.4.2/specs
> Configured with: ../configure --with-as=/usr/ccs/bin/as
> --with-ld=/usr/ccs/bin/ld --disable-nls
> Thread model: posix
> gcc version 3.4.2
>
> I built the 32 bit version of R with no changes to config.site.  I
> built the 64 bit version with the following in config.site:
>
> CC="gcc -m64"
> FFLAGS="-m64 -g -02"
> LDFLAGS="-L/usr/local/lib/sparcv9 -L/usr/local/lib"
> CXXFLAGS="-m64 -g -02"
>
> neither build uses a BLAS.  Both builds are installed on the same
> machine, and the same disk.  The machine has virtually no load; R is
> one of the only processes running during these timings:
>
> First comparison:  solve on a large matrix
>
>> echo 'set.seed(1);M<-matrix(rnorm(9e6),3e3);system.time(solve(M))' |
> /disk/loneres01/R-2.1.0-32bit/bin/R -q --vanilla
>> set.seed(1);M<-matrix(rnorm(9e6),3e3);system.time(solve(M))
> [1] 713.45   0.38 713.93   0.00   0.00
>>
>
>> echo 'set.seed(1);M<-matrix(rnorm(9e6),3e3);system.time(solve(M))' |
> /disk/loneres01/R-2.1.0-64bit/bin/R -q --vanilla
>> set.seed(1);M<-matrix(rnorm(9e6),3e3);system.time(solve(M))
> [1] 2277.050.31 2278.380.000.00
>>
>
> Second comparison:  linear regression
>
> lonetree /home/sgilpin/R >echo 'set.seed(1);
> y<-matrix(rnorm(1*500),500);
> x<-matrix(runif(500*100),500);
> system.time(fit<-lm(y~x))' | /disk/loneres01/R-2.1.0-32bit/bin/R -q --vanilla
>> set.seed(1);y<-matrix(rnorm(1*500),500);x<-matrix(runif(500*100),500);system.time(fit<-lm(y~x))
> [1] 23.34  0.80 24.17  0.00  0.00
>>
>
> lonetree /home/sgilpin/R >echo 'set.seed(1);
> y<-matrix(rnorm(1*500),500);
> x<-matrix(runif(500*100),500);
> system.time(fit<-lm(y~x))' | /disk/loneres01/R-2.1.0-64bit/bin/R -q --vanilla
>> set.seed(1);y<-matrix(rnorm(1*500),500);x<-matrix(runif(500*100),500);system.time(fit<-lm(y~x))
> [1] 55.34  0.70 56.21  0.00  0.00
>>
>
> Final comparison:  stats-Ex.R (from R-devel)
> lonetree /home/sgilpin/R >time /disk/loneres01/R-2.1.0-32bit/bin/R -q
> --vanilla CMD BATCH stats-Ex.R
>
> real1m4.042s
> user0m47.400s
> sys 0m10.390s
> lonetree /home/sgilpin/R >time /disk/loneres01/R-2.1.0-64bit/bin/R -q
> --vanilla CMD BATCH stats-Ex.R
>
> real1m20.017s
> user1m3.590s
> sys 0m10.130s
>
> I've seen Prof. Ripley and others comment that a 64 bit build will be
> a little slower because the pointers are larger, and gc() will take
> longer, but these timings seem well outside that range.
>
> Any thoughts?
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html